Never Trust User Inputs -- And AI Isn't an Exception: A Security-First Approach

As AI transforms industries, security remains critical. Discover the importance of a security-first approach in AI development, the risks of open-source tools, and how Tenable's solutions can help protect your systems.

Artificial Intelligence (AI) is transforming industries and starting to be massively adopted by software developers to build core business applications. However, as organizations embrace these advancements, it remains critical to ensure that the security of their users, their data and their underlying infrastructure is not compromised. According to a recent survey conducted by BairesDev, nearly 72% of the software engineers interviewed leverage generative AI during their development work.

In the world of cybersecurity, one critical rule is "Never trust user inputs." Every developer should keep this rule in mind, and it should be extended to AI technologies. AI systems, such as chatbots, act as intermediaries: they process user inputs and generate outputs from them. Those outputs should themselves be treated as a new form of input, subject to the same level of scrutiny and security measures.

This blog post delves into key security concerns, emphasizing the need for a security-first approach.

Lack of security by design in AI tools

AI tools are, most of the time, open-source, ready-to-use software designed to run locally on the developer's machine. Many of these tools do not adhere to robust security practices by default, making them susceptible to exploitation. While analyzing some of the most common projects available on GitHub, we discovered, for example, that most of them offer no authentication by default, leaving their embedded dashboards and APIs open to any user who can reach them over the network. The combination of a web interface, an API and a CLI further increases their attack surface.
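
To make this concrete, here is a minimal sketch of how a defender might check whether a locally deployed tool answers on its dashboard or API without credentials. The endpoint paths and port below are hypothetical placeholders, not taken from any specific project; adapt them to the tool you are assessing.

```python
# Probe a locally deployed AI tool for unauthenticated endpoints (sketch).
import requests

CANDIDATE_PATHS = ["/", "/docs", "/api/health", "/api/v1/models"]  # hypothetical

def check_unauthenticated(base_url: str) -> None:
    for path in CANDIDATE_PATHS:
        try:
            resp = requests.get(base_url + path, timeout=5)
        except requests.RequestException as exc:
            print(f"{path}: unreachable ({exc})")
            continue
        # A 200 response without any credentials suggests the endpoint is open.
        flag = "OPEN" if resp.status_code == 200 else str(resp.status_code)
        print(f"{path}: {flag}")

if __name__ == "__main__":
    check_unauthenticated("http://127.0.0.1:8000")  # assumed host and port
```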

The exponential market interest in AI-related tools and applications has probably had a negative influence on their development, favoring the emergence of proof-of-concept (POC) software that becomes very popular before it is battle-tested.

In this era of cloud infrastructure, where new services can be spun up quickly from pre-existing Docker images and exposed on the internet, leaving this door open can be highly risky for an organization. In such situations, deploying, for example, an internal AI model on a tool lacking proper authentication could have dramatic outcomes. A recent example is the Ollama tool, which allowed remote code execution (RCE) without any specific configuration other than having its API exposed.
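
As an illustration, a quick reachability check for this particular case might look like the sketch below. It relies on Ollama's documented default port (11434) and its /api/tags model-listing endpoint, which requires no authentication on default installs; verify both against the version you run.

```python
# Check whether an Ollama API is reachable without credentials (sketch).
import requests

def ollama_exposed(host: str, port: int = 11434) -> bool:
    url = f"http://{host}:{port}/api/tags"  # lists locally installed models
    try:
        resp = requests.get(url, timeout=5)
    except requests.RequestException:
        return False  # not reachable, or filtered
    if resp.status_code == 200:
        models = resp.json().get("models", [])
        print(f"{host}: exposed, {len(models)} model(s) visible")
        return True
    return False

if __name__ == "__main__":
    ollama_exposed("127.0.0.1")
```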

During our research, we discovered several zero-day vulnerabilities in projects that are very popular on GitHub, as measured by their stars and forks counters. However, despite many coordinated disclosure attempts, the projects' maintainers did not respond in a reasonable amount of time (and sometimes not at all). We see this as evidence of the lack of security maturity in this ecosystem, which seems to favor speed of delivery to the detriment of security concerns.

While conducting our research, we also found that patches for previously disclosed vulnerabilities could be bypassed, as with this NextChat Server-Side Request Forgery (SSRF) vulnerability. Our analysis of well-known software named Langflow likewise highlighted a vulnerability in its permission model implementation, allowing a low-privileged user to gain super admin privileges without any interaction.
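
Without reproducing the NextChat specifics, the sketch below illustrates the general pattern behind such bypasses: a substring-based blocklist (a typical flawed patch) misses alternate spellings of the loopback address, whereas resolving the hostname and checking the resulting IP is considerably more robust. This is illustrative only and is not the NextChat patch.

```python
# Why naive SSRF filters get bypassed, and a stricter alternative (sketch).
import ipaddress
import socket
from urllib.parse import urlparse

def naive_filter(url: str) -> bool:
    # Typical flawed patch: reject obvious loopback spellings by substring.
    return not any(bad in url for bad in ("localhost", "127.0.0.1"))

def stricter_filter(url: str) -> bool:
    # Resolve the hostname and reject loopback/private/link-local targets.
    # Still not bulletproof: DNS rebinding can change the answer between
    # this check and the actual request (a time-of-check/time-of-use gap).
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        ip = ipaddress.ip_address(socket.gethostbyname(host))
    except (socket.gaierror, ValueError):
        return False
    return not (ip.is_loopback or ip.is_private or ip.is_link_local)

# On most platforms "127.1" resolves to 127.0.0.1 and slips straight past
# the substring check, while the resolution-based check still rejects it.
for candidate in ("http://127.0.0.1/", "http://127.1/"):
    print(candidate, naive_filter(candidate), stricter_filter(candidate))
```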

The risks of relying on third-party LLMs

Large language models (LLMs) require substantial compute and storage resources, making it challenging for many organizations to deploy and maintain them on-premises. Consequently, it is often easier to rely on third-party providers to run these resource-intensive models, avoiding the hassle of managing the underlying infrastructure and allowing teams to focus on the business side. However, relying on such services means trusting these providers with potentially critical business data.

The critical risks related to such usage are real and should be handled at different levels:

  • Data breach on the provider side: As with any other service, all processed data could be compromised if the provider suffers a data breach. It is crucial to vet third-party providers and ensure they adhere to privacy and data protection policies.
  • Credential leakage: Accessing third-party services requires handling credentials and authentication data. Like any secret, these credentials can be inadvertently leaked in various places, such as public Source Code Management (SCM) repositories or web application front ends; a minimal handling sketch follows this list.
  • Model trustworthiness: Third-party services can provide numerous models to their customers, and it is critical to assess their reliability, safety, and adherence to ethical guidelines, as there is no actual guarantee that they are safe to use.
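
On the credential-leakage point, here is a minimal sketch of keeping provider credentials out of source code by reading them from the environment (or a secrets manager) at runtime. The environment variable name is a hypothetical placeholder.

```python
# Keep third-party LLM credentials out of source code (sketch).
import os

def get_llm_api_key() -> str:
    key = os.environ.get("LLM_PROVIDER_API_KEY")  # hypothetical variable name
    if not key:
        raise RuntimeError(
            "LLM_PROVIDER_API_KEY is not set; never hardcode API keys in code "
            "or commit them to source control."
        )
    return key
```

Pairing this with automated secret scanning on the SCM side helps catch the keys that slip through anyway.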

As organizations embrace these new technologies to enhance their business, they should ensure that their AI governance rules cover these risks.

The perils of inadequate datasets

More than any other technology, AI is built to fully leverage the data it consumes. One of its goals is to ensure that organizations take full advantage of the data and knowledge gained over the years, helping them move quickly in their operating field, take appropriate actions and make decisions in a shorter period of time, with a high level of confidence and accuracy.

The dataset used to train the model should be seen as an input and should be carefully analyzed. Using confidential business information might inadvertently lead to a leak through model outputs, which can cause a significant security breach. Biased data can also result in AI software making unfair or harmful decisions.

A good approach to handle model security is to focus on security considerations based on confidentiality, integrity and availability. Some examples include:

  • Datasets should only include data that is safe to expose to intended users. When possible, data anonymization techniques can help safeguard sensitive information such as Personally Identifiable Information (PII) and decrease the risk of failing to comply with laws and regulations (see the sketch after this list).
  • Data collection processes should be properly implemented and monitored to ensure that data comes only from trusted sources, is accessible only to authorized users, and that the model keeps using the data and operating as expected over time.
  • Data availability is crucial for the model to be trained on a complete dataset that matches business requirements. Model availability is also a concern for applications that consume the model synchronously. The application's fallback behavior should be carefully reviewed and tested, as with any other failure mode in traditional development.
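
As referenced in the first bullet, here is a deliberately simple anonymization sketch that redacts e-mail addresses and phone-number-like strings from records before ingestion. The regexes are illustrative assumptions, not a complete PII taxonomy; production pipelines should rely on dedicated PII-detection tooling.

```python
# Redact basic PII from training records before ingestion (sketch).
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def scrub(record: str) -> str:
    record = EMAIL_RE.sub("[EMAIL]", record)
    record = PHONE_RE.sub("[PHONE]", record)
    return record

print(scrub("Contact Jane at jane.doe@example.com or +1 555 123 4567."))
# -> "Contact Jane at [EMAIL] or [PHONE]."
```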

Emerging AI vulnerabilities

LLMs introduce new classes of vulnerabilities that traditional security measures may not address properly. The most prevalent AI-related vulnerabilities are prompt injection attacks, model theft and training data poisoning.

Prompt injection attacks involve malicious users crafting inputs to manipulate LLMs into generating harmful or unauthorized outputs. Remember the "Never trust user inputs" cardinal rule? In this case, the LLM acts as a kind of intermediary between the user inputs and the system. This could result in the system disclosing sensitive information, executing malicious commands, or serving as an attack vector for other common vulnerabilities like stored Cross-Site Scripting. As an example, Vanna.AI, a Python-based library designed to generate SQL queries from natural language inputs, was recently identified as vulnerable to prompt injection attacks that can lead to remote code execution on vulnerable systems.
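
One way to apply the cardinal rule here is to treat the generated SQL itself as untrusted input. The sketch below, which is illustrative and not Vanna.AI's actual mitigation, accepts only a single read-only SELECT statement and opens the database read-only as a second line of defense.

```python
# Guard LLM-generated SQL before executing it (sketch).
import sqlite3

# Coarse denylist; substring checks can false-positive on legitimate string
# literals, which is acceptable for a fail-closed guard.
FORBIDDEN = ("insert", "update", "delete", "drop", "alter", "attach", "pragma", ";")

def run_readonly(db_path: str, generated_sql: str):
    sql = generated_sql.strip().rstrip(";")
    lowered = sql.lower()
    if not lowered.startswith("select") or any(tok in lowered for tok in FORBIDDEN):
        raise ValueError("refusing to run non-SELECT or multi-statement SQL")
    # Open the database read-only so even a missed case cannot write.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()
```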

Models should be protected in the same way we protect confidential and business-critical data. The first part of this blog post described how easily some AI tools can expose data to unauthorized actors. Applying defense-in-depth principles will help minimize intellectual property leakage if model theft occurs. Hardening model security with techniques such as encryption and obfuscation, and having proper monitoring in place, is crucial.
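
As a small example of the encryption technique mentioned above, the following sketch encrypts model weights at rest with a symmetric key using the `cryptography` package. Key management (secrets manager, rotation) is assumed to be handled elsewhere, and the file names are placeholders.

```python
# Encrypt model weights at rest with a symmetric key (sketch).
from cryptography.fernet import Fernet

def encrypt_model(path_in: str, path_out: str, key: bytes) -> None:
    # Fernet encrypts the whole payload in memory; fine for small artifacts,
    # but stream-encrypt multi-gigabyte weights instead.
    fernet = Fernet(key)
    with open(path_in, "rb") as f:
        ciphertext = fernet.encrypt(f.read())
    with open(path_out, "wb") as f:
        f.write(ciphertext)

if __name__ == "__main__":
    key = Fernet.generate_key()  # in practice, fetch from a secrets manager
    encrypt_model("model.bin", "model.bin.enc", key)  # hypothetical file names
```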

Finally, AI training data poisoning is a modern form of supply-chain attack. By altering the data used to train the model, attackers can corrupt its behavior and elicit biased or harmful output, directly impacting the applications that rely on it to achieve business goals.
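
A basic supply-chain safeguard against this is to pin training data to known-good digests. The sketch below verifies dataset files against a SHA-256 manifest before each training run; the manifest format is an assumption for illustration.

```python
# Verify dataset integrity against a manifest of SHA-256 digests (sketch).
import hashlib
import json

def sha256_of(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_dataset(manifest_path: str) -> bool:
    with open(manifest_path) as f:
        manifest = json.load(f)  # e.g. {"train.csv": "<sha256>", ...}
    ok = True
    for filename, expected in manifest.items():
        if sha256_of(filename) != expected:
            print(f"MISMATCH: {filename} may have been tampered with")
            ok = False
    return ok
```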

As in more traditional fields, developers should always stay up to date with the latest security guidelines and incorporate strategies from the OWASP Top 10 for LLMs. Techniques such as input validation, anomaly detection and robust monitoring of the AI ecosystem's behavior can help detect and mitigate potential threats.
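
For instance, a first line of input validation in front of an LLM might enforce length limits, strip control characters and flag phrases commonly seen in injection attempts. The limit and patterns below are assumptions for illustration; heuristics like these reduce noise but are by no means a complete defense.

```python
# Minimal pre-LLM input validation heuristics (sketch).
import re
import unicodedata

MAX_LEN = 4000  # assumed limit for this example
SUSPICIOUS = re.compile(
    r"ignore (all|previous|prior) instructions|system prompt", re.IGNORECASE
)

def validate_user_input(text: str) -> str:
    if len(text) > MAX_LEN:
        raise ValueError("input exceeds maximum length")
    # Drop non-printable control characters that may smuggle hidden directives.
    text = "".join(
        c for c in text if unicodedata.category(c)[0] != "C" or c in "\n\t"
    )
    if SUSPICIOUS.search(text):
        raise ValueError("input flagged for manual review")
    return text
```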

Balancing innovation and risk

AI technologies are promising and can transform many industries and businesses, offering opportunities for innovation and efficiency. However, they also represent a huge security challenge at many levels within organizations, and this should not be overlooked.

By adopting a security-first approach, following best practices and having robust governance, organizations can harness the power of AI and mitigate the emerging threats related to its adoption.

How Tenable can help

Read more about how we help secure these tools:

  • Tenable Web App Scanning provides plugins to detect popular AI and LLM tools' web interfaces and vulnerabilities.
  • Tenable Vulnerability Management, Tenable Security Center and Tenable Nessus plugins detect popular AI and LLM tools.
  • Tenable Nessus Network Monitor plugins detect popular AI and LLM tools on the network.
  • Tenable's researchers help elevate the ecosystem by identifying exposures in third-party AI software and disclosing them responsibly to the vendors. Among those already published are a NextChat Server-Side Request Forgery / Cross-Site Scripting vulnerability and an SSRF security feature bypass in Azure AI and ML Studios.

Rémy Marot

Rémy joined Tenable in 2020 as a Senior Research Engineer on the Web Application Scanning Content team. Over the previous decade, he led the IT managed services team of a web hosting provider and was responsible for designing and building innovative security services on a Research & Development team. He has also contributed to open-source security software, helping organizations improve their security posture.

Interests outside of work: Rémy enjoys spending time with his family, cooking and traveling the world. Being passionate about offensive security, he enjoys doing ethical hacking in his spare time.
