Protect AI this week added three open source tools for detecting threats to artificial intelligence (AI) models, aimed respectively at Jupyter Notebooks, large language models (LLMs) and the way those models are most commonly shared.
NB Defense, a tool for Jupyter Notebooks, scans individual notebooks or entire projects for leaked credentials, personally identifiable information (PII) disclosure, licensing issues and security vulnerabilities.
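To give a sense of what that kind of scan involves, the hypothetical Python sketch below checks a notebook's JSON for a couple of common credential patterns. The patterns, function names and rule set are illustrative assumptions, not NB Defense's actual implementation, which covers far more rule types.

import json
import re
import sys

# Hypothetical patterns for common leaked credentials; a real scanner such as
# NB Defense ships a much broader and more carefully tuned rule set.
SECRET_PATTERNS = {
    "AWS access key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "Generic API key": re.compile(r"(?i)api[_-]?key\s*[=:]\s*['\"][A-Za-z0-9_\-]{16,}['\"]"),
}

def scan_notebook(path: str) -> list[tuple[int, str]]:
    """Return (cell_index, finding) pairs for suspicious strings in a .ipynb file."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    findings = []
    for idx, cell in enumerate(notebook.get("cells", [])):
        source = "".join(cell.get("source", []))
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(source):
                findings.append((idx, label))
    return findings

if __name__ == "__main__":
    for cell_idx, label in scan_notebook(sys.argv[1]):
        print(f"cell {cell_idx}: possible {label}")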
Rebuff is a self-hardening framework for detecting prompt injection attacks, which seek to compromise the integrity of an LLM in a way that could cause it to deliberately generate errors known as hallucinations. Rebuff employs heuristics to filter out potentially malicious input before it reaches the model, and an LLM is also used to analyze incoming prompts and identify potential attacks. A database keeps track of known attacks, while canary tokens embedded in prompts detect any leakage so that identified malicious prompts can be added to the vector database and recognized in the future.
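The following is not Rebuff's actual API; it is a minimal Python sketch of two of the ideas described above, cheap heuristic filtering and canary-token leak detection, with all names and patterns assumed for illustration. The LLM-based check and the vector database lookup are deliberately left out.

import re
import secrets

# Hypothetical heuristic patterns; a real detector maintains a far larger list.
INJECTION_HEURISTICS = [
    re.compile(r"(?i)ignore (all|any|previous) instructions"),
    re.compile(r"(?i)reveal (the|your) system prompt"),
]

def looks_like_injection(user_input: str) -> bool:
    """First line of defense: pattern matching before the model is ever called."""
    return any(p.search(user_input) for p in INJECTION_HEURISTICS)

def add_canary(system_prompt: str) -> tuple[str, str]:
    """Embed a random canary token in the prompt so leakage can be detected."""
    canary = secrets.token_hex(8)
    return f"{system_prompt}\n(Internal marker: {canary})", canary

def canary_leaked(model_output: str, canary: str) -> bool:
    """If the canary appears in the response, the prompt was leaked and the
    offending input can be recorded as a known attack."""
    return canary in model_output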
ModelScan is a tool that detects attempts to embed malicious code in a model as it is being saved and shared, a process known as serialization. This type of attack is similar to a Trojan horse exploit in that it creates vulnerabilities that can later be used to steal credentials, exfiltrate data or poison the AI model.
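To make the underlying risk concrete, the short Python example below (not part of ModelScan) shows how a pickled "model" file can execute arbitrary code the moment it is loaded. Scanners of this kind look for such payloads in serialized files without ever loading them.

import pickle

class MaliciousPayload:
    """A pickled object can smuggle arbitrary code: __reduce__ tells pickle to
    call any function on load; here a harmless print stands in for real malware."""
    def __reduce__(self):
        return (print, ("code executed during model deserialization",))

# An attacker ships this blob as a "model" file; merely loading it runs the payload.
tainted_model = pickle.dumps(MaliciousPayload())
pickle.loads(tainted_model)  # prints the message: arbitrary code ran on load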
Protect AI CEO Ian Swanson said that as the usage of AI models continues to expand, cybersecurity teams are being called on to secure them. The data scientists who typically create these models have little to no cybersecurity expertise, so it's not uncommon for vulnerabilities to be found either in the AI model itself or in any of the underlying components used to create it.
In fact, Protect AI has launched Huntr, a bug bounty platform focused on identifying and fixing vulnerabilities in open source AI models and the machine learning tools used to create them.
The biggest issue organizations face when it comes to securing AI models is a general lack of visibility into them, said Swanson. The immediate priority should be to make it simpler to create a bill of materials for AI models so that organizations can keep track of vulnerabilities in any of the code used to construct them, he added.
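As a rough illustration of what such a bill of materials might capture, the hypothetical sketch below records a model file's hash alongside the exact library versions used to build it; the field names and structure are assumptions for illustration, not an established AI BOM standard or Protect AI's format.

import hashlib
import json

def model_bom(model_path: str, dependencies: dict[str, str]) -> str:
    """Hypothetical minimal bill of materials for a trained model: a content
    hash plus the library versions used to build it, so later vulnerability
    reports can be matched against deployed models."""
    with open(model_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return json.dumps({
        "artifact": model_path,
        "sha256": digest,
        "dependencies": dependencies,  # e.g. {"torch": "2.1.0", "transformers": "4.35.2"}
    }, indent=2)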
It’s not clear what level of threat activity is specifically aimed at AI models, but the rate at which they are being built and deployed already exceeds the available cybersecurity expertise. In one sense, AI models are just another software artifact that needs to be secured. However, far too many organizations are already struggling to lock down their software supply chains, and AI models, built with entirely different toolsets, are now yet another element of those supply chains.
It’s more a matter of when rather than if AI models running in production environments are found to have been compromised. The challenge will be limiting the scope of such a breach when the compromised model may be operating at a scale and speed that make responding to it far more difficult.