Hack Smarter: Automate Security Testing with LLMs and the CAI Framework

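This article presents a method for automated penetration testing that combines the CAI framework, PortSwigger labs, and large language models (LLMs). By setting environment variables and configuring CAI and the PortSwigger bot, users can have the Red Team Agent automatically identify and exploit vulnerabilities and generate a detailed report. The approach illustrates the potential of AI in cybersecurity. Published: 2025-8-4 04:47:23. Author: infosecwriteups.com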

Have you ever wondered how hackers automate their security assessments, or how artificial intelligence can enhance penetration testing? In this tutorial, we’ll use the CAI (Cybersecurity AI) Python API to tackle the PortSwigger Web Security Labs.

We’ll combine:

CAI’s intelligent red-team agents
PortSwigger’s robust vulnerability labs
Large Language Models (LLMs) for autonomous reasoning and payload generation

Photo by Getty Images on Unsplash

By the end of this guide, you will:

Understand how LLMs aid in crafting effective attacks against web applications.
Test and validate vulnerabilities using the PortSwigger Web Security Academy as an environment.
Automate vulnerability identification and remediation with the CAI framework.
Analyse security logs and defend against vulnerabilities using LLM interpretations.
Become proficient with the CAI Python API.

CAI Framework

Cybersecurity AI (CAI) is an open-source, lightweight framework developed by Alias Robotics that enables the creation of specialised AI agents to assist in various cybersecurity tasks, primarily focused on bug bounty hunting, vulnerability validation, and reporting. Its core purpose is to democratize access to AI-powered security tools and augment the capabilities of human security researchers, thereby improving the efficiency and effectiveness of both offensive and defensive cybersecurity operations.

You can find more information about CAI on the GitHub repository and their news hub.

PortSwigger Web Security Academy

PortSwigger is a leading cybersecurity company primarily known for its web application security products and training. PortSwigger provides the Web Security Academy, a free online platform that offers extensive training and interactive labs for individuals to learn and practice web security skills, ranging from fundamental concepts to advanced exploitation techniques.

To learn more about PortSwigger, you can visit their official website.

Install Python Dependencies

To ensure everything runs smoothly, first install the necessary packages in your local environment:

pip install cai-framework pandas selenium python-dotenv

Setting Up the CAI .env File

CAI’s documentation says to place an .env file in the same folder as the main script. For the PortSwigger setup, we add two variables (PORTSWIGGER_USERNAME and PORTSWIGGER_PASSWORD) to the template in CAI’s .env.example. Create an .env file with the following variables:

PORTSWIGGER_USERNAME='your-portswigger-email'
PORTSWIGGER_PASSWORD='your-portswigger-password'
OPENAI_API_KEY='sk-123'
ANTHROPIC_API_KEY=''
OLLAMA=''
PROMPT_TOOLKIT_NO_CPR=1
CAI_STREAM=false

✅ Note on PortSwigger: If you don’t have a PortSwigger Web Academy account, you can create one here.

✅ Note on CAI: If you need more information on setting up the .env file, check out the CAI documentation.
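
Before launching the bot, it can help to confirm that the settings above were actually loaded. The following is a minimal sketch, not part of CAI: the helper name `missing_env_vars` and the choice of which variables count as required are assumptions made for illustration.

```python
import os

# Variable names taken from the .env file above; adjust to the providers you use.
REQUIRED_VARS = ("PORTSWIGGER_USERNAME", "PORTSWIGGER_PASSWORD", "OPENAI_API_KEY")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


# Example with an explicit dict, so the result does not depend on your shell.
example_env = {"PORTSWIGGER_USERNAME": "me@example.com", "OPENAI_API_KEY": " "}
print(missing_env_vars(env=example_env))
# ['PORTSWIGGER_PASSWORD', 'OPENAI_API_KEY']
```

In the main script, you would call `missing_env_vars()` with no arguments after `load_dotenv(override=True)` and stop early if the list is not empty.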

Setting Up the PortSwigger Bot

To extract the labs, we start by initialising the PortSwigger bot and loading the username and password from the .env file. Next, we specify the number of labs to retrieve and the type of vulnerability we want to target. Some of the supported vulnerability sections include:

sql-injection
cross-site-scripting
cross-site-request-forgery-csrf
clickjacking
dom-based-vulnerabilities
cross-origin-resource-sharing-cors
xml-external-entity-xxe-injection

✅ Note on Bot: Want to see all the vulnerability sections in action? Explore the full tutorial in the following Jupyter Notebook.

Download the utils folder from this link; it contains the PortSwigger bot. Then add the following parameters to the main Python script.

import os
from dotenv import load_dotenv

# Load variables from .env, overriding any already set in the shell
load_dotenv(override=True)

SECTION = "cross-site-scripting"
N_LABS = 3  # If you want to test all the labs in the section, change this to -1
USERNAME = os.getenv("PORTSWIGGER_USERNAME")
PASSWORD = os.getenv("PORTSWIGGER_PASSWORD")

We can now extract the lab information by running the following code.

# PortSwigger web scraper bot
import utils.portswiggerbot as pb

# Initialize the bot
bot = pb.Bot()
# Log in to PortSwigger Academy
bot.login(USERNAME, PASSWORD)
# Get lab URLs by section or type of lab
topics = bot.choose_topic(SECTION)
# N_LABS = -1 means "all labs"; a plain topics[0:-1] slice would drop the last one
selected = topics if N_LABS < 0 else topics[:N_LABS]
# Get lab metadata (returns a list of dictionaries)
labs = [bot.obtain_lab_information(link) for link in selected]
# Get the session cookies from the Selenium driver
cookies = bot.driver.get_cookies()
# Names of the essential cookies
essential_cookie_names = {'SessionId', 'Authenticated_UserVerificationId', 't'}
# Keep only the essential cookies
essential_cookies = [cookie for cookie in cookies if cookie['name'] in essential_cookie_names]
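
Selenium returns each cookie as a dictionary with keys such as `name`, `value`, and `domain`. If you prefer to hand the agent a compact `Cookie` header instead of the full list of dictionaries, a small helper like the one below could do the conversion. This is a sketch under assumptions: `cookie_header` is a hypothetical helper, and the cookie values are made up.

```python
def cookie_header(cookies):
    """Join Selenium-style cookie dicts into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


# Hypothetical values, shaped like the dicts returned by driver.get_cookies().
sample_cookies = [
    {"name": "SessionId", "value": "abc123", "domain": "portswigger.net"},
    {"name": "t", "value": "xyz", "domain": "portswigger.net"},
]
header = cookie_header(sample_cookies)
print(header)  # SessionId=abc123; t=xyz
```

The resulting string can be passed as-is to tools that accept a raw `Cookie` header, which may be easier for the agent to reuse in shell commands.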

Setting Up the CAI Red Team Agent

An Agent is an intelligent component that interacts with an environment. In cybersecurity, an Agent interacts with systems and networks, using peripherals and network interfaces as sensors, reasoning about the gathered information, and then executing network actions as if they were actuators. In CAI, there are different types of predefined agents. You can take a look here.

For this project, we will use the Red Team Base Agent. First, we need to specify the LLM we want to use with it. This tutorial uses GPT-4o.

# Use GPT-4o. Set this before importing the agent, in case the model name is read at import time.
os.environ["CAI_MODEL"] = "openai/gpt-4o"

# Import the runner and the red team agent
from cai.sdk.agents import Runner
from cai.agents.red_teamer import redteam_agent

# Function to run the agent asynchronously
async def run_agent(message):
    response = await Runner.run(redteam_agent, message)
    return response

CAI Agents work with two types of prompts:

System Prompt

This prompt defines the overall context and instructs the LLMs to assume a specific role. In this case, the model’s goal is to identify and exploit vulnerabilities.

✅ Note: When using the built-in Red Team Agent from CAI, the system prompt is already configured for you, so there is no need to set it manually. You can find the default template for the agent here.

User Prompt

This prompt defines the specific task the agent should perform. For PortSwigger Labs, it should include:

The lab title (placeholder: LAB_TITLE)
The lab description (placeholder: LAB_DESCRIPTION)
The lab environment URL (placeholder: LAB_ENVIRONMENT_URL)
Session cookies to authenticate in the lab (placeholder: COOKIES)

✅ Note: We will use the following user prompt. You can modify it, or write other prompts, to change how the agent behaves.

prompt_example = """
attack **LAB_ENVIRONMENT_URL**
The title is **LAB_TITLE**
The description is **LAB_DESCRIPTION**
Please consider the following mandatory rules:
1. You have to connect to the session with the cookies **COOKIES**
2. Do not stop until you try all the commands or exploits you can run and create
3. Write the explanation of what you did in an HTML format easy to understand
4. Write every payload command and the exploits you create in the format <payloadgen>HERE</payloadgen>
5. Do not create files with the payload that you generate because you have to solve the lab by yourself
"""

We can format the user prompt with the lab metadata by running the following code.

# Pick one lab from the list returned by the bot (here, the first one)
lab = labs[0]
user_prompt = (
    prompt_example.replace("**LAB_ENVIRONMENT_URL**", lab['environment_url'])
    .replace("**LAB_TITLE**", lab['title'])
    .replace("**LAB_DESCRIPTION**", lab['description'])
    .replace("**COOKIES**", str(essential_cookies))
)
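
To test several labs in one run, the same substitution can be wrapped in a function and applied to every entry in the labs list. The sketch below assumes each lab dictionary has the `environment_url`, `title`, and `description` keys used above; `build_prompt`, the shortened template, and the sample lab are hypothetical and only there to make the example runnable on its own.

```python
def build_prompt(template, lab, cookies):
    """Fill the **PLACEHOLDER** markers in `template` with one lab's metadata."""
    values = {
        "**LAB_ENVIRONMENT_URL**": lab["environment_url"],
        "**LAB_TITLE**": lab["title"],
        "**LAB_DESCRIPTION**": lab["description"],
        "**COOKIES**": str(cookies),
    }
    for marker, value in values.items():
        template = template.replace(marker, value)
    return template


# Shortened template and made-up lab metadata, for illustration only.
sample_template = (
    "attack **LAB_ENVIRONMENT_URL**\n"
    "The title is **LAB_TITLE**\n"
    "The description is **LAB_DESCRIPTION**\n"
    "Cookies: **COOKIES**"
)
sample_labs = [
    {
        "environment_url": "https://lab.example/",
        "title": "Reflected XSS",
        "description": "The search box is vulnerable.",
    }
]
prompts = [build_prompt(sample_template, lab, []) for lab in sample_labs]
print(prompts[0])
```

With the real data, the list comprehension would iterate over the labs returned by the bot and pass the essential cookies instead of an empty list.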

It’s time to have some fun hacking PortSwigger Labs! Just call the function, and your agent will automatically interact with the target lab, searching for vulnerabilities and trying to exploit them.

response = await run_agent(message=user_prompt)
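
A bare `await` at the top level works in a Jupyter notebook, but in a plain Python script it raises a SyntaxError. There, the coroutine can be driven with `asyncio.run` from the standard library. The sketch below uses a stand-in `run_agent` so that it runs without CAI installed; in the real script you would keep the `run_agent` defined earlier and pass it the user prompt.

```python
import asyncio


async def run_agent(message):
    """Stand-in for the tutorial's run_agent so this sketch runs without CAI."""
    await asyncio.sleep(0)  # placeholder for the real agent call
    return f"report for: {message}"


def main():
    # asyncio.run starts an event loop, runs the coroutine, then closes the loop.
    return asyncio.run(run_agent("example prompt"))


result = main()
print(result)  # report for: example prompt
```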

We will obtain a response like the following:

Output of CAI (Image by Author).

Now that our CAI agent has done its work, let’s break down the results to understand what it found, how it exploited the target, and what the structure of the final response is.

Understanding the Results

When we run the CAI Red Team Agent, we get back a RunResult object. Think of it as a detailed report of everything the agent did during its hacking session.

✅ Note on the results: If you want to see the complete structure of the results, explore the full tutorial in the following Jupyter Notebook.

Here’s how to read it:

input

This is the user prompt we gave to the agent.

new_items

This shows what the agent produced during the run.

MessageOutputItem

Contains the following information:

The agent info (Red Team Agent)
The tools it used (like generic_linux_command to run shell commands, or execute_code to write and run exploits)
The output, which is a clear HTML report explaining the methodology used to solve the lab.

raw_responses

This is the raw output from the agent, showing exactly what text the LLM produced.

final_output

This is the final, cleaned-up version of the report. In our example, it’s a complete HTML file that explains:

What the attack did
The payload used
How to inject it into the lab’s search parameter
The result and security impact
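
Because the user prompt asks the agent to wrap every payload in `<payloadgen>` tags, the payloads can be pulled out of the final report with a regular expression. The following is a minimal sketch: `extract_payloads` is a hypothetical helper and the sample report is made up. It also assumes the model followed the tag format exactly and did not HTML-escape the tags.

```python
import re

# Non-greedy match so that each <payloadgen> pair is captured separately.
PAYLOAD_RE = re.compile(r"<payloadgen>(.*?)</payloadgen>", re.DOTALL)


def extract_payloads(report):
    """Return the text inside every <payloadgen>...</payloadgen> pair, in order."""
    return [match.strip() for match in PAYLOAD_RE.findall(report)]


# Made-up report fragment, for illustration only.
sample_report = (
    "<h1>Reflected XSS</h1>"
    "<p>Payload: <payloadgen><script>alert(1)</script></payloadgen></p>"
)
payloads = extract_payloads(sample_report)
print(payloads)  # ['<script>alert(1)</script>']
```

With a real run, you would pass the text of `final_output` to the helper instead of the sample string.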

In this article, we learned how to run a complete attack against a vulnerable PortSwigger lab using the CAI Python API.

Now you can:

Launch a CAI Red Team Agent
Provide clear instructions, cookies, and payload rules
Let the agent autonomously craft and test exploits
Review the results step by step in a structured `RunResult`

This shows the power of combining LLM-driven reasoning with real hacking tools. You can now adapt this approach to test other labs, real applications, or integrate it into your red teaming workflows.

Next, try:

Experimenting with different lab challenges on PortSwigger Web Security Academy
Tuning your agent prompts for more advanced payloads
Analysing the results to write better detection and defence rules

Source: https://infosecwriteups.com/hack-smarter-automate-security-testing-with-llms-and-the-cai-framework-fea6e61c1400?source=rss----7b722bfd1b8d---4