Seven teams of cybersecurity researchers have each been awarded $2 million by the U.S. Defense Department for their work creating artificial intelligence systems that can find and fix vulnerabilities, and will now compete against one another in the final round.
The two-year competition, run through the Defense Advanced Research Projects Agency (DARPA) and originally announced last year at the DEF CON hacking conference, pitted dozens of teams against each other in a contest to see who could use AI to create systems that can automatically secure the critical code that undergirds prominent systems used across the globe.
More than 90 teams were whittled down to 39 before the semifinal competition took place. Each of those 39 teams was given access to AI tools provided by Google, Microsoft, OpenAI and Anthropic. The seven winners, made up of university researchers, students and others, were announced at the end of the DEF CON conference last weekend in Las Vegas.
Teams were given multiple open source projects that contained intentionally inserted vulnerabilities and were tasked with creating systems that could identify and patch the issues. Each of the projects was based on real-world software critical to everything from healthcare to national security, including Jenkins, the Linux kernel, Nginx, SQLite3 and Apache Tika.
Andrew Carney, program manager for the AI Cyber Challenge, said the competition showed that “AI systems are capable of not only identifying but also patching vulnerabilities to safeguard the code that underpins critical infrastructure.”
“We saw vulnerability discoveries in every Challenge Project – across vulnerability classes – and successful patches in four out of the five Challenge Projects,” he said. “What the competitors achieved on a condensed timeline and amidst a multitude of complexities is nothing short of remarkable.”
The AI Cyber Challenge was also run in collaboration with the Advanced Research Projects Agency for Health (ARPA-H), which is working to find cybersecurity solutions to the pervasive cyberattacks on healthcare institutions.
In total, the systems built by the teams found 22 unique, but synthetic, vulnerabilities and were able to patch 15. Several unique patches for C-based and Java-based challenges were also identified. One team even found a real-world bug in SQLite3, which was disclosed to the project’s maintainers.
More details about the specifics of the final competition will be shared in the coming months. The most effective systems will receive cumulative awards of $29.5 million. The seven winning teams now have one year to improve their technology before the final competition is held at next year’s DEF CON.
‘Critical load-bearing technology’
Carney told Recorded Future News that the idea behind the competition was to marry the opportunities presented by generative AI and large language models with the program analysis capabilities, security techniques and ingenuity of the cybersecurity research community.
“The challenges themselves are very representative of real software, both from a scale perspective, from a feature perspective, like they look like real software in every way,” he said.
When examining source code, developers can often get overwhelmed by the amount of semantic information, Carney said, explaining that AI can be helpful in allowing cybersecurity experts to focus on program analysis. The response to the challenge has proven this theory, with researchers eagerly iterating on innovative ways to use AI to speed up code analysis.
The rapid evolution of AI necessitated a quick response, particularly as hackers increasingly deploy the technology for a variety of malicious tasks and for their own code review, Carney added.
To accept the prizes and compete in the final, all of the teams have to agree to release the AI systems they created as open-source software under a license approved by the Open Source Initiative.
Carney expressed hope that many of the projects that did not make it into the final round can be used elsewhere, and he has already thought of at least two U.S. agencies that could potentially deploy aspects of the projects created.
“The competitors have done a tremendous amount of work. They've been dealing with a very hard problem, both from a technical challenge and just a software engineering challenge. Fully autonomous development like this is not easy,” he said.
“I'm very hopeful that we will see folks, whether they're open sourcing their solutions, or they're spinning out companies, or they're incorporating them into other tool chains or workflows, that all of this work will go [somewhere]. There's a lot of value here, and I really hope folks realize and just leverage that in the ways that make sense for them.”
Omkhar Arasaratnam, general manager at the Open Source Security Foundation (OpenSSF) and a challenge advisor, told Recorded Future News that open source software underpins over 90% of commercial software and is “critical load-bearing technology.”
Both Carney and Arasaratnam said the competition showed that a future with AI-led code analysis and patching is not far off, as evidenced by the real vulnerability found in the contest.
“We hope to see fully autonomous Cyber Reasoning Systems (CRSs) that will identify and resolve security defects with minimal toil for maintainers. Next year, we anticipate even more efficient and accurate CRSs that can identify and remediate entire classes of vulnerabilities within open source software,” Arasaratnam said.
“As CRSs continue to improve, we expect to discover even more vulnerabilities and patch them autonomously to help make open source software more secure for everyone.”
Jonathan Greig is a Breaking News Reporter at Recorded Future News. Jonathan has worked across the globe as a journalist since 2014. Before moving back to New York City, he worked for news outlets in South Africa, Jordan and Cambodia. He previously covered cybersecurity at ZDNet and TechRepublic.