PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing

Title:
PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing
Journal Title:
33rd USENIX Security Symposium (USENIX Security)
DOI:
Publication Date:
14 August 2024
Citation:
Deng, G., Liu, Y., Mayoral-Vilches, V., Liu, P., Li, Y., Xu, Y., Zhang, T., Liu, Y., Pinzger, M., & Rass, S. (2024). PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing. In 33rd USENIX Security Symposium (USENIX Security 24) (pp. 847–864). Philadelphia, PA: USENIX Association.
Abstract:
Penetration testing, a crucial industrial practice for ensuring system security, has traditionally resisted automation due to the extensive expertise required by human professionals. Large Language Models (LLMs) have shown significant advancements in various domains, and their emergent abilities suggest their potential to revolutionize industries. In this work, we establish a comprehensive benchmark using real-world penetration testing targets and further use it to explore the capabilities of LLMs in this domain. Our findings reveal that while LLMs demonstrate proficiency in specific sub-tasks within the penetration testing process, such as using testing tools, interpreting outputs, and proposing subsequent actions, they also encounter difficulties maintaining a whole context of the overall testing scenario. Based on these insights, we introduce PENTESTGPT, an LLM-empowered automated penetration testing framework that leverages the abundant domain knowledge inherent in LLMs. PENTESTGPT is meticulously designed with three self-interacting modules, each addressing individual sub-tasks of penetration testing, to mitigate the challenges related to context loss. Our evaluation shows that PENTESTGPT not only outperforms LLMs with a task-completion increase of 228.6% compared to the GPT-3.5 model among the benchmark targets, but also proves effective in tackling real-world penetration testing targets and CTF challenges. Having been open-sourced on GitHub, PENTESTGPT has garnered over 6,500 stars in 12 months and fostered active community engagement, attesting to its value and impact in both the academic and industrial spheres.
License type:
Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Funding Info:
This research/project is supported by the National Research Foundation, Singapore, and the Cyber Security Agency of Singapore under the National Cybersecurity R&D Programme.
Grant Reference No.: NCRP25-P04-TAICeN
Description:
ISBN:
978-1-939133-44-1