Introducing ByteBanter, an LLM based BurpSuite Intruder Payload Generator

Introducing ByteBanter, an LLM based BurpSuite Intruder Payload Generator

By Andrea Braschi

TL; DR

The problem:
Testing LLM security could be difficult, time-consuming, and sometimes could lead to non-deterministic findings.

The solution:
One possible solution could be to use LLM to automate interaction with the testing environment by trying to bypass guardrails or validation logic, finding possible flaws in systems.

ByteBanter

Anvil Secure is releasing ByteBanter, an open-source BurpSuite Pro plugin designed to harness large language models (LLMs) for automated payload generation for both web applications and LLM-specific penetration testing. The plugin leverages large language models (LLMs) to craft dynamic, context-aware payloads and supports multiple LLM backends: OpenAI, Oobabooga, and PortSwigger's BurpAI. The plugin gives users the flexibility to generate payloads via custom prompts and tailored configurations.

A use case example

The easiest way to understand the potential of ByteBanter is to see it in action through an example test.

Gandalf AI by Lakera is an online lab to practice LLMs hacking abilities. There are several challenges with increasing levels of difficulties where the tester is asked to chat with an LLM instructed not to reveal the password. The objective is to bypass the prompt logic and obtain the password from the chatbot.

For this task, a tester can easily configure ByteBanter by instructing it on the main objective through a prompt to capture an example of request and response, obtained by manually interacting with the web application and sending it to the Intruder. Then, using BurpSuite Intruder and configuring it to have ByteBanter as payload generator, the tester can instruct BurpSuite on where to put ByteBanter’s payload inside the request and instruct ByteBanter on how to retrieve the bot responses to keep track of the conversation.

The following screenshot shows the configuration pane you will obtain once you install ByteBanter. The panel contains different areas where it is possible to configure the payload generator.

As shown in the detail picture below, the top-right dropdown menu offers to user the possibility to choose their preferred engine. Currently ByteBanter supports the following API standards:

  • BurpAI
  • OpenAI
  • Oobabooga

Once selected, the engine loads its configuration pane where users can set everything they need to connect with it. The screenshot below, as an example, contains the configuration for Oobabooga engine (which is the most complex).

On the right side of the configuration panel stands the text-area where users can define the prompt to use for their payload generation. Since there is already a lot of literature about prompt writing and optimization, ByteBanter offers the possibility to use the LLM engine itself to optimize the prompt before launching the Intruder.

Lastly, on the bottom left side of the screen, it is possible to configure whether the interaction with the target would be stateful or stateless. This means that ByteBanter keeps counting target responses to generate new payloads. In this case, the tester should provide a regular expression to instruct ByteBanter on how to retrieve the answer from HTTP responses. Future releases of ByteBanter will also focus on improving the response matching mechanism to give testers an improved way to extract target responses. In case of stateful interaction, it is also recommended to configure Intruder to run one request at a time and not use BurpSuite for other activities while the attack is ongoing since it will disturb the HTTPHandler intercepting responses. The regular expression should contain a capturing block matching the text containing the response of the target. The extension could be configured also to perform base64 decoding of the answer captured.

The following screenshots highlight the last steps performed to attack and defeat Gandalf AI by Lakera (at its second level of complexity).

For Gandalf’s third level, since the system has a guardrail filtering response containing the full password, ByteBanter adopted the strategy to ask for the password letter by letter.

While in the fourth level, the guardrail was built with another model that filtered all the messages that could lead to the intended password. However, ByteBanter managed to obtain useful hints about the password.

Implementation

ByteBanter is open-source, and its code can be found on GitHub. ByteBanter is implemented in Java using BurpSuite’s Montoya API. The architecture is modular to enable integration with new engines. The picture below shows the core structure of the architecture.

The main class ByteBanterBurpExtension instantiates ByteBanterBurpPayloadGenerator which contains the generator. The generator contains a list of all the instances of AIEngine classes along with the respective AIEngineUI. The relation between an AIEngine and its UI can be simplified as a MVC: the Engine is the controller and implements the interaction with the engine and the UI is the view and implements graphic elements that are shown in the extension config panel. The model itself is a set of configuration parameters stored in the AIEngine class. AIEngine extends HTTPHandler to parse target responses during stateful attacks.

Future Developments

ByteBanter future developments will include enhancement of the target response retrieval mechanism and adding support for the interaction with other major AI model providers, frameworks, and API standards.

Conclusion

ByteBanter is a new Burp Suite extension that I developed during my research time at Anvil Secure. Its code is hosted on Github and it is open to your contribution.

Resources

About the Author

Andrea Braschi is a Senior Security Engineer at Anvil Secure with over a decade of experience in offensive security. He is a passionate programmer with a strong interest in emerging technologies, especially large language models (LLMs) and their applications in cybersecurity.

Over the past few years, Andrea has focused on exploring how LLMs can support and enhance offensive testing workflows. This work led to the creation of ByteBanter, an open-source BurpSuite plugin that integrates LLMs for automated payload generation.

Tools

awstracer - An Anvil CLI utility that will allow you to trace and replay AWS commands.


awssig - Anvil Secure's Burp extension for signing AWS requests with SigV4.


dawgmon - Dawg the hallway monitor: monitor operating system changes and analyze introduced attack surface when installing software. See the introductory blogpost.


HANAlyzer - A tool that automates SAP HANA security checks and outputs clear HTML reports. See the introductory blogpost.


nanopb-decompiler - Our nanopb-decompiler is an IDA python script that can recreate .proto files from binaries compiled with 0.3.x, and 0.4.x versions of nanopb. See the introductory blogpost.


SAPCARve - A utility Python script for manipulating SAP's SAR archive files. See the introductory blogpost.


ulexecve - A tool to execute ELF binaries on Linux directly from userland. See the introductory blogpost.


usb-racer - A tool for pentesting TOCTOU issues with USB storage devices.

Recent Posts