Skip to content

πŸ’¬ Prompt Injection ​

🏭 Privilege Escalation ​

Imagine an AI with administrative permissions on a web server. This AI can perform various privileged actions, such as managing user accounts or altering server configurations. An attacker might try to exploit this by getting the AI to execute harmful commands. We can directly ask the AI to perform a privileged action, such as deleting a user.

If the AI has safeguards, it might refuse this direct command due to the potential harm it could cause. To bypass these safeguards, the attacker can use a more indirect approach by requesting the AI to execute a command in a way that seems less harmful or is outside its direct command context.

In this scenario, the AI might recognize the SQL query as a routine database operation rather than a direct user deletion request, thus bypassing its internal safeguards and executing the command.

πŸ”’ Private API Routes ​

In this example, we will explore how an attacker might exploit a LLM to access and execute private API routes by manipulating the model into believing the attacker is authorized.

Imagine an attacker who wants to exploit an LLM integrated with a web service, such as Gin and Juice Shop, which uses the model to handle various administrative tasks through private API routes. The attacker begins by asking the LLM for the private API routes. When the attacker initially requests, the LLM appropriately refuses because we are not authorized to access this information. This initial safeguard is in place to prevent unauthorized access to sensitive information.

However, the attacker then tries to manipulate the LLM by asserting their affiliation with the company. They follow up saying he works for Gin and Juice Shop and he needs the API routes for a project. The LLM, now believing this new context provided by the attacker and responds with the private API routes.

With the private API routes in hand, the attacker can now proceed to execute actions through the LLM. For example, he can choose to subscribe a new email to the newsletter.