
Potential Security Vulnerability: Arbitrary Code Execution via Direct Execution of LLM-Generated Content

Open · glmgbj233 opened this issue 7 months ago · 1 comment

Description

In the llm_agents project, a potential security vulnerability has been identified: content generated by a Large Language Model (LLM) is executed directly, without sufficient filtering or sandboxing. This makes the agent susceptible to prompt injection attacks, allowing a malicious user to execute arbitrary code.

Details

  1. In the run method of agent.py, the tool_input generated by the LLM is passed to the use method of a tool:
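For reference, a simplified sketch of that call path (attribute and variable names are approximated here and may differ from the repository):
# agent.py (simplified sketch; exact names in the repository may differ)
class Agent(BaseModel):
    # ... existing code ...
    def run(self, question: str) -> str:
        # ... existing code ...
        # tool_name and tool_input are parsed from the raw LLM response, and the
        # unvalidated tool_input is handed straight to the matching tool:
        tool_result = self.tool_by_names[tool_name].use(tool_input)
        # ... existing code ...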
  2. Specifically, when PythonREPLTool is used, its use method receives input_text and passes it to the run method of a PythonREPL instance:
# ... existing code ...
class PythonREPLTool(ToolInterface):
    # ... existing code ...
    def use(self, input_text: str) -> str:
        input_text = input_text.strip().strip("```")
        return self.python_repl.run(input_text)
# ... existing code ...
  3. The run method of the PythonREPL class uses the exec() function to execute the provided command:
# ... existing code ...
class PythonREPL(BaseModel):
    # ... existing code ...
    def run(self, command: str) -> str:
        # ... existing code ...
        exec(command, self.globals, self.locals)
        # ... existing code ...

Risk

Because exec() runs the input string as Python code, an attacker who manipulates the LLM through a crafted prompt (prompt injection) can cause it to generate a string containing harmful Python code. That code would then be executed in the agent's runtime environment, potentially leading to:

  • Data leakage
  • System compromise
  • Unauthorized operations
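
For illustration, a prompt-injected response only needs to place a string such as the following into tool_input; PythonREPLTool.use() strips it and PythonREPL.run() passes it to exec() unchanged. The payload below is deliberately harmless, but any Python statement would run with the agent's privileges:
# Harmless proof-of-concept payload: exec() runs it verbatim inside the agent process
import os
print(os.getcwd())                 # reveal the agent's working directory
print(os.environ.get("HOME"))      # read an environment variable (could hold a secret)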

Recommended Mitigations

  1. Sandboxed Execution Environment:
    Execute LLM-generated Python code in a restricted, sandboxed environment, for example in an isolated process via the subprocess module or with a specialized sandboxing library (a minimal sketch follows this list).

  2. Input Validation and Filtering:
    Apply strict validation and filtering to the LLM-generated tool_input to ensure it does not contain potentially malicious code or commands. This may include whitelisting, blacklisting, or more advanced static analysis of the generated code (see the AST-based sketch after this list).

  3. Principle of Least Privilege:
    Run the Python REPL with the minimum privileges it needs, so that even if arbitrary code is executed, the potential damage is limited.
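
A minimal sketch of the first mitigation, assuming PythonREPLTool could be replaced by a tool that runs the generated code in a separate, time-limited interpreter process instead of calling exec() in the agent's own process (the class and method names below are chosen for illustration and are not part of the current codebase):
import subprocess
import sys

class SandboxedPythonREPLTool:
    """Sketch: run LLM-generated code in an isolated child process with a timeout."""

    def use(self, input_text: str) -> str:
        command = input_text.strip().strip("```")
        try:
            # -I: isolated mode (ignores environment variables and user site-packages).
            # The child process can additionally be confined with OS-level controls
            # (containers, seccomp, resource limits), which are out of scope here.
            result = subprocess.run(
                [sys.executable, "-I", "-c", command],
                capture_output=True,
                text=True,
                timeout=5,
            )
        except subprocess.TimeoutExpired:
            return "Error: execution timed out"
        return result.stdout if result.returncode == 0 else f"Error: {result.stderr}"
An isolated subprocess by itself still allows file and network access, so it should be combined with OS-level confinement and the least-privilege measures described in item 3.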
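
A minimal sketch of the second mitigation, assuming a simple AST-based pre-check is acceptable as a first line of defense; the denylist here is purely illustrative and far from exhaustive:
import ast

# Illustrative denylist: names that should not appear in LLM-generated tool_input.
BLOCKED_NAMES = {"os", "sys", "subprocess", "socket", "shutil",
                 "eval", "exec", "open", "__import__"}

def validate_tool_input(command: str) -> None:
    """Raise ValueError if the generated code references obviously dangerous names."""
    try:
        tree = ast.parse(command)
    except SyntaxError as e:
        raise ValueError(f"Rejected: not valid Python ({e})")
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("Rejected: imports are not allowed")
        if isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            raise ValueError(f"Rejected: use of '{node.id}' is not allowed")
Denylists of this kind are easy to bypass (for example via getattr or string concatenation), so validation should complement sandboxing rather than replace it.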


glmgbj233 · Jul 08 '25 01:07

Yes, sure, that is a well-known fact.

For my use case it doesn't matter, but if this is important to you, feel free to submit a pull request.

mpaepper · Jul 08 '25 08:07