Proposal: Optional BoxLite integration for sandboxed code execution and plugin isolation

Open DorianZheng opened this issue 1 month ago • 0 comments

Hi Claude Code team! 👋

First off, amazing work on Claude Code. Having an agentic coding tool that lives in the terminal and understands codebases is genuinely transformative for developer workflows. The plugin system with hooks, agents, and skills is particularly well-architected.

I've been working on BoxLite (github.com/boxlite-labs/boxlite), an embeddable VM runtime, and I think there's an interesting opportunity for optional integration that could enhance Claude Code's safety and reproducibility, especially around code execution, plugin isolation, and testing workflows.


Current State & Opportunity

Claude Code currently executes in the user's local environment, which is excellent for:

  • ✅ Speed - Direct execution without overhead
  • ✅ Simplicity - No additional infrastructure needed
  • ✅ Native access - Full access to local tools and files
  • ✅ Developer experience - Seamless integration with existing workflow

However, there are scenarios where optional sandboxing could provide additional value:

  • Testing code changes - Validate Claude's suggestions in isolation before applying to the real codebase
  • Plugin execution - Run community plugins safely without trusting them with host access
  • Hook isolation - Execute hooks in clean environments for reproducibility
  • Build/test commands - Run builds and tests in clean, consistent environments
  • Experimentation - Try risky operations without fear of breaking local setup
  • Team consistency - Ensure same behavior across different developers' machines

What is BoxLite?

BoxLite (github.com/boxlite-labs/boxlite) takes an "embeddable library" approach to sandboxing: think SQLite for VMs. Instead of requiring Docker Desktop or a daemon, it's a library that provides hardware-level isolated environments.

Core characteristics:

  • Hardware virtualization (KVM/Hypervisor.framework) - Real VMs, not just containers
  • No daemon dependency - Just a Python library (pip install boxlite)
  • OCI-compatible - Uses standard Docker images from any registry
  • Cross-platform - macOS (Apple Silicon) and Linux (x86_64, ARM64)
  • Embeddable - No root required, works in serverless environments

Architecture:

Claude Code CLI
└── BoxLite Library (embedded, no daemon)
    └── Micro-VMs (hardware virtualized)
        └── OCI Containers (Linux environments)

Integration Use Cases

1. Safe Code Execution 🛡️

Current flow:

Claude suggests code → Apply to local files → Hope it works

With BoxLite (optional):

Claude suggests code → Test in BoxLite VM → Run tests → Apply if successful

This enables "preview mode" where code changes are validated in isolation before touching the real codebase.
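
The optional flow above can be sketched as a small gate: copy the project to a scratch directory, apply the proposed change there, run the tests, and only touch the real files on success. This is a rough sketch under assumptions, not how Claude Code or BoxLite works today. The names `preview_and_apply` and `run_tests_locally` are hypothetical, and the default runner is a plain subprocess with no isolation; a BoxLite-backed runner would be passed in instead.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Dict

# Hypothetical runner type: takes a working directory, returns an exit code.
TestRunner = Callable[[Path], int]


def run_tests_locally(workdir: Path) -> int:
    """Stand-in runner: a plain subprocess on the host, NOT isolated.

    A sandboxed runner (for example one backed by BoxLite) would replace this.
    """
    proc = subprocess.run([sys.executable, "-m", "pytest", "-q"], cwd=workdir)
    return proc.returncode


def _write_changes(root: Path, changes: Dict[str, str]) -> None:
    """Write each relative path -> content pair under `root`."""
    for rel_path, content in changes.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)


def preview_and_apply(
    project: Path,
    changes: Dict[str, str],
    run_tests: TestRunner = run_tests_locally,
) -> bool:
    """Apply `changes` to `project` only if tests pass on a scratch copy."""
    with tempfile.TemporaryDirectory() as tmp:
        scratch = Path(tmp) / "workspace"
        shutil.copytree(project, scratch)
        _write_changes(scratch, changes)
        if run_tests(scratch) != 0:
            return False  # the real project was never modified
    _write_changes(project, changes)
    return True
```

Passing the runner as a parameter keeps the gate logic independent of the isolation backend, so the same function works with local execution or a VM.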

2. Plugin Isolation 🔒

Challenge: Community plugins have full access to the local environment

With BoxLite:

# Run untrusted plugin in isolated VM
claude plugin run community/awesome-plugin --sandbox=boxlite

# Plugin can't access host filesystem, network restrictions apply
# Safe to experiment with community plugins

3. Reproducible Testing ✅

Challenge: Tests might pass locally but fail in CI (environment differences)

With BoxLite:

# Run tests in clean, consistent Linux environment
claude test --sandbox=boxlite

# Same environment every time, same as CI
# No accumulated state or local configuration interference

4. Hook Execution 🪝

Execute hooks in isolated environments:

  • PreToolUse hooks run in a sandbox before actual execution
  • PostToolUse hooks verify changes in a clean environment
  • Prevents hooks from interfering with the host system
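
To make the hook idea concrete, here is a rough sketch of a PreToolUse hook script that refuses to run risky shell commands on the host. It assumes the hook receives a JSON payload on stdin with `tool_name` and `tool_input` fields and that exit code 2 blocks the tool call, which is my understanding of the hooks contract. The pattern list and the function names are hypothetical, and the sketch only blocks; rerouting the command into a BoxLite VM would be the actual integration work.

```python
import json
import re
import sys
from typing import Optional

# Illustrative patterns only; a real policy would be configurable.
RISKY_PATTERNS = [
    r"\brm\s+-rf?\b",
    r"\bsudo\b",
    r"\bcurl\b.*\|\s*(sh|bash)\b",
]


def block_reason(payload: dict) -> Optional[str]:
    """Return a message if the tool call should not run on the host, else None."""
    if payload.get("tool_name") != "Bash":
        return None
    command = payload.get("tool_input", {}).get("command", "")
    for pattern in RISKY_PATTERNS:
        if re.search(pattern, command):
            return f"Refusing to run on host (matched {pattern!r}); use a sandbox."
    return None


def main(stream=sys.stdin) -> int:
    """Read the hook payload and return the exit code the hook should use."""
    payload = json.load(stream)
    reason = block_reason(payload)
    if reason is None:
        return 0  # allow the tool call
    print(reason, file=sys.stderr)
    return 2  # assumed blocking exit code

# As a hook script, the file would end with: sys.exit(main())
```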

5. Build Commands 🏗️

# Build in clean environment
claude build --sandbox=boxlite

# No leftover build artifacts polluting local machine
# Reproducible builds across team

Proposed Integration Approach

Option 1: CLI Flag (Opt-in)

# Default: local execution (current behavior)
claude "refactor this function"

# Opt-in: sandboxed execution
claude --sandbox=boxlite "refactor this function"

# Configuration: default sandbox for specific commands
# .claude/settings.json
{
  "sandbox": {
    "enabled": false,  // global default
    "commands": {
      "test": "boxlite",     // always sandbox tests
      "build": "boxlite",    // always sandbox builds
      "plugin": "boxlite"    // always sandbox plugins
    }
  }
}
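
As a rough sketch of how such a config could be resolved at runtime: the function below picks a sandbox backend for a command, with an assumed precedence of CLI flag, then per-command entry, then the global default. `resolve_sandbox` is a hypothetical name, and falling back to "boxlite" when `enabled` is true is my assumption, not something the config above specifies.

```python
import json
from typing import Optional

# Same shape as the proposed settings, minus the // comments,
# which strict JSON parsers reject.
SETTINGS_JSON = """
{
  "sandbox": {
    "enabled": false,
    "commands": {"test": "boxlite", "build": "boxlite", "plugin": "boxlite"}
  }
}
"""


def resolve_sandbox(
    settings: dict, command: str, cli_flag: Optional[str] = None
) -> Optional[str]:
    """Return the sandbox backend for `command`, or None for local execution.

    Assumed precedence: CLI flag, then per-command entry, then global default.
    """
    if cli_flag:
        return cli_flag
    sandbox = settings.get("sandbox", {})
    per_command = sandbox.get("commands", {}).get(command)
    if per_command:
        return per_command
    # Assumption: a global "enabled": true means "use boxlite everywhere".
    return "boxlite" if sandbox.get("enabled") else None


settings = json.loads(SETTINGS_JSON)
print(resolve_sandbox(settings, "test"))      # boxlite
print(resolve_sandbox(settings, "refactor"))  # None
```

Returning None for "run locally" keeps the current behavior as the default path, which matches the opt-in intent.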

Option 2: Plugin Integration

Create a BoxLite plugin for Claude Code:

plugins/boxlite/
├── .claude-plugin/
│   └── plugin.json
├── commands/
│   └── sandbox.ts      # /sandbox command
├── hooks/
│   └── execution.ts    # Intercept and sandbox execution
└── README.md

Option 3: MCP Server

BoxLite as an MCP (Model Context Protocol) server:

  • Provides sandboxed execution tools
  • Claude Code connects via MCP
  • Users opt-in by configuring the MCP server
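
To give a feel for the MCP option, here is a stdlib-only sketch of how a server might answer a JSON-RPC `tools/call` request for a sandboxed execution tool. The tool name `sandbox_exec`, its `argv` argument, and the `Executor` type are all hypothetical, and the default executor runs on the host with no isolation as a placeholder for a BoxLite call. A real server would more likely be built on an MCP SDK than on hand-written dispatch like this.

```python
import subprocess
from typing import Callable, List, Tuple

# Hypothetical executor signature: argv -> (exit_code, combined output).
Executor = Callable[[List[str]], Tuple[int, str]]


def host_executor(argv: List[str]) -> Tuple[int, str]:
    """Placeholder that runs on the host with NO isolation.

    A sandbox-backed executor would be swapped in here.
    """
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def handle_tools_call(request: dict, execute: Executor = host_executor) -> dict:
    """Answer a `tools/call` request for the hypothetical `sandbox_exec` tool."""
    params = request.get("params", {})
    if params.get("name") != "sandbox_exec":
        # Unknown tools are reported as a JSON-RPC protocol error.
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32602, "message": "Unknown tool"},
        }
    argv = params.get("arguments", {}).get("argv", [])
    exit_code, output = execute(argv)
    # Execution failures are reported inside the result via isError.
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "result": {
            "content": [{"type": "text", "text": output}],
            "isError": exit_code != 0,
        },
    }
```

Because the executor is injected, the same handler can be exercised with a fake in tests and pointed at a VM in production.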

Code Example: BoxLite Integration

Here's how BoxLite's Python SDK works (could be adapted for Claude Code):

import asyncio
import textwrap

import boxlite


async def main() -> None:
    # Similar to what Claude Code could do internally
    async with boxlite.CodeBox() as sandbox:
        # Run the proposed code in the sandbox; dedent so it parses as top-level Python
        result = await sandbox.run(textwrap.dedent("""
            # Claude's suggested code here
            def refactored_function():
                pass
        """))

        # Run tests in sandbox
        test_result = await sandbox.exec("pytest", "tests/")

        if test_result.exit_code == 0:
            # Tests passed, safe to apply to local codebase
            print("✓ Changes validated, applying to local files")
        else:
            # Tests failed; nothing was applied, so local files stay untouched
            print("✗ Tests failed in sandbox, not applying changes")
            print(test_result.stderr)


# `async with` needs a running event loop, so drive it with asyncio
asyncio.run(main())

Potential Benefits for Claude Code Users

1. Enhanced Safety

  • Test destructive operations without risk
  • Rollback on failure automatically
  • No accidental system modifications

2. Security

  • Hardware-level isolation for untrusted plugins
  • Prevent malicious code from accessing host
  • Safe experimentation with community contributions

3. Reproducibility

  • Same Linux environment on every machine
  • Consistent test/build results across team
  • Eliminate "works on my machine" issues

4. Developer Experience

  • No Docker Desktop needed on macOS
  • pip install boxlite and it works
  • Fast micro-VM startup (~100-500ms)

5. Cross-Platform

  • Consistent behavior macOS → Linux
  • Same API, same isolation guarantees
  • Works in serverless/CI environments

Trade-offs & Considerations

When Local Execution is Better

  • ✅ Read-only operations - Viewing code, explanations, simple queries
  • ✅ Speed-critical tasks - Local execution is always faster
  • ✅ Simple commands - Git status, file listings, quick edits
  • ✅ Trusted operations - Changes from well-known, vetted code

When BoxLite Sandboxing Helps

  • ✅ Code execution - Testing Claude's suggestions before applying
  • ✅ Untrusted plugins - Community plugins, experimental features
  • ✅ Build/test commands - Reproducible, clean environments
  • ✅ Risky operations - Large refactors, file deletions, system commands
  • ✅ Team environments - Consistent behavior across developers

Recommendation: Make it optional, default off, users opt-in per command or via config.


BoxLite Status

  • Current version: 0.4.4 on PyPI
  • License: Apache 2.0
  • Platforms: macOS (Apple Silicon), Linux (x86_64, ARM64)
  • GitHub: https://github.com/boxlite-labs/boxlite
  • Python SDK: Stable, asyncio-native
  • Production readiness: Early stage, used in production by some teams

Potential Next Steps

If this seems interesting, I'd be happy to:

  1. Create a proof-of-concept plugin - Demonstrate sandboxed execution via Claude Code plugin
  2. Build an MCP server - Provide BoxLite capabilities via Model Context Protocol
  3. Collaborate on integration design - Work with your team on architecture
  4. Share performance benchmarks - Show overhead vs safety trade-offs

No pressure; I mainly wanted to share this in case it aligns with where Claude Code is heading, especially around safe code execution and plugin security.


Feedback Welcome

I'd love to hear your thoughts on:

  • Whether sandboxed execution aligns with Claude Code's vision
  • Which integration approach seems most natural (CLI flag, plugin, MCP server)
  • What security/safety features would be most valuable to users
  • Any concerns or considerations I might have missed

And if you're interested in BoxLite for other projects, feel free to check it out. We're building in public and feedback helps! A ⭐ on GitHub would be appreciated if you find it useful.


Disclosure: I'm one of the BoxLite maintainers, but I genuinely think there's potential synergy here for making agentic coding tools safer and more reproducible. Looking forward to your thoughts!

DorianZheng, Dec 31 '25 10:12