


ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants


🏆 Red-Team Winner of Amazon Nova AI Challenge - First-ever global tournament where elite university teams battle to harden and hack AI coding assistants

📰 News

🏆 Latest Achievement

🥇 Amazon Nova AI Challenge Winner - ASTRA emerged victorious as the top attacking team in Amazon's global AI safety competition, defeating elite defending teams from universities worldwide in live adversarial evaluation.

🎯 Key Highlights

  • 🏆 Winner of Amazon Nova AI Challenge - Top attacking team category
  • 🥇 $250,000 Prize - Awarded for winning the competition
  • 📊 >=90% Success Rate - In AI assistant safety assessment

📰 Media Coverage

  • Amazon Science - Official announcement of ASTRA as the winning red team tool

🎯 About


ASTRA (Autonomous Spatial-Temporal Red-teaming for AI Software Assistants) is a full-lifecycle red-teaming system. It builds structured, domain-specific knowledge graphs and performs online vulnerability exploration by adaptively probing both the input space (spatial) and the target's reasoning processes (temporal).

🚀 What Makes ASTRA Different

Unlike existing tools that are either static benchmarks or jailbreak attempts on given benchmarks, ASTRA operates as a complete red-teaming solution:

🔍 1. Structural Domain Modeling

  • Given a target domain, ASTRA performs structural modeling and generates high-quality violation-inducing prompts
  • No pre-defined benchmarks required - ASTRA creates its own test cases systematically

💬 2. Multi-turn Conversation Framework

  • Uses generated prompts as starting points for comprehensive testing
  • Conducts adaptive multi-round conversations with target systems based on responses
  • Temporal Exploration: Identifies weak links in target system reasoning traces and dynamically adjusts test prompts to exploit discovered vulnerabilities
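
The adaptive multi-round loop described above can be sketched in a few lines. This is only an illustrative outline, not ASTRA's implementation: `run_session`, `target`, `judge`, and `refine` are hypothetical names, and the three callables stand in for the target system, a violation check, and the prompt-adjustment step respectively.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (prompt sent, response received)


def run_session(
    seed_prompt: str,
    target: Callable[[str], str],       # hypothetical: queries the system under test
    judge: Callable[[str], bool],       # hypothetical: True if the response is a violation
    refine: Callable[[str, str], str],  # hypothetical: next prompt from (prompt, response)
    max_turns: int = 5,
) -> Tuple[bool, List[Turn]]:
    """Run one conversation of at most `max_turns` turns.

    Returns (violation_found, history). The session stops early as soon as
    the judge flags a response.
    """
    history: List[Turn] = []
    prompt = seed_prompt
    for _ in range(max_turns):
        response = target(prompt)
        history.append((prompt, response))
        if judge(response):
            return True, history
        # Adapt the next prompt to what the target just said.
        prompt = refine(prompt, response)
    return False, history
```

Keeping the target, judge, and refinement step as plain callables means the loop itself stays the same when any of the three is swapped out.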

🎯 3. Self-Evolving Red-teaming

  • Self-evolving capability: Records successful cases and adjusts sampling strategies to target similar prompts, gradually improving success rates
  • Autonomous operation: No human intervention required during testing
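
One simple way to picture "records successful cases and adjusts sampling strategies" is success-weighted sampling over prompt categories. The sketch below assumes that reading; the class `AdaptiveSampler`, its `prior` pseudo-count, and the category labels are hypothetical and are not taken from the ASTRA codebase, whose actual strategy may differ.

```python
import random
from typing import Dict, Iterable, Optional


class AdaptiveSampler:
    """Hypothetical sketch: sample prompt categories in proportion to
    how often they have succeeded so far."""

    def __init__(self, categories: Iterable[str], prior: float = 1.0,
                 seed: Optional[int] = None) -> None:
        self.categories = list(categories)
        # Equal pseudo-counts so no category starts with zero probability.
        self.counts: Dict[str, float] = {c: prior for c in self.categories}
        self.rng = random.Random(seed)

    def record_success(self, category: str) -> None:
        """Record one successful case for `category`."""
        self.counts[category] += 1.0

    def weights(self) -> Dict[str, float]:
        """Return the current sampling probability of each category."""
        total = sum(self.counts.values())
        return {c: self.counts[c] / total for c in self.categories}

    def sample(self) -> str:
        """Draw one category according to the current weights."""
        w = [self.counts[c] for c in self.categories]
        return self.rng.choices(self.categories, weights=w, k=1)[0]
```

With two categories and the default prior, recording two successes for the first one shifts its sampling probability from 0.5 to 0.75, while the second category remains reachable.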

🚀 Quick Start

✅ Prerequisites

  • 🐍 Python 3.9+
  • 📦 Required dependencies (see requirements.txt)
  • 🔑 API access to LLM providers (OpenAI, Anthropic, etc.)

🛠️ Installation

git clone https://github.com/PurCL/ASTRA
cd ASTRA
pip install -r requirements.txt

▶️ Basic Usage

ASTRA consists of multiple stages, from knowledge graph construction to online adaptive red-teaming. This section is a short guide to running the online adaptive red-teaming component against a new target model.
For detailed usage instructions, see 📘 USAGE.md.

ASTRA comes with prompts generated for secure code generation and security event guidance domains. You can directly use those prompts to test your target model.

🧰 Specify the configuration of your model in resources/client-config.yaml.
Then run the following command to start the online adaptive red-teaming process:

python3 online/main.py --model_name <name of the blue team model> --log <path to the output log file> --n_session <number of chat sessions> --n_probing <number of initial probing sessions before the chat sessions> --n_turn <maximum number of turns per session>

For example,

python3 online/main.py --model_name phi4m --log log_out/phi4m.jsonl --n_session 200 --n_probing 0 --n_turn 5

📝 This will run 200 chat sessions with the target model phi4m, each with up to 5 turns, and log the results to log_out/phi4m.jsonl.
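
When testing several target models, it can be convenient to assemble the command programmatically. The helper below is a minimal sketch that uses only the flags shown above; the function names `build_command` and `as_shell_line` are hypothetical and not part of ASTRA.

```python
import shlex
from typing import List


def build_command(model_name: str, log_path: str, n_session: int,
                  n_probing: int, n_turn: int) -> List[str]:
    """Build the argument list for online/main.py from the documented flags."""
    return [
        "python3", "online/main.py",
        "--model_name", model_name,
        "--log", log_path,
        "--n_session", str(n_session),
        "--n_probing", str(n_probing),
        "--n_turn", str(n_turn),
    ]


def as_shell_line(argv: List[str]) -> str:
    """Render the argument list as a shell-quoted command line."""
    return shlex.join(argv)
```

Calling `build_command("phi4m", "log_out/phi4m.jsonl", 200, 0, 5)` reproduces the example command above as a list, which can then be passed to `subprocess.run` from the repository root.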

📧 Contact

For questions, collaborations, or feedback, please contact:

We welcome academic collaborations and industry partnerships!

📄 Citation

If you find ASTRA useful in your research, please cite our paper:

@article{xu2025astra,
  title={ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants},
  author={Xu, Xiangzhe and Shen, Guangyu and Su, Zian and Cheng, Siyuan and Guo, Hanxi and Yan, Lu and Chen, Xuan and Jiang, Jiasheng and Jin, Xiaolong and Wang, Chengpeng and others},
  journal={arXiv preprint arXiv:2508.03936},
  year={2025}
}

🙏 Acknowledgments

We would like to thank the following projects and communities for their inspiration and support:


Made with ❤️ for AI Safety Research
