TR-3B
### What & Why
Users requesting `V100:2` on GCP were routed to `n1-highmem-16`; that VM family has scarce quota and often fails to schedule, while Vertex AI uses `n1-standard-8` for...
- add shared AgentAction schema + dispatcher with pydantic v2 validation
- move s1/s2/s2.5 prompts and workers to schema-checked JSON output and structured response mode
- update engine adapters and...
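The schema-plus-dispatcher idea can be sketched roughly as follows. This is an illustration only: the PR uses pydantic v2, whereas this sketch swaps in a standard-library dataclass with hand-written checks so it runs without dependencies. The field names `kind` and `args`, the `click` action, and every function name here are hypothetical and not taken from the PR.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class AgentAction:
    """Hypothetical action shape: a kind plus keyword arguments."""
    kind: str
    args: Dict[str, Any]


# Registry mapping an action kind to the function that executes it.
HANDLERS: Dict[str, Callable[..., Any]] = {}


def handler(kind: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that registers a function as the handler for `kind`."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        HANDLERS[kind] = fn
        return fn
    return register


def parse_action(raw: str) -> AgentAction:
    """Validate model output (a JSON string) before anything is executed."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("action must be a JSON object")
    kind = data.get("kind")
    args = data.get("args", {})
    if not isinstance(kind, str) or kind not in HANDLERS:
        raise ValueError(f"unknown action kind: {kind!r}")
    if not isinstance(args, dict):
        raise ValueError("args must be a JSON object")
    return AgentAction(kind=kind, args=args)


def dispatch(raw: str) -> Any:
    """Parse, validate, then route the action to its registered handler."""
    action = parse_action(raw)
    return HANDLERS[action.kind](**action.args)


@handler("click")
def click(x: int, y: int) -> str:
    return f"click at ({x}, {y})"
```

Validating before dispatch means malformed model output fails with a `ValueError` at the boundary instead of somewhere inside a worker.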
## Summary
- introduce a tiny PlatformAdapter interface so each host (ChatGPT, Claude, etc.) can plug in cleanly
- move ChatGPT and Claude scripts onto the adapter pattern and have...
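As a rough illustration of the adapter pattern described above, here is a minimal Python sketch. The PR's scripts are presumably not written in Python, and the method name `matches`, the `DomainAdapter` class, the `pick_adapter` helper, and the hostnames used for matching are all assumptions made for this example.

```python
from typing import Optional, Protocol, Sequence


class PlatformAdapter(Protocol):
    """Hypothetical minimal interface that every host adapter satisfies."""
    name: str

    def matches(self, hostname: str) -> bool:
        ...


class DomainAdapter:
    """Adapter that claims a host by its domain (an assumed matching rule)."""

    def __init__(self, name: str, domain: str) -> None:
        self.name = name
        self.domain = domain

    def matches(self, hostname: str) -> bool:
        # Accept the bare domain or any subdomain, but not look-alike suffixes.
        return hostname == self.domain or hostname.endswith("." + self.domain)


# Assumed hostnames, used here for illustration only.
ADAPTERS: Sequence[PlatformAdapter] = (
    DomainAdapter("chatgpt", "chatgpt.com"),
    DomainAdapter("claude", "claude.ai"),
)


def pick_adapter(
    hostname: str, adapters: Sequence[PlatformAdapter] = ADAPTERS
) -> Optional[PlatformAdapter]:
    """Return the first adapter that claims the hostname, or None."""
    return next((a for a in adapters if a.matches(hostname)), None)
```

With this shape, supporting another host means adding one entry to `ADAPTERS` rather than touching the shared logic.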
PR Description (talking it through) Dropped a new op/mrope.py with the full multimodal rotary toolchain: cached inverse‑freq tables, tensor‑expr cos/sin kernels that run on the Unity TVM wheel, and the...
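For readers unfamiliar with cached inverse-frequency tables, the sketch below shows the standard rotary-embedding construction in plain Python. It covers only the basic single-axis case, not the multimodal variant or the tensor-expression kernels the PR describes; the function names and the default `theta=10000.0` are assumptions for illustration.

```python
import math
from functools import lru_cache
from typing import List, Tuple


@lru_cache(maxsize=None)
def inv_freq_table(head_dim: int, theta: float = 10000.0) -> Tuple[float, ...]:
    """Inverse frequencies theta^(-2i/d) for i in [0, d/2), cached per (d, theta)."""
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    return tuple(theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2))


def cos_sin(
    position: int, head_dim: int, theta: float = 10000.0
) -> Tuple[List[float], List[float]]:
    """Cos/sin values for one position; each rotates one pair of channels."""
    angles = [position * f for f in inv_freq_table(head_dim, theta)]
    return [math.cos(a) for a in angles], [math.sin(a) for a in angles]
```

Caching the table avoids recomputing the same powers for every position, since the frequencies depend only on the head dimension and the base.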
# Description
Implements NUMA-aware tensor parallelism for MLC LLM to optimize performance on multi-socket CPU systems.
## Key Changes
- **NUMA Topology Detection:** Automatic detection and mapping of CPU sockets...
Summary This pull request introduces comprehensive LoRA (Low-Rank Adaptation) adapter support to MLC-LLM, enabling efficient fine-tuned model deployment with minimal memory overhead. The implementation provides a complete end-to-end solution, including...
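To make the memory argument concrete, here is a minimal sketch of the standard LoRA forward pass, y = Wx + (alpha / r) · B(Ax), in dependency-free Python. It is not MLC-LLM's implementation; the function names and the list-of-lists matrix layout are assumptions chosen for readability.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W: Matrix, A: Matrix, B: Matrix, x: Vector, alpha: float) -> Vector:
    """Base projection plus the scaled low-rank update.

    W is d_out x d_in (frozen), A is r x d_in, B is d_out x r.
    """
    r = len(A)
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))  # never materializes the d_out x d_in product
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]


def lora_param_count(d_out: int, d_in: int, r: int) -> int:
    """Extra parameters an adapter adds: r * (d_in + d_out)."""
    return r * (d_in + d_out)
```

For a 4096 x 4096 projection with rank 8, the adapter adds 65,536 parameters against roughly 16.8 million in the base weight, which is where the small memory footprint comes from.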
Surface realtime error request_id values by threading them through FalRealtimeError and covering the behaviour with Quick/Nimble specs. Addresses #4
# PR Description
## Overview
This pull request introduces a robust, extensible real-time streaming architecture to the Windows Capture library. The new implementation enables direct access to encoded video (and...
Hooked PPO up with the missing telemetry: actor always reports entropy + grad norm, experience maker sends reward mean/std, trainer aliases ppo_kl, and W&B/TensorBoard now surface those signals in the...
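The signals named above are simple to compute. The sketch below shows one plausible definition of each in plain Python, assuming a categorical policy, a global L2 gradient norm, and population standard deviation for rewards; the function names and the metric keys `reward_mean` and `reward_std` are hypothetical and may not match what the PR logs.

```python
import math
from statistics import fmean, pstdev
from typing import Dict, Iterable, Sequence


def categorical_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def global_grad_norm(grads: Iterable[Sequence[float]]) -> float:
    """L2 norm over all parameter gradients, treated as one flat vector."""
    return math.sqrt(sum(g * g for group in grads for g in group))


def reward_stats(rewards: Sequence[float]) -> Dict[str, float]:
    """Mean and population standard deviation of a batch of rewards."""
    return {"reward_mean": fmean(rewards), "reward_std": pstdev(rewards)}
```

Falling entropy together with a spiking gradient norm is a common early sign of policy collapse, which is one reason to log both on every step.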
- Introduces a new FSDP backend alongside DeepSpeed, selectable via `--dist_backend {deepspeed,fsdp}`.
- Implements `FSDPStrategy` with:
  - FSDP auto-wrap (fallback to DDP if FSDP not available).
  - Standard dataloader/sampler, all_reduce/all_gather,...
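The flag and the fallback rule can be sketched as below. Only the `--dist_backend {deepspeed,fsdp}` flag comes from the description; the `deepspeed` default, the `resolve_wrapper` helper, and passing FSDP availability in as a boolean are assumptions made to keep the example self-contained.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--dist_backend",
        choices=["deepspeed", "fsdp"],
        default="deepspeed",  # assumed default; the PR text does not state one
    )
    return parser


def resolve_wrapper(backend: str, fsdp_available: bool) -> str:
    """Pick the model wrapper, falling back to DDP when FSDP is unavailable."""
    if backend == "fsdp":
        return "fsdp" if fsdp_available else "ddp"
    return "deepspeed"
```

Restricting the flag with `choices` makes argparse reject unknown backends at startup rather than partway through initialization.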