TR-3B issues

Results: 26 issues of TR-3B

gcp catalog: use n1-standard hosts for V100, halve vCPU-per-GPU & lower RAM ratio ➜ fixes #2239

### What & Why
Users requesting `V100:2` on GCP were routed to `n1-highmem-16`; that VM family has scarce quota and often fails to schedule, while Vertex AI uses `n1-standard-8` for...

Enforce JSON Action Contract Across Agent Stacks

- add shared AgentAction schema + dispatcher with pydantic v2 validation
- move s1/s2/s2.5 prompts and workers to schema-checked JSON output and structured response mode
- update engine adapters and...

Refactor browser extension content scripts to use platform adapters

## Summary
- introduce a tiny PlatformAdapter interface so each host (ChatGPT, Claude, etc.) can plug in cleanly
- move ChatGPT and Claude scripts onto the adapter pattern and have...

extension

Add MRoPE operator stack and seed Qwen2.5‑VL integration

PR Description (talking it through): Dropped a new op/mrope.py with the full multimodal rotary toolchain: cached inverse‑freq tables, tensor‑expr cos/sin kernels that run on the Unity TVM wheel, and the...

NUMA-aware tensor parallelism for CPU inference

# Description
Implements NUMA-aware tensor parallelism for MLC LLM to optimize performance on multi-socket CPU systems.
## Key Changes
- **NUMA Topology Detection:** Automatic detection and mapping of CPU sockets...

LoRA Adapter Integration for MLC-LLM: Complete Runtime Support and Compilation Pipeline

Summary: This pull request introduces comprehensive LoRA (Low-Rank Adaptation) adapter support to MLC-LLM, enabling efficient fine-tuned model deployment with minimal memory overhead. The implementation provides a complete end-to-end solution, including...

Propagate request identifiers with realtime errors

Surface realtime error request_id values by threading them through FalRealtimeError and covering the behaviour with Quick/Nimble specs. Addresses #4.

Add Real-Time Encoded Frame Streaming and Network Transmission Support

# PR Description
## Overview
This pull request introduces a robust, extensible real-time streaming architecture to the Windows Capture library. The new implementation enables direct access to encoded video (and...

Enhance PPO logging with entropy, reward stats, and grad norm insights

Hooked PPO up with the missing telemetry: actor always reports entropy + grad norm, experience maker sends reward mean/std, trainer aliases ppo_kl, and W&B/TensorBoard now surface those signals in the...

Add FSDP backend and --dist_backend flag across CLIs; introduce FSDPStrategy

- Introduces a new FSDP backend alongside DeepSpeed, selectable via `--dist_backend {deepspeed,fsdp}`.
- Implements `FSDPStrategy` with:
  - FSDP auto-wrap (fallback to DDP if FSDP not available).
  - Standard dataloader/sampler, all_reduce/all_gather,...