Fanhai Lu
Thanks @imsujinpark! I got the same issue; after switching to the release version (v.0.109.0), I can connect my VMs.
More logs after skipping zero-length outputs (only 2 of 300 had zero length):
output_len is zero for 238th request
output_len is zero for 288th request
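For illustration, a minimal sketch of how a benchmark aggregator might skip and report zero-length outputs; the function and variable names are hypothetical, not from the actual script:

```python
# Hypothetical helper: drop zero-length outputs before computing benchmark
# stats, and log which requests produced them. Names are illustrative only.
def skip_zero_outputs(outputs):
    kept = []
    for i, out in enumerate(outputs):
        if len(out) == 0:
            print(f"output_len is zero for {i}th request")
        else:
            kept.append(out)
    print(f"skipped {len(outputs) - len(kept)} of {len(outputs)} outputs")
    return kept
```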
> > Any reason to add text back, I suggested we keep both str and id in response in #40. The answer is "don't want to decode it to...
> * When the input is text, return both text and token ids. Is it still streaming mode?
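As a rough sketch of the idea under discussion, a streamed chunk could carry both fields without breaking streaming semantics; the field names here are assumptions, not the project's actual schema:

```python
# Hypothetical per-chunk payload carrying both the raw token ids and the
# server-side decoded text. Each chunk stays incremental, so the response
# can still be streamed; the client consumes whichever field it needs.
from dataclasses import dataclass
from typing import List

@dataclass
class StreamChunk:
    token_ids: List[int]  # raw ids emitted by the model for this chunk
    text: str             # decoded text for the same ids

chunk = StreamChunk(token_ids=[4521, 291], text="Hello,")
print(chunk)
```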
> * Optimized TPU duty cycle (largest gap < 4ms)
> * Optimized TTFT: dispatch prefill tasks ASAP w/o unnecessary blocking in CPU, keep backpressure to enforce insert ASAP, return...
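A minimal sketch of the dispatch pattern described above, assuming an asyncio-style server; `run_prefill` and `insert_into_decode_batch` are stand-ins, not the real API:

```python
# Hypothetical sketch: dispatch prefill as soon as a request arrives, with a
# bounded queue providing backpressure so inserts happen ASAP without the
# CPU side blocking unnecessarily. All names here are placeholders.
import asyncio

async def run_prefill(req):
    await asyncio.sleep(0)            # stand-in for the real prefill work
    return f"kv_cache({req})"

def insert_into_decode_batch(kv_cache):
    pass                              # stand-in for the real insert step

async def producer(requests, queue: asyncio.Queue):
    for req in requests:
        task = asyncio.create_task(run_prefill(req))  # dispatch ASAP
        await queue.put(task)         # blocks only when the queue is full

async def consumer(queue: asyncio.Queue, n):
    for _ in range(n):
        task = await queue.get()
        insert_into_decode_batch(await task)
        queue.task_done()

async def main():
    queue = asyncio.Queue(maxsize=8)  # bounded queue => backpressure
    reqs = list(range(32))
    await asyncio.gather(producer(reqs, queue), consumer(queue, len(reqs)))

asyncio.run(main())
```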
Hi [richard](https://github.com/richardsliu), I tested Llama-2 7B with run_server_with_ray.py (--batch_size=32). Instead of sending requests one by one, I used a benchmark script to send 200 requests and got 198 responses back....
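For context, a benchmark client along these lines might send the requests concurrently instead of one by one; the endpoint, payload, and request count below are assumptions, not the actual script:

```python
# Hypothetical concurrent benchmark client: fire N requests at once and
# count how many come back. Endpoint and payload shape are placeholders.
import asyncio
import aiohttp

async def send_one(session, i):
    try:
        async with session.post("http://localhost:8000/generate",
                                json={"prompt": f"request {i}"}) as resp:
            return await resp.json()
    except aiohttp.ClientError:
        return None  # count this request as a missing response

async def main(n=200):
    async with aiohttp.ClientSession() as session:
        results = await asyncio.gather(*(send_one(session, i) for i in range(n)))
    ok = sum(r is not None for r in results)
    print(f"sent {n} requests, got {ok} responses back")

asyncio.run(main())
```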
@qihqi @wang2yn84 Let's revisit this issue now. Having a regression test is critical for catching performance degradation. @sixiang-google Since the infra is ready, could you work on a regression test for...
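As a sketch of what such a test could look like, a pytest-style check could fail the build when throughput drops below a recorded baseline; the baseline number and `run_benchmark` are placeholders for whatever the infra provides:

```python
# Hypothetical performance regression test. The baseline value and the
# benchmark hook are placeholders, not measured numbers or a real API.
BASELINE_TOKENS_PER_SEC = 1000.0
TOLERANCE = 0.10  # allow 10% run-to-run noise

def run_benchmark():
    # stand-in for invoking the real benchmark and parsing its output
    return 1050.0

def test_throughput_regression():
    throughput = run_benchmark()
    assert throughput >= BASELINE_TOKENS_PER_SEC * (1 - TOLERANCE), (
        f"throughput {throughput:.1f} tok/s regressed below baseline"
    )
```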