zhou fan

8 comments by zhou fan

@cole-h Yes, it works for me. But it uses the `rustc` I installed on my system, not the one from the Nix store.

That's OK for me. Thanks for the explanation @cole-h. I originally thought that riff would use the `rustc` installed by Nix, like other tools such as `cargo2nix`.

> It shows up after multiple rounds of conversation; the server keeps emitting tokens without stopping normally.

Since it shows up after multiple rounds of conversation, can you reproduce it reliably? I think we should first try to rule out whether this is a vLLM issue; I suggest reproducing it with the inference approach from https://huggingface.co/TheBloke/Yi-34B-Chat-GPTQ#example-python-code
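A minimal sketch of that kind of direct `transformers` inference, in the style of the model card's example code (the chat contents and generation settings here are only illustrative, and loading a GPTQ checkpoint assumes the usual quantization backend such as auto-gptq/optimum is installed):

```python
# Sketch: run the GPTQ model directly with transformers, bypassing vLLM,
# to check whether the non-stopping generation still reproduces.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Yi-34B-Chat-GPTQ"  # model from the linked card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Multi-turn style prompt; the messages are placeholders.
messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Hello! How can I help?"},
    {"role": "user", "content": "Tell me about the Yi models."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If generation also fails to stop here, the problem is likely in the model/prompt template rather than in vLLM.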

@zhanghx0905 According to issue https://github.com/vllm-project/vllm/issues/174, it looks like vLLM has not yet finished adding support for GPTQ quantization.

Things would be easier after https://github.com/pytorch/torchtitan/pull/814 is merged. This PR introduces ModelSpec to describe a model and how to parallelize it. Here is an example of a ModelSpec for...
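Purely as an illustration of the idea, a rough sketch of what such a spec could look like; the field names and structure below are hypothetical and not the actual torchtitan API from that PR:

```python
# Hypothetical ModelSpec-style description: it bundles the model class, its
# config, and the function that applies parallelism, so a trainer can be wired
# up from the spec alone. Field names are illustrative, not the real
# torchtitan definitions.
from dataclasses import dataclass
from typing import Callable

import torch.nn as nn


@dataclass
class ModelSpec:
    name: str                                          # registry key, e.g. "llama3"
    model_cls: type[nn.Module]                         # class used to build the model
    config: dict                                       # model hyperparameters
    parallelize_fn: Callable[[nn.Module], nn.Module]   # applies TP/FSDP/etc.


def identity_parallelize(model: nn.Module) -> nn.Module:
    # Placeholder: a real spec would shard/parallelize the model here.
    return model


spec = ModelSpec(
    name="toy",
    model_cls=nn.Linear,
    config={"in_features": 4, "out_features": 4},
    parallelize_fn=identity_parallelize,
)

model = spec.parallelize_fn(spec.model_cls(**spec.config))
```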

You can try the [Ray distributed debugger](https://docs.ray.io/en/latest/ray-observability/ray-distributed-debugger.html).
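A minimal sketch of how one might use it, assuming the `breakpoint()`-in-a-task workflow and the `RAY_DEBUG` setting described in those docs (details may differ between Ray versions):

```python
# Sketch: pause inside a Ray task so the distributed debugger can attach.
import ray

ray.init(runtime_env={"env_vars": {"RAY_DEBUG": "1"}})


@ray.remote
def compute(x):
    result = x * x
    breakpoint()  # the distributed debugger attaches here
    return result


print(ray.get(compute.remote(3)))
```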

Try setting this environment variable: `export PYTHONUNBUFFERED=1`

How about using [`StatefulDataLoader`](https://pytorch.org/data/beta/torchdata.stateful_dataloader.html) instead of `DataLoader`? `StatefulDataLoader` provides `state_dict` and `load_state_dict` methods that may support resuming the iterator position for mid-epoch checkpointing.
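A minimal sketch of the intended usage, with a placeholder dataset and batch size:

```python
# Sketch: StatefulDataLoader exposes state_dict()/load_state_dict(), so the
# iterator position can be saved in a checkpoint and restored later.
from torchdata.stateful_dataloader import StatefulDataLoader

dataset = list(range(100))  # placeholder dataset
loader = StatefulDataLoader(dataset, batch_size=8, num_workers=0)

it = iter(loader)
next(it)  # consume one batch mid-epoch

state = loader.state_dict()  # save alongside the model/optimizer checkpoint

# Later, after restarting:
resumed = StatefulDataLoader(dataset, batch_size=8, num_workers=0)
resumed.load_state_dict(state)  # continues from where the saved loader stopped
for batch in resumed:
    ...
```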