周博洋 issues

24 issues by 周博洋

Followed the typical conda install, but xformers and bitsandbytes fail on A100

root@A100:/aml/conda/lib/python3.10/site-packages# python -m xformers.info Traceback (most recent call last): File "/aml/unsloth_env/lib/python3.10/runpy.py", line 187, in _run_module_as_main mod_name, mod_spec, code = _get_module_details(mod_name, _Error) File "/aml/unsloth_env/lib/python3.10/runpy.py", line 110, in _get_module_details __import__(pkg_name) File "/aml/conda/lib/python3.10/site-packages/xformers/__init__.py",...

[Feature Request]: Need support for Azure OpenAI; I tried to add it myself but failed...

**Is your feature request related to a problem? Please describe.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] ![image](https://github.com/refuel-ai/autolabel/assets/15274284/fcdf3d99-0ea1-4393-bd16-f39c02a75e88) **Describe the solution...

enhancement

[Usage] None of the inputs have requires_grad=True. Gradients will be None

### Describe the issue Issue: The log says gradients will be None. Command: ``` PASTE THE COMMANDS HERE. ``` Just running pretrain. Log: ``` /data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True...

[Question] How to use the pretrain checkpoint

### Question I only found some files like these: ![image](https://github.com/haotian-liu/LLaVA/assets/15274284/cad94ddf-5215-46af-8b1b-d89c15e13fa9) How can I merge them into the base model, or is there something else I should do? Any help is very...

[Question] Fine-tuning with 1.5 hits a tensor mismatch issue

### Question [2024-04-29 06:52:01,294] [INFO] [partition_parameters.py:345:__exit__] finished initializing model - num_params = 295, num_elems = 6.76B Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00

[BUG] ModuleNotFoundError: No module named 'megatron.training.tokenizer'; 'megatron.training' is not a package

**Describe the bug** A clear and concise description of what the bug is. Strange issue: (/aml2/ds) root@A100:/aml2/Megatron-LM# from megatron.training.tokenizer import build_tokenizer from: can't read /var/mail/megatron.training.tokenizer (/aml2/ds) root@A100:/aml2/Megatron-LM# python tools/preprocess_data.py \...

stale

LoRA + base model works, but merging LoRA + base raises an error

LoRA + base works well: ![image](https://github.com/mbzuai-oryx/LLaVA-pp/assets/15274284/ccec0900-7db0-4729-9ab4-3c5f68e0f304) ![image](https://github.com/mbzuai-oryx/LLaVA-pp/assets/15274284/7d12df4d-162a-4b34-8a80-ce42672401ed) When merging: (/data2/llava-phi) root@A100:/data2/LLaVA-pp# python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path /data2/phi3-vlm3 2024-05-05 10:36:00 | INFO | model_worker |...

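Keeps reporting that the model cannot be found

![image](https://user-images.githubusercontent.com/15274284/236666432-bb76176c-7c6a-4f87-922a-f10ae56e0499.png) Where exactly are the model files? I can't find them, but in the log at the start I saw that a file of nearly 7 GB had finished downloading. I'm using Hugging Face.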

Does not work, even with the official script

(TE) root@bjdb-h20-node-118:/aml/TransformerEngine/examples/pytorch/fsdp# torchrun --standalone --nnodes=1 --nproc-per-node=$(nvidia-smi -L | wc -l) fsdp.py W0712 09:57:45.035000 139805827512128 torch/distributed/run.py:757] W0712 09:57:45.035000 139805827512128 torch/distributed/run.py:757] ***************************************** W0712 09:57:45.035000 139805827512128 torch/distributed/run.py:757] Setting OMP_NUM_THREADS environment variable for each...

[Bug] Can't connect to Ollama

### 📦 Environment - [ ] Official - [ ] Official Preview - [ ] Vercel / Zeabur / Sealos - [X] Docker - [ ] Other ### 📌 Version...

Inactive

🐛 Bug

ollama