Jue WANG

20 comments by Jue WANG

It seems that the sequence length is too long (i.e. # tokens > 512). If you are OK with limiting the max sequence length, then try truncating the input text...
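
For illustration, a minimal truncation sketch, assuming a Hugging Face tokenizer (the checkpoint name below is just a placeholder; use the tokenizer that matches your model):

```python
from transformers import AutoTokenizer

# Placeholder checkpoint; substitute the tokenizer that matches your own model.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "some very long document " * 1000
# Drop everything beyond the 512-token limit.
encoded = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
print(encoded["input_ids"].shape)  # torch.Size([1, 512])
```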

If you want to reduce the memory usage and improve the training speed, I would recommend reducing the hidden size `hidden_dim` (e.g. 100) and the number of layers...
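
As a rough illustration of why this helps (the toy encoder below is just a placeholder, not the project's model), the parameter count, and hence memory use and training time, scales with both knobs:

```python
import torch.nn as nn

def param_count(hidden_dim: int, num_layers: int) -> int:
    # Toy stacked LSTM; only meant to show how hidden_dim and num_layers
    # drive the number of parameters.
    model = nn.LSTM(input_size=300, hidden_size=hidden_dim, num_layers=num_layers)
    return sum(p.numel() for p in model.parameters())

print(param_count(hidden_dim=300, num_layers=3))  # larger, slower to train
print(param_count(hidden_dim=100, num_layers=1))  # smaller, faster, less memory
```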

@satpalsr I can use multiple GPUs with the following snippet:

```python
import torch
from transformers import AutoConfig, AutoTokenizer
from transformers import AutoModelForCausalLM
from accelerate import dispatch_model, infer_auto_device_map
from accelerate.utils import...
```
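
The original snippet is cut off above; for context, a typical continuation with `accelerate` looks roughly like the following sketch (the checkpoint name and per-GPU memory limits are placeholders, not the exact code from the comment):

```python
import torch
from transformers import AutoModelForCausalLM
from accelerate import dispatch_model, infer_auto_device_map

model_name = "togethercomputer/GPT-NeoXT-Chat-Base-20B"  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# Split the layers across the visible GPUs, capping the memory used per device.
device_map = infer_auto_device_map(model, max_memory={0: "20GiB", 1: "20GiB"})
model = dispatch_model(model, device_map=device_map)
```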

@davismartens The training script saves a checkpoint every `CHECKPOINT_STEPS` steps, so usually you can just pick the latest one :)

> @LorrinWWW great thanks. Can I run the pretrained model without training too?

Sure! You can run our pretrained base model.
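
For instance, inference without any training could look like the sketch below (the checkpoint name comes from the later comment; the `<human>:`/`<bot>:` prompt format is an assumption, so check the model card, and the 20B weights need substantial GPU memory):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "togethercomputer/GPT-NeoXT-Chat-Base-20B"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16, device_map="auto")

# Assumed prompt format; verify against the model card.
prompt = "<human>: Hello, who are you?\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```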

@davismartens It appears that `bot.py` is unable to locate the retrieval module, which should be present in the root directory of the `OpenChatKit` repository. Could you try running the...

@davismartens Can you try this? `export PYTHONPATH=/mnt/c/Users/davis/dev-projects/OpenChatKit:$PYTHONPATH`

@davismartens That's true, we specified `use_auth_token=True`. You can either log in to HF:

```sh
pip install --upgrade huggingface_hub
huggingface-cli login
```

Or, since `togethercomputer/GPT-NeoXT-Chat-Base-20B` is publicly available now, you can simply remove...
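
As a sketch of the second option (the surrounding code depends on which script you are editing), the change is just dropping the argument:

```python
from transformers import AutoTokenizer

# Before: AutoTokenizer.from_pretrained(name, use_auth_token=True)
# After: the checkpoint is public, so no token is needed.
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-NeoXT-Chat-Base-20B")
```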