Also, the labels must start from 0. Otherwise you get: RuntimeError: cuda runtime error (59) : device-side assert triggered at /home/lychee/mycode/pytorch/aten/src/THC/generic/THCTensorMath.cu:24
And `loss.data[0]` raises: IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number. Original statement: `train_loss += loss.data[0]`; fixed: `train_loss += loss.item()`
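Both fixes can be seen in a minimal training step. Everything below (the model, data, and class count) is an illustrative toy setup, not the original code:

```python
import torch
import torch.nn as nn

# Toy setup: 3 classes, so CrossEntropyLoss expects labels in {0, 1, 2}.
# Passing 1-indexed labels is what triggers the device-side assert
# (cuda runtime error 59) when running on GPU.
model = nn.Linear(4, 3)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(8, 4)
raw_labels = torch.tensor([1, 2, 3, 1, 2, 3, 1, 2])  # 1-indexed: wrong
labels = raw_labels - 1                               # remapped to start at 0

optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()

train_loss = 0.0
train_loss += loss.item()  # not loss.data[0] on PyTorch >= 0.4
```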
You will probably need to write a separate function that loads the saved model and then runs the test.
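A minimal sketch of such a loader, assuming a standard PyTorch checkpoint; the function name, path, and model class are placeholders to be replaced with the ones from the actual training script:

```python
import torch
import torch.nn as nn

def load_for_test(checkpoint_path, model):
    """Load saved weights into `model` and switch it to eval mode.

    `checkpoint_path` and the model instance are placeholders; use the
    real model definition from the training script.
    """
    state = torch.load(checkpoint_path, map_location="cpu")
    # Some scripts save a {'state_dict': ...} wrapper; handle both layouts.
    if isinstance(state, dict) and "state_dict" in state:
        state = state["state_dict"]
    model.load_state_dict(state)
    model.eval()  # disable dropout, use running batch-norm statistics
    return model
```

Usage would then be something like `model = load_for_test("model.pth", MyNet())` followed by a forward pass over the test images.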
> Even though FT does not support int8 quantization for popular LLMs like BLOOM, quantizing to 4-bit does not necessarily accelerate models; it depends on your hardware compute and the model...
Not GPT-Q int8, just FT weight int8; see the following link: https://github.com/NVIDIA/FasterTransformer/blob/f8e42aac45815c5be92c0915b12b9a6652386e8c/examples/pytorch/gpt/bloom_lambada.py#L165-L170 Thank you for the llm-awq link, I will check it.
I met the same problem when I tried to merge the DeepSeek LLaMA model into Mixtral. https://huggingface.co/deepseek-ai/deepseek-llm-7b-base/tree/main It seems that some tensor key names are not supported in merge-kit. We can load...
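A quick way to see which tensor key names a checkpoint actually contains, before handing it to merge-kit, is to load the state dict and list its keys. The helper name and path are assumptions; this only covers `.bin`/`.pth` checkpoints loadable with `torch.load`:

```python
import torch

def list_param_keys(checkpoint_path):
    """Return the sorted tensor key names in a PyTorch checkpoint, so the
    names merge-kit fails to map can be spotted by eye.

    `checkpoint_path` is a placeholder for the real shard file.
    """
    state_dict = torch.load(checkpoint_path, map_location="cpu")
    return sorted(state_dict.keys())
```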
@Dongshengjiang Hi, I also reproduced it, and the results are indeed not great; it looks more like a transfer of the overall color. I would not recommend this paper for practical use. If you want better results, there are two other papers I reproduced with very good results, listed below:

- [ICCV2019] Photorealistic Style Transfer via Wavelet Transforms [[paper](https://arxiv.org/abs/1903.09760)][[code](https://github.com/clovaai/WCT2)] - comment: Recommended to reimplement.
- Existing methods are limited by spatial distortions or unrealistic artifacts. This [paper](https://arxiv.org/abs/1904.11617) proposes a...
+1 for this.
@zhanghongyong123456 @jinghehehe Hi, the cause of this problem is simple: running this model requires two files. One is vgg_checkpoint, whose role is to extract the content_feature and style_feature; the other is the trained model, model_checkpoint, which performs the transfer task. If no valid model_checkpoint is specified, nothing can be generated. The simplest fix is to train a model yourself: run only a few iterations, save the checkpoint quickly, and then run with that trained model.

1. Create two folders, one for style_imgs and one for content_imgs (I am not sure whether the image file names in the two folders must match; in my two folders they do match).
2. In main.py, reduce epoch and save_interval.
3. Run the training and save the corresponding model_checkpoint (the code saves it automatically).
4. Then run test.py, pointing it at the generated model_checkpoint.
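The steps above boil down to: save a model_checkpoint quickly during training, then point test.py at it. A minimal sketch, where the model, epoch count, and file names are assumptions standing in for the real training script:

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for the style-transfer model; the real one comes from main.py.
model = nn.Linear(4, 4)
epochs = 2          # reduced epoch count (step 2)
save_interval = 1   # reduced save interval (step 2)

ckpt_dir = tempfile.mkdtemp()
ckpt_path = os.path.join(ckpt_dir, "model_checkpoint.pth")

# Step 3: train briefly and save the checkpoint at each interval.
for epoch in range(1, epochs + 1):
    if epoch % save_interval == 0:
        torch.save(model.state_dict(), ckpt_path)

# Step 4: test.py then loads the saved model_checkpoint for inference.
test_model = nn.Linear(4, 4)
test_model.load_state_dict(torch.load(ckpt_path, map_location="cpu"))
test_model.eval()
```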
> I plan to do it.

Did you get started? How is it going now?