Also, the labels must start from 0. Otherwise you get: RuntimeError: cuda runtime error (59) : device-side assert triggered at /home/lychee/mycode/pytorch/aten/src/THC/generic/THCTensorMath.cu:24
And `loss.data[0]` raises: IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number. Original statement: `train_loss += loss.data[0]`; fixed: `train_loss += loss.item()`
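Both fixes can be seen in a minimal training step. Everything below (the model, data, and class count) is an illustrative toy setup, not the original code:

```python
import torch
import torch.nn as nn

# Toy setup: 3 classes, so CrossEntropyLoss expects labels in {0, 1, 2}.
# Passing 1-indexed labels is what triggers the device-side assert
# (cuda runtime error 59) when running on GPU.
model = nn.Linear(4, 3)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(8, 4)
raw_labels = torch.tensor([1, 2, 3, 1, 2, 3, 1, 2])  # 1-indexed: wrong
labels = raw_labels - 1                               # remapped to start at 0

optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()

train_loss = 0.0
train_loss += loss.item()  # not loss.data[0] on PyTorch >= 0.4
```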
You will probably need to write a separate function that loads the saved model and then runs the test.
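A minimal sketch of such a loader, assuming a standard PyTorch checkpoint; the function name, path, and model class are placeholders to be replaced with the ones from the actual training script:

```python
import torch
import torch.nn as nn

def load_for_test(checkpoint_path, model):
    """Load saved weights into `model` and switch it to eval mode.

    `checkpoint_path` and the model instance are placeholders; use the
    real model definition from the training script.
    """
    state = torch.load(checkpoint_path, map_location="cpu")
    # Some scripts save a {'state_dict': ...} wrapper; handle both layouts.
    if isinstance(state, dict) and "state_dict" in state:
        state = state["state_dict"]
    model.load_state_dict(state)
    model.eval()  # disable dropout, use running batch-norm statistics
    return model
```

Usage would then be something like `model = load_for_test("model.pth", MyNet())` followed by a forward pass over the test images.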
> Even though FT does not support int8 quantization for popular LLMs like BLOOM, quantizing to 4-bit does not necessarily accelerate models; it depends on your hardware compute and the model...
Not GPT-Q int8, just FT weight int8; see the following link: https://github.com/NVIDIA/FasterTransformer/blob/f8e42aac45815c5be92c0915b12b9a6652386e8c/examples/pytorch/gpt/bloom_lambada.py#L165-L170 Thank you for the llm-awq link, I will check it.
I met the same problem when I tried to merge the DeepSeek LLaMA model into Mixtral. https://huggingface.co/deepseek-ai/deepseek-llm-7b-base/tree/main It seems that some tensor key names are not supported in merge-kit. We can load...
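A quick way to see which tensor key names a checkpoint actually contains, before handing it to merge-kit, is to load the state dict and list its keys. The helper name and path are assumptions; this only covers `.bin`/`.pth` checkpoints loadable with `torch.load`:

```python
import torch

def list_param_keys(checkpoint_path):
    """Return the sorted tensor key names in a PyTorch checkpoint, so the
    names merge-kit fails to map can be spotted by eye.

    `checkpoint_path` is a placeholder for the real shard file.
    """
    state_dict = torch.load(checkpoint_path, map_location="cpu")
    return sorted(state_dict.keys())
```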
@Dongshengjiang Hi, I also reproduced it, and the results are indeed not great; it looks more like a transfer of the overall color. I would not recommend this paper for practical use. If you want better results, there are two other papers I reproduced with very good results, listed below:

- [ICCV2019] Photorealistic Style Transfer via Wavelet Transforms [[paper](https://arxiv.org/abs/1903.09760)][[code](https://github.com/clovaai/WCT2)] - comment: Recommended to reimplement.
- Existing methods are limited by spatial distortions or unrealistic artifacts. This [paper](https://arxiv.org/abs/1904.11617) proposes a...
+1 for this.
@zhanghongyong123456 @jinghehehe Hi, the cause of this problem is simple: running this model requires two files. One is vgg_checkpoint, whose role is to extract the content_feature and style_feature; the other is the trained model, model_checkpoint, which performs the transfer task. If no valid model_checkpoint is specified, nothing can be generated. The simplest fix is to train a model yourself: run only a few iterations, save the checkpoint quickly, and then run with that trained model.

1. Create two folders, one for style_imgs and one for content_imgs (I am not sure whether the image file names in the two folders must match; in my two folders they do match).
2. In main.py, reduce epoch and save_interval.
3. Run the training and save the corresponding model_checkpoint (the code saves it automatically).
4. Then run test.py, pointing it at the generated model_checkpoint.
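The steps above boil down to: save a model_checkpoint quickly during training, then point test.py at it. A minimal sketch, where the model, epoch count, and file names are assumptions standing in for the real training script:

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for the style-transfer model; the real one comes from main.py.
model = nn.Linear(4, 4)
epochs = 2          # reduced epoch count (step 2)
save_interval = 1   # reduced save interval (step 2)

ckpt_dir = tempfile.mkdtemp()
ckpt_path = os.path.join(ckpt_dir, "model_checkpoint.pth")

# Step 3: train briefly and save the checkpoint at each interval.
for epoch in range(1, epochs + 1):
    if epoch % save_interval == 0:
        torch.save(model.state_dict(), ckpt_path)

# Step 4: test.py then loads the saved model_checkpoint for inference.
test_model = nn.Linear(4, 4)
test_model.load_state_dict(torch.load(ckpt_path, map_location="cpu"))
test_model.eval()
```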
> I plan to do it.

Did you get started? How is it going now?