李理
I think tf.layers.dropout is designed for tf.estimator. See https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/examples/tutorials/layers/cnn_mnist.py for an example. With the low-level API, it seems we still need a placeholder.
I changed batch_size to 8, but the process is still killed: [3256824.391743] Killed process 9666 (python) total-vm:53893188kB, anon-rss:23892380kB, file-rss:152808kB. It uses too much memory.
So what's wrong? According to the log under /var/log, this python process used 23892380 kB (about 23 GB) of CPU memory (not GPU memory): [3256824.391743] Killed process 9666 (python) total-vm:53893188kB, anon-rss:23892380kB, file-rss:152808kB
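For anyone reading a similar kernel log, here is a minimal sketch of how the memory fields in that line can be pulled out and converted. The helper names `parse_oom_line` and `kb_to_gib` are made up for illustration, and the conversion assumes the kernel's `kB` means 1024 bytes.

```python
import re

# Kernel OOM-killer line copied from the comment above.
LOG_LINE = ("[3256824.391743] Killed process 9666 (python) "
            "total-vm:53893188kB, anon-rss:23892380kB, file-rss:152808kB")

# Matches the three memory fields that appear in the line above.
FIELD_RE = re.compile(r"(total-vm|anon-rss|file-rss):(\d+)kB")


def parse_oom_line(line):
    """Return {field name: size in kB} for the memory fields in the line."""
    return {name: int(value) for name, value in FIELD_RE.findall(line)}


def kb_to_gib(kb):
    """Convert kB to GiB, assuming 1 kB = 1024 bytes (an assumption)."""
    return kb / (1024 * 1024)


if __name__ == "__main__":
    for name, kb in parse_oom_line(LOG_LINE).items():
        print(f"{name}: {kb} kB = {kb_to_gib(kb):.1f} GiB")
```

Here anon-rss is the resident host memory of the process, which is why the number points at CPU memory rather than GPU memory.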
I face the same problem. When I add it as bidai541 suggested, it seems to work. But when I run python setup.py test, it fails with: KeyError: "Registering two gradient with name...
http://fancyerii.github.io/2019/12/19/deepfm/
This is because the way custom Operations work changed in newer versions of TensorFlow. https://github.com/fancyerii/deep_learning_theory_and_practice/blob/master/samples/ctc.pdf
See [this issue](https://github.com/TimDettmers/bitsandbytes/issues/1092#issuecomment-1969161870). QLoRA (load_in_4bits) is not compatible with FSDP/DeepSpeed. Try --use_bnb=False. I also recommend using DeepSpeed instead of FSDP. In my own experience, FSDP is not well implemented. you...
> 3\. AppImage installing glibc-2.34 in /opt do not work. I think we need to use patchelf to modify the rpath and interpreter. But when I run ldd audacity-linux-3.4.2-x64.AppImage, it says: ```...
I would also like to request integration with accelerate. I found an [old issue](https://github.com/huggingface/transformers/issues/18624) here, but it seems to be inactive.
Yes, I think it should be clearly documented that users should segment their inputs; otherwise their data will be truncated.
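As a rough illustration of what segmenting could look like on the user side, here is a minimal sketch. The function name `segment` and the `max_len` parameter are hypothetical; the real limit depends on the tool being discussed, which I am not assuming anything about here.

```python
def segment(tokens, max_len):
    """Split a sequence into consecutive chunks of at most max_len items."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


# Toy input: ten items with a hypothetical limit of four per segment.
tokens = list(range(10))
chunks = segment(tokens, 4)

# Every item lands in exactly one chunk, so nothing is silently dropped.
assert sum(chunks, []) == tokens
```

Feeding each chunk separately avoids the silent truncation, at the cost of losing context across chunk boundaries.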