Peter
Same issue here: a view with absolute positioning and a negative `top` works on iOS but gets cut off on Android.
Unfortunately not working on my dual 3090 machine: ``` nvidia-smi Wed Jul 5 00:12:13 2023 +-----------------------------------------------------------------------------+ | NVIDIA-SMI 510.108.03 Driver Version: 510.108.03 CUDA Version: 11.6 | |-------------------------------+----------------------+----------------------+ | GPU Name...
Thanks @barsuna, there is a comment about specifying max_mem per GPU in generate.py line 696: ``` if model is not None: # NOTE: Can specify max_memory={0: max_mem, 1: max_mem},...
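For anyone else hitting this on a multi-GPU box, here is a minimal sketch of building such a per-GPU `max_memory` map. This is not the code from generate.py; `build_max_memory` and `headroom_fraction` are hypothetical names I made up for illustration, and the free-memory numbers are passed in by hand.

```python
def build_max_memory(free_bytes_per_gpu, headroom_fraction=0.1):
    """Return a {gpu_index: "<N>GiB"} map, one entry per GPU.

    Hypothetical helper, not part of h2ogpt. `free_bytes_per_gpu` is a
    list of free memory in bytes, one value per GPU. `headroom_fraction`
    is the share of free memory left unused for activations and cache.
    """
    gib = 1024 ** 3
    max_memory = {}
    for idx, free_bytes in enumerate(free_bytes_per_gpu):
        # Round down to whole GiB after reserving the headroom.
        usable_gib = int(free_bytes * (1 - headroom_fraction) / gib)
        max_memory[idx] = f"{usable_gib}GiB"
    return max_memory


# Two 24 GiB cards (e.g. dual 3090), keeping 25% of each free.
max_memory = build_max_memory([24 * 1024 ** 3, 24 * 1024 ** 3], 0.25)
print(max_memory)
```

The resulting dict has the same shape as the `max_memory={0: max_mem, 1: max_mem}` mentioned in that comment, and as far as I know it can be passed to Hugging Face `from_pretrained(..., device_map="auto", max_memory=max_memory)`. On a real machine the free bytes could come from `torch.cuda.mem_get_info(i)[0]` for each GPU `i`.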
I got it to work (both 8bit and 4bit) by commenting out the following in gen.py:  But it is very very slow, almost 1 sec per...
Thanks @pseudotensor! The generation speed is much better now (~10 English tokens per second and ~5 Chinese tokens per second) after pulling the latest changes.
It seems #133 has been merged, but I still get a {"detail": "Not Found"} error. Or maybe I didn't run it correctly. It's not clear how to enable OpenAI-compliant API access, could...
Thanks @this. My understanding is that `h2ogpt_client` is a client that calls the backend h2ogpt's API, so I don't need to run `h2ogpt_client` to get the backend API to work,...
Btw, here is how I run the h2ogpt backend:
```
export ALLOW_API=1
python3 generate.py --base_model=$MODEL --langchain_mode=ChatLLM --visible_langchain_modes="['ChatLLM', 'UserData', 'MyData']" --score_model=None --max_max_new_tokens=2048 --max_new_tokens=512 --infer_devices=False --load_8bit=True --share=True
```
`$MODEL` is `h2oai/h2ogpt-gm-oasst1-en-2048-falcon-40b-v2`.
Thank you, let me try tomorrow.
@Zengyi-Qin Are the pre-trained models **English only**? People have been training with Chinese data but not getting good results.