Local AI using CPU instead of GPU - CUDA
LocalAI version:
1.30.0 Latest.
Environment, CPU architecture, OS, and Version:
Window server 2022. Xeon E5 2670v2. GPU Geforece GTX 1070
Describe the bug
LocalAI using CPU instead of GPU. CUDA remains 0% When calling chat completeion.
To Reproduce
Expected behavior
Logs
Additional context
Configure in env:

Docker Compose:
version: '3.6'
services:
api:
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
image: quay.io/go-skynet/local-ai:master-cublas-cuda12
tty: true # enable colorized logs
restart: always # should this be on-failure ?
ports:
- 8080:8080
env_file:
- .env
volumes:
- ./models:/models
command: ["/usr/bin/local-ai" ]

I am new to this project, too. It looks like you need to set up gpu_layer in the config somewhere, but I don't know how.
I am new to this project, too. It looks like you need to set up gpu_layer in the config somewhere, but I don't know how.
I also tried to search over internet. however, I could not find out solution :) Hope any experts could read this thread.
:warning::warning::warning::warning::warning:
Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!
but.... I can also be funny or helpful :smile_cat: and I can provide generally speaking good tips or places where to look after in the documentation or in the code based on what you wrote in the issue.
Don't engage in conversation with me,
I don't support (yet) replying!
:warning::warning::warning::warning::warning:
I see that you are using the LocalAI container image from quay.io, which should have CUDA support enabled by default. However, it seems like the GPU is not being used for some reason.
To troubleshoot this issue, I would recommend checking the following:
- Ensure that the GPU is properly connected and detected by the system. You can use a tool like NVIDIA X Server to check this.
- Check that the CUDA driver is installed and up-to-date. You can use the command
nvidia-cuda-toolkit to install the driver if needed.
- Ensure that the environment variables are properly set when running LocalAI. Make sure that
CUDA_DEVICE_POOL_GPU_OVERRIDE is set to 1 and CUDA_VISIBLE_DEVICES is set to 0-1. You can check this by looking at the logs when starting LocalAI in debug mode (--debug or DEBUG=true).
- Check that the model you are using is compatible with CUDA and has been compiled with the correct settings. You can use the command
localai models list to see a list of available models and their compatibility information.
- Ensure that the container image is up-to-date and has the latest CUDA drivers installed. You can try updating the image if necessary.
I hope this helps! Let me know if you have any further questions or need more assistance.
Sources:
- https://github.com/go-skynet/LocalAI/tree/master/.github/ISSUE_TEMPLATE/bug_report.md
- https://localai.io/faq/index.html
- https://localai.io/basics/getting_started/index.html
⚠️⚠️⚠️⚠️⚠️
Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!
but.... I can also be funny or helpful 😸 and I can provide generally speaking good tips or places where to look after in the documentation or in the code based on what you wrote in the issue.
Don't engage in conversation with me, I don't support (yet) replying!
⚠️⚠️⚠️⚠️⚠️
I see that you are using the LocalAI container image from quay.io, which should have CUDA support enabled by default. However, it seems like the GPU is not being used for some reason. To troubleshoot this issue, I would recommend checking the following:
- Ensure that the GPU is properly connected and detected by the system. You can use a tool like NVIDIA X Server to check this.
- Check that the CUDA driver is installed and up-to-date. You can use the command
nvidia-cuda-toolkit to install the driver if needed.
- Ensure that the environment variables are properly set when running LocalAI. Make sure that
CUDA_DEVICE_POOL_GPU_OVERRIDE is set to 1 and CUDA_VISIBLE_DEVICES is set to 0-1. You can check this by looking at the logs when starting LocalAI in debug mode (--debug or DEBUG=true).
- Check that the model you are using is compatible with CUDA and has been compiled with the correct settings. You can use the command
localai models list to see a list of available models and their compatibility information.
- Ensure that the container image is up-to-date and has the latest CUDA drivers installed. You can try updating the image if necessary.
I hope this helps! Let me know if you have any further questions or need more assistance.
Sources:
- https://github.com/go-skynet/LocalAI/tree/master/.github/ISSUE_TEMPLATE/bug_report.md
- https://localai.io/faq/index.html
- https://localai.io/basics/getting_started/index.html
Hi localai-bot,
Thank you for your support. I tried above guides but still failed.
-
Ensure that the GPU is properly connected and detected by the system. You can use a tool like NVIDIA X Server to check this

-
Check that the CUDA driver is installed and up-to-date. Same above.
-
Ensure that the environment variables are properly set

-
Check that the model you are using is compatible with CUDA. I did downloaded and tried three models below. But still failed.

-
I pull latest LocalAI but still not working with CUDA.
Please help to process further. Thank you very much.
@noblerboy2004 please post your models yaml file for better review
I am new to this project, too. It looks like you need to set up gpu_layer in the config somewhere, but I don't know how.
https://localai.io/howtos/easy-model-import-downloaded/
@noblerboy2004 please post your models yaml file for better review
Hi Lunamidori5,
Thank you for your action.
Here is folder of downloaded models:

gpt4all-j-groovy working ok with CPU. No GPU usage -CUDO 0%
backend: gpt4all-j
context_size: 1024
name: gpt4all-j-groovy
parameters:
model: ggml-gpt4all-j-v1.3-groovy.bin
temperature: 0.2
top_k: 80
top_p: 0.7
template:
chat: gpt4all-chat
completion: gpt4all-completion
main_gpu: "0"
I Tried open-llama-3b-q4_0
backend: llama
context_size: 1024
name: openllama
f16: true ## If you are using cpu set this to false
gpu_layers: 4
batch: 512
parameters:
model: open-llama-3b-q4_0.bin
temperature: 0.2
top_k: 80
top_p: 0.7
template:
chat: openllama-chat
completion: openllama-completion
roles:
assistant: 'ASSISTANT:'
system: 'SYSTEM:'
user: 'USER:'
Content of openllama-chat.tmpl:
Q: {{.Input}}\nA:
Content of openllama-completion.tmpl:
Q: Complete the following text: {{.Input}}\nA:
And error occured with log:
2023-09-27 06:35:37 11:35PM DBG Request received:
2023-09-27 06:35:37 11:35PM DBG Configuration read: &{PredictionOptions:{Model:open-llama-3b-q4_0.bin Language: N:0 TopP:0.7 TopK:80 Temperature:0.9 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:openllama F16:true Threads:32 Debug:true Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:false Backend:llama TemplateConfig:{Chat:openllama-chat ChatMessage: Completion:openllama-completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:4 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:1024 NUMA:false LoraAdapter: LoraBase: NoMulMatQ:false DraftModel: NDraft:0 Quantization:} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}}
2023-09-27 06:35:37 11:35PM DBG Parameters: &{PredictionOptions:{Model:open-llama-3b-q4_0.bin Language: N:0 TopP:0.7 TopK:80 Temperature:0.9 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:openllama F16:true Threads:32 Debug:true Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:false Backend:llama TemplateConfig:{Chat:openllama-chat ChatMessage: Completion:openllama-completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:4 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:1024 NUMA:false LoraAdapter: LoraBase: NoMulMatQ:false DraftModel: NDraft:0 Quantization:} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}}
2023-09-27 06:35:37 11:35PM DBG Prompt (before templating): USER: How are you?
2023-09-27 06:35:37 11:35PM DBG Template found, input modified to: Q: USER: How are you?\nA:
2023-09-27 06:35:37
2023-09-27 06:35:37 11:35PM DBG Prompt (after templating): Q: USER: How are you?\nA:
2023-09-27 06:35:37
2023-09-27 06:35:37 11:35PM DBG Loading model llama from open-llama-3b-q4_0.bin
2023-09-27 06:35:37 11:35PM DBG Loading model in memory from file: /models/open-llama-3b-q4_0.bin
2023-09-27 06:35:37 11:35PM DBG Loading GRPC Model llama: {backendString:llama model:open-llama-3b-q4_0.bin threads:32 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000503ba0 externalBackends:map[autogptq:/build/extra/grpc/autogptq/autogptq.py bark:/build/extra/grpc/bark/ttsbark.py diffusers:/build/extra/grpc/diffusers/backend_diffusers.py exllama:/build/extra/grpc/exllama/exllama.py huggingface-embeddings:/build/extra/grpc/huggingface/huggingface.py vall-e-x:/build/extra/grpc/vall-e-x/ttsvalle.py vllm:/build/extra/grpc/vllm/backend_vllm.py] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false}
2023-09-27 06:35:37 11:35PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama
2023-09-27 06:35:37 11:35PM DBG GRPC Service for open-llama-3b-q4_0.bin will be running at: '127.0.0.1:40367'
2023-09-27 06:35:37 11:35PM DBG GRPC Service state dir: /tmp/go-processmanager3853150060
2023-09-27 06:35:37 11:35PM DBG GRPC Service Started
2023-09-27 06:35:37 rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:40367: connect: connection refused"
2023-09-27 06:35:37 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr 2023/09/26 23:35:37 gRPC Server listening at 127.0.0.1:40367
2023-09-27 06:35:39 11:35PM DBG GRPC Service Ready
2023-09-27 06:35:39 11:35PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:} sizeCache:0 unknownFields:[] Model:open-llama-3b-q4_0.bin ContextSize:1024 Seed:0 NBatch:512 F16Memory:true MLock:false MMap:false VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:4 MainGPU: TensorSplit: Threads:32 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/open-llama-3b-q4_0.bin Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 Tokenizer: LoraBase: LoraAdapter: NoMulMatQ:false DraftModel: AudioPath: Quantization:}
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr SIGILL: illegal instruction
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr PC=0x89fedc m=3 sigcode=2
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr signal arrived during cgo execution
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr instruction bytes: 0xc4 0xe3 0x7d 0x39 0x8c 0x24 0x18 0x3 0x0 0x0 0x1 0x66 0x89 0x84 0x24 0x0
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr goroutine 38 [syscall]:
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.cgocall(0x822db0, 0xc000341530)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc000341508 sp=0xc0003414d0 pc=0x418c8b
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr github.com/go-skynet/go-llama%2ecpp._Cfunc_load_model(0x7f5b58000cd0, 0x400, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x4, 0x200, ...)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr _cgo_gotypes.go:267 +0x4f fp=0xc000341530 sp=0xc000341508 pc=0x81808f
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr github.com/go-skynet/go-llama%2ecpp.New({0xc0002280a0, 0x1e}, {0xc00022b600, 0x9, 0x9370e0?})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /build/go-llama/llama.go:39 +0x385 fp=0xc000341740 sp=0xc000341530 pc=0x818a85
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr github.com/go-skynet/LocalAI/pkg/backend/llm/llama.(*LLM).Load(0xc000012630, 0xc000300820)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /build/pkg/backend/llm/llama/llama.go:87 +0xc9c fp=0xc000341958 sp=0xc000341740 pc=0x81e11c
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr github.com/go-skynet/LocalAI/pkg/grpc.(*server).LoadModel(0xc000036d90, {0xc000300820?, 0x50e946?}, 0x0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /build/pkg/grpc/server.go:50 +0xe6 fp=0xc000341a08 sp=0xc000341958 pc=0x820e46
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr github.com/go-skynet/LocalAI/pkg/grpc/proto._Backend_LoadModel_Handler({0x9a95a0?, 0xc000036d90}, {0xa90270, 0xc00022c5d0}, 0xc0003460e0, 0x0)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /build/pkg/grpc/proto/backend_grpc.pb.go:264 +0x169 fp=0xc000341a60 sp=0xc000341a08 pc=0x80d4a9
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001fc1e0, {0xa933f8, 0xc0003001a0}, 0xc000356000, 0xc0001fecc0, 0x1189570, 0x0)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:1376 +0xde7 fp=0xc000341e40 sp=0xc000341a60 pc=0x7f6767
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc.(*Server).handleStream(0xc0001fc1e0, {0xa933f8, 0xc0003001a0}, 0xc000356000, 0x0)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:1753 +0x9e7 fp=0xc000341f68 sp=0xc000341e40 pc=0x7fb427
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc.(*Server).serveStreams.func1.1()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:998 +0x8d fp=0xc000341fe0 sp=0xc000341f68 pc=0x7f450d
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goexit()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000341fe8 sp=0xc000341fe0 pc=0x47bfc1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr created by google.golang.org/grpc.(*Server).serveStreams.func1 in goroutine 37
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:996 +0x165
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr goroutine 1 [IO wait]:
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gopark(0x42aea8?, 0x7f5b5ef00228?, 0x78?, 0xdb?, 0x4e847d?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0001edb08 sp=0xc0001edae8 pc=0x44d44e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.netpollblock(0xc0001edb98?, 0x418426?, 0x0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/netpoll.go:564 +0xf7 fp=0xc0001edb40 sp=0xc0001edb08 pc=0x445ed7
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr internal/poll.runtime_pollWait(0x7f5b5ef82eb0, 0x72)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/netpoll.go:343 +0x85 fp=0xc0001edb60 sp=0xc0001edb40 pc=0x476ee5
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr internal/poll.(*pollDesc).wait(0xc0001b8680?, 0x0?, 0x0)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0001edb88 sp=0xc0001edb60 pc=0x4e10e7
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr internal/poll.(*pollDesc).waitRead(...)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr internal/poll.(*FD).Accept(0xc0001b8680)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/internal/poll/fd_unix.go:611 +0x2ac fp=0xc0001edc30 sp=0xc0001edb88 pc=0x4e65cc
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr net.(*netFD).accept(0xc0001b8680)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/net/fd_unix.go:172 +0x29 fp=0xc0001edce8 sp=0xc0001edc30 pc=0x644a69
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr net.(*TCPListener).accept(0xc0000c04c0)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/net/tcpsock_posix.go:152 +0x1e fp=0xc0001edd10 sp=0xc0001edce8 pc=0x65ba1e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr net.(*TCPListener).Accept(0xc0000c04c0)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/net/tcpsock.go:315 +0x30 fp=0xc0001edd40 sp=0xc0001edd10 pc=0x65abd0
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc.(*Server).Serve(0xc0001fc1e0, {0xa8f828?, 0xc0000c04c0})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:859 +0x462 fp=0xc0001ede80 sp=0xc0001edd40 pc=0x7f31c2
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr github.com/go-skynet/LocalAI/pkg/grpc.StartServer({0x7ffdb27f5b6b?, 0xc000024160?}, {0xa93ee0?, 0xc000012630})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /build/pkg/grpc/server.go:178 +0x17d fp=0xc0001edf10 sp=0xc0001ede80 pc=0x82283d
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr main.main()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /build/cmd/grpc/llama/main.go:22 +0x85 fp=0xc0001edf40 sp=0xc0001edf10 pc=0x8229e5
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.main()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:267 +0x2bb fp=0xc0001edfe0 sp=0xc0001edf40 pc=0x44cffb
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goexit()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0001edfe8 sp=0xc0001edfe0 pc=0x47bfc1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr goroutine 2 [force gc (idle)]:
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0000a0fa8 sp=0xc0000a0f88 pc=0x44d44e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goparkunlock(...)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:404
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.forcegchelper()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:322 +0xb3 fp=0xc0000a0fe0 sp=0xc0000a0fa8 pc=0x44d2d3
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goexit()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000a0fe8 sp=0xc0000a0fe0 pc=0x47bfc1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr created by runtime.init.6 in goroutine 1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:310 +0x1a
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr goroutine 3 [GC sweep wait]:
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0000a1778 sp=0xc0000a1758 pc=0x44d44e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goparkunlock(...)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:404
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.bgsweep(0x0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/mgcsweep.go:280 +0x94 fp=0xc0000a17c8 sp=0xc0000a1778 pc=0x439354
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gcenable.func1()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/mgc.go:200 +0x25 fp=0xc0000a17e0 sp=0xc0000a17c8 pc=0x42e4e5
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goexit()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000a17e8 sp=0xc0000a17e0 pc=0x47bfc1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr created by runtime.gcenable in goroutine 1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/mgc.go:200 +0x66
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr goroutine 4 [GC scavenge wait]:
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gopark(0xc0000ca000?, 0xa88a70?, 0x1?, 0x0?, 0xc0000071e0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0000a1f70 sp=0xc0000a1f50 pc=0x44d44e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goparkunlock(...)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:404
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.(*scavengerState).park(0x11d2900)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc0000a1fa0 sp=0xc0000a1f70 pc=0x436be9
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.bgscavenge(0x0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/mgcscavenge.go:653 +0x3c fp=0xc0000a1fc8 sp=0xc0000a1fa0 pc=0x43717c
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gcenable.func2()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/mgc.go:201 +0x25 fp=0xc0000a1fe0 sp=0xc0000a1fc8 pc=0x42e485
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goexit()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000a1fe8 sp=0xc0000a1fe0 pc=0x47bfc1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr created by runtime.gcenable in goroutine 1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/mgc.go:201 +0xa5
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr goroutine 5 [finalizer wait]:
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gopark(0x9d39e0?, 0x10044e501?, 0x0?, 0x0?, 0x455605?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0000a0628 sp=0xc0000a0608 pc=0x44d44e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.runfinq()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/mfinal.go:193 +0x107 fp=0xc0000a07e0 sp=0xc0000a0628 pc=0x42d567
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goexit()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000a07e8 sp=0xc0000a07e0 pc=0x47bfc1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr created by runtime.createfing in goroutine 1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/mfinal.go:163 +0x3d
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr goroutine 35 [select]:
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gopark(0xc00034ff00?, 0x2?, 0x0?, 0x0?, 0xc00034fecc?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc00034fd78 sp=0xc00034fd58 pc=0x44d44e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.selectgo(0xc00034ff00, 0xc00034fec8, 0xc00034fee8?, 0x0, 0x96f980?, 0x1)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/select.go:327 +0x725 fp=0xc00034fe98 sp=0xc00034fd78 pc=0x45cea5
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc00031e050, 0x1)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:418 +0x113 fp=0xc00034ff30 sp=0xc00034fe98 pc=0x76c193
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc000346000)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:552 +0x86 fp=0xc00034ff90 sp=0xc00034ff30 pc=0x76c8c6
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func2()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:341 +0xd5 fp=0xc00034ffe0 sp=0xc00034ff90 pc=0x783835
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goexit()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00034ffe8 sp=0xc00034ffe0 pc=0x47bfc1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 34
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:338 +0x1b0c
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr goroutine 36 [select]:
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gopark(0xc000306f70?, 0x4?, 0xe0?, 0x5?, 0xc000306ec0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc000306d28 sp=0xc000306d08 pc=0x44d44e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.selectgo(0xc000306f70, 0xc000306eb8, 0x0?, 0x0, 0x0?, 0x1)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/select.go:327 +0x725 fp=0xc000306e48 sp=0xc000306d28 pc=0x45cea5
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc/internal/transport.(*http2Server).keepalive(0xc0003001a0)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:1155 +0x225 fp=0xc000306fc8 sp=0xc000306e48 pc=0x78ac85
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func4()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:344 +0x25 fp=0xc000306fe0 sp=0xc000306fc8 pc=0x783725
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goexit()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000306fe8 sp=0xc000306fe0 pc=0x47bfc1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 34
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:344 +0x1b4e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr goroutine 37 [IO wait]:
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.gopark(0x11eaa60?, 0xb?, 0x0?, 0x0?, 0x6?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc00030eaa8 sp=0xc00030ea88 pc=0x44d44e
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.netpollblock(0x4c6378?, 0x418426?, 0x0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/netpoll.go:564 +0xf7 fp=0xc00030eae0 sp=0xc00030eaa8 pc=0x445ed7
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr internal/poll.runtime_pollWait(0x7f5b5ef82db8, 0x72)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/netpoll.go:343 +0x85 fp=0xc00030eb00 sp=0xc00030eae0 pc=0x476ee5
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr internal/poll.(*pollDesc).wait(0xc00022a000?, 0xc000316000?, 0x0)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00030eb28 sp=0xc00030eb00 pc=0x4e10e7
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr internal/poll.(*pollDesc).waitRead(...)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr internal/poll.(*FD).Read(0xc00022a000, {0xc000316000, 0x8000, 0x8000})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/internal/poll/fd_unix.go:164 +0x27a fp=0xc00030ebc0 sp=0xc00030eb28 pc=0x4e23da
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr net.(*netFD).Read(0xc00022a000, {0xc000316000?, 0x1060100000000?, 0x8?})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/net/fd_posix.go:55 +0x25 fp=0xc00030ec08 sp=0xc00030ebc0 pc=0x642a45
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr net.(*conn).Read(0xc00022e000, {0xc000316000?, 0x0?, 0xc00030ecd8?})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/net/net.go:179 +0x45 fp=0xc00030ec50 sp=0xc00030ec08 pc=0x653145
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr net.(*TCPConn).Read(0x0?, {0xc000316000?, 0xc00030eca8?, 0x46b32d?})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr :1 +0x25 fp=0xc00030ec80 sp=0xc00030ec50 pc=0x6658e5
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr bufio.(*Reader).Read(0xc000314000, {0xc000328040, 0x9, 0xc13cf892d832e0b5?})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/bufio/bufio.go:244 +0x197 fp=0xc00030ecb8 sp=0xc00030ec80 pc=0x5bdf17
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr io.ReadAtLeast({0xa8d2e0, 0xc000314000}, {0xc000328040, 0x9, 0x9}, 0x9)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/io/io.go:335 +0x90 fp=0xc00030ed00 sp=0xc00030ecb8 pc=0x4c0570
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr io.ReadFull(...)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/io/io.go:354
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr golang.org/x/net/http2.readFrameHeader({0xc000328040, 0x9, 0xc00029c048?}, {0xa8d2e0?, 0xc000314000?})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/golang.org/x/[email protected]/http2/frame.go:237 +0x65 fp=0xc00030ed50 sp=0xc00030ed00 pc=0x758f25
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr golang.org/x/net/http2.(*Framer).ReadFrame(0xc000328000)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/golang.org/x/[email protected]/http2/frame.go:498 +0x85 fp=0xc00030edf8 sp=0xc00030ed50 pc=0x759665
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams(0xc0003001a0, 0x0?, 0x0?)
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:642 +0x165 fp=0xc00030ef10 sp=0xc00030edf8 pc=0x786aa5
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc.(*Server).serveStreams(0xc0001fc1e0, {0xa933f8?, 0xc0003001a0})
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:985 +0x149 fp=0xc00030ef80 sp=0xc00030ef10 pc=0x7f4289
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr google.golang.org/grpc.(*Server).handleRawConn.func1()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:927 +0x45 fp=0xc00030efe0 sp=0xc00030ef80 pc=0x7f3b65
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr runtime.goexit()
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00030efe8 sp=0xc00030efe0 pc=0x47bfc1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr created by google.golang.org/grpc.(*Server).handleRawConn in goroutine 34
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:926 +0x185
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rax 0x0
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rbx 0xab7620
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rcx 0x7f5b650341a0
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rdx 0x7f5bd563d6d8
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rdi 0x7f5bd563d6c8
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rsi 0x7f5bd5635e38
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rbp 0x7f5b650342c0
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rsp 0x7f5b65033f40
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr r8 0x0
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr r9 0x7f5b58000080
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr r10 0xfffffffffffffaac
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr r11 0x7f5bd5540990
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr r12 0x1
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr r13 0x7f5b65034060
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr r14 0x7f5b65033ff0
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr r15 0x7f5b65034160
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rip 0x89fedc
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr rflags 0x10246
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr cs 0x33
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr fs 0x0
2023-09-27 06:35:39 11:35PM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:40367): stderr gs 0x0
2023-09-27 06:35:39 [172.18.0.1]:40862 500 - POST /v1/chat/completions
2023-09-27 06:35:42 [127.0.0.1]:46062 200 - GET /readyz
@noblerboy2004 You have GPU layers set to 0, So 0% of your GPU will be used... Here is a fixed yaml for your easy copy and paste make sure to RESTART localai after changing a yaml file
backend: llama-stable
context_size: 1024
name: openllama
f16: true
gpu_layers: 30
parameters:
model: open-llama-3b-q4_0.bin
temperature: 0.2
top_k: 80
top_p: 0.7
template:
chat: openllama-chat
completion: openllama-completion
roles:
assistant: 'ASSISTANT:'
system: 'SYSTEM:'
user: 'USER:'
backend: llama-stable
context_size: 1024
f16: true
gpu_layers: 30
name: gpt4all-j-groovy
parameters:
model: ggml-gpt4all-j-v1.3-groovy.bin
temperature: 0.2
top_k: 80
top_p: 0.7
template:
chat: gpt4all-chat
completion: gpt4all-completion
You will need to fix the formatting of the yaml files before restarting localai
As a note, gpt4all is not fully supported at this time, and the open-llama model uses llama-stable not llama
If you would like more info on setting up a model
https://localai.io/howtos/
https://localai.io/howtos/easy-model-import-downloaded/
https://localai.io/advanced/
backend: llama-stable
context_size: 1024
name: openllama
f16: true
gpu_layers: 30
parameters:
model: open-llama-3b-q4_0.bin
temperature: 0.2
top_k: 80
top_p: 0.7
template:
chat: openllama-chat
completion: openllama-completion
roles:
assistant: 'ASSISTANT:'
system: 'SYSTEM:'
user: 'USER:'
Hi luminadori5,
Thank you for your quick action. I did above guide. However, still failed with openllama with log:
2023-09-27 07:04:18 [127.0.0.1]:59474 200 - GET /readyz
2023-09-27 07:04:20 12:04AM DBG Request received:
2023-09-27 07:04:20 12:04AM DBG Configuration read: &{PredictionOptions:{Model:open-llama-3b-q4_0.bin Language: N:0 TopP:0.7 TopK:80 Temperature:0.9 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:openllama F16:true Threads:32 Debug:true Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:false Backend:llama TemplateConfig:{Chat:openllama-chat ChatMessage: Completion:openllama-completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:1000 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:1024 NUMA:false LoraAdapter: LoraBase: NoMulMatQ:false DraftModel: NDraft:0 Quantization:} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}}
2023-09-27 07:04:20 12:04AM DBG Parameters: &{PredictionOptions:{Model:open-llama-3b-q4_0.bin Language: N:0 TopP:0.7 TopK:80 Temperature:0.9 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:openllama F16:true Threads:32 Debug:true Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:false Backend:llama TemplateConfig:{Chat:openllama-chat ChatMessage: Completion:openllama-completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:1000 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:1024 NUMA:false LoraAdapter: LoraBase: NoMulMatQ:false DraftModel: NDraft:0 Quantization:} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}}
2023-09-27 07:04:20 12:04AM DBG Prompt (before templating): USER: How are you?
2023-09-27 07:04:20 12:04AM DBG Template found, input modified to: Q: USER: How are you?\nA:
2023-09-27 07:04:20
2023-09-27 07:04:20 12:04AM DBG Prompt (after templating): Q: USER: How are you?\nA:
2023-09-27 07:04:20
2023-09-27 07:04:20 12:04AM DBG Loading model llama from open-llama-3b-q4_0.bin
2023-09-27 07:04:20 12:04AM DBG Loading model in memory from file: /models/open-llama-3b-q4_0.bin
2023-09-27 07:04:20 12:04AM DBG Loading GRPC Model llama: {backendString:llama model:open-llama-3b-q4_0.bin threads:32 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000681040 externalBackends:map[autogptq:/build/extra/grpc/autogptq/autogptq.py bark:/build/extra/grpc/bark/ttsbark.py diffusers:/build/extra/grpc/diffusers/backend_diffusers.py exllama:/build/extra/grpc/exllama/exllama.py huggingface-embeddings:/build/extra/grpc/huggingface/huggingface.py vall-e-x:/build/extra/grpc/vall-e-x/ttsvalle.py vllm:/build/extra/grpc/vllm/backend_vllm.py] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false}
2023-09-27 07:04:20 12:04AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama
2023-09-27 07:04:20 12:04AM DBG GRPC Service for open-llama-3b-q4_0.bin will be running at: '127.0.0.1:34155'
2023-09-27 07:04:20 12:04AM DBG GRPC Service state dir: /tmp/go-processmanager1578684964
2023-09-27 07:04:20 12:04AM DBG GRPC Service Started
2023-09-27 07:04:20 rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:34155: connect: connection refused"
2023-09-27 07:04:21 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr 2023/09/27 00:04:21 gRPC Server listening at 127.0.0.1:34155
2023-09-27 07:04:22 12:04AM DBG GRPC Service Ready
2023-09-27 07:04:22 12:04AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:} sizeCache:0 unknownFields:[] Model:open-llama-3b-q4_0.bin ContextSize:1024 Seed:0 NBatch:512 F16Memory:true MLock:false MMap:false VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:1000 MainGPU: TensorSplit: Threads:32 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/open-llama-3b-q4_0.bin Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 Tokenizer: LoraBase: LoraAdapter: NoMulMatQ:false DraftModel: AudioPath: Quantization:}
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr SIGILL: illegal instruction
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr PC=0x89fedc m=3 sigcode=2
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr signal arrived during cgo execution
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr instruction bytes: 0xc4 0xe3 0x7d 0x39 0x8c 0x24 0x18 0x3 0x0 0x0 0x1 0x66 0x89 0x84 0x24 0x0
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr goroutine 22 [syscall]:
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.cgocall(0x822db0, 0xc000195530)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc000195508 sp=0xc0001954d0 pc=0x418c8b
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr github.com/go-skynet/go-llama%2ecpp._Cfunc_load_model(0x7f3428000cd0, 0x400, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x3e8, 0x200, ...)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr _cgo_gotypes.go:267 +0x4f fp=0xc000195530 sp=0xc000195508 pc=0x81808f
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr github.com/go-skynet/go-llama%2ecpp.New({0xc0001200c0, 0x1e}, {0xc00012f600, 0x9, 0x9370e0?})
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /build/go-llama/llama.go:39 +0x385 fp=0xc000195740 sp=0xc000195530 pc=0x818a85
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr github.com/go-skynet/LocalAI/pkg/backend/llm/llama.(*LLM).Load(0xc000012630, 0xc0001029c0)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /build/pkg/backend/llm/llama/llama.go:87 +0xc9c fp=0xc000195958 sp=0xc000195740 pc=0x81e11c
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr github.com/go-skynet/LocalAI/pkg/grpc.(*server).LoadModel(0xc000036d90, {0xc0001029c0?, 0x50e946?}, 0x0?)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /build/pkg/grpc/server.go:50 +0xe6 fp=0xc000195a08 sp=0xc000195958 pc=0x820e46
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr github.com/go-skynet/LocalAI/pkg/grpc/proto._Backend_LoadModel_Handler({0x9a95a0?, 0xc000036d90}, {0xa90270, 0xc00012a600}, 0xc00011e150, 0x0)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /build/pkg/grpc/proto/backend_grpc.pb.go:264 +0x169 fp=0xc000195a60 sp=0xc000195a08 pc=0x80d4a9
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001fc1e0, {0xa933f8, 0xc000102340}, 0xc000152000, 0xc0001fecc0, 0x1189570, 0x0)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:1376 +0xde7 fp=0xc000195e40 sp=0xc000195a60 pc=0x7f6767
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc.(*Server).handleStream(0xc0001fc1e0, {0xa933f8, 0xc000102340}, 0xc000152000, 0x0)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:1753 +0x9e7 fp=0xc000195f68 sp=0xc000195e40 pc=0x7fb427
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc.(*Server).serveStreams.func1.1()
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:998 +0x8d fp=0xc000195fe0 sp=0xc000195f68 pc=0x7f450d
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goexit()
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000195fe8 sp=0xc000195fe0 pc=0x47bfc1
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr created by google.golang.org/grpc.(*Server).serveStreams.func1 in goroutine 21
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:996 +0x165
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr goroutine 1 [IO wait]:
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gopark(0x4c80f0?, 0xc0001edb28?, 0x78?, 0xdb?, 0x4e847d?)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0001edb08 sp=0xc0001edae8 pc=0x44d44e
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.netpollblock(0x47a032?, 0x418426?, 0x0?)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/netpoll.go:564 +0xf7 fp=0xc0001edb40 sp=0xc0001edb08 pc=0x445ed7
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr internal/poll.runtime_pollWait(0x7f343829feb0, 0x72)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/netpoll.go:343 +0x85 fp=0xc0001edb60 sp=0xc0001edb40 pc=0x476ee5
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr internal/poll.(*pollDesc).wait(0xc0001b8680?, 0x4?, 0x0)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0001edb88 sp=0xc0001edb60 pc=0x4e10e7
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr internal/poll.(*pollDesc).waitRead(...)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr internal/poll.(*FD).Accept(0xc0001b8680)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/internal/poll/fd_unix.go:611 +0x2ac fp=0xc0001edc30 sp=0xc0001edb88 pc=0x4e65cc
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr net.(*netFD).accept(0xc0001b8680)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/net/fd_unix.go:172 +0x29 fp=0xc0001edce8 sp=0xc0001edc30 pc=0x644a69
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr net.(*TCPListener).accept(0xc0000c04c0)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/net/tcpsock_posix.go:152 +0x1e fp=0xc0001edd10 sp=0xc0001edce8 pc=0x65ba1e
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr net.(*TCPListener).Accept(0xc0000c04c0)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/net/tcpsock.go:315 +0x30 fp=0xc0001edd40 sp=0xc0001edd10 pc=0x65abd0
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc.(*Server).Serve(0xc0001fc1e0, {0xa8f828?, 0xc0000c04c0})
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:859 +0x462 fp=0xc0001ede80 sp=0xc0001edd40 pc=0x7f31c2
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr github.com/go-skynet/LocalAI/pkg/grpc.StartServer({0x7ffd405ffb6b?, 0xc000024160?}, {0xa93ee0?, 0xc000012630})
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /build/pkg/grpc/server.go:178 +0x17d fp=0xc0001edf10 sp=0xc0001ede80 pc=0x82283d
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr main.main()
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /build/cmd/grpc/llama/main.go:22 +0x85 fp=0xc0001edf40 sp=0xc0001edf10 pc=0x8229e5
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.main()
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:267 +0x2bb fp=0xc0001edfe0 sp=0xc0001edf40 pc=0x44cffb
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goexit()
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0001edfe8 sp=0xc0001edfe0 pc=0x47bfc1
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr goroutine 2 [force gc (idle)]:
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0000a0fa8 sp=0xc0000a0f88 pc=0x44d44e
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goparkunlock(...)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:404
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.forcegchelper()
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:322 +0xb3 fp=0xc0000a0fe0 sp=0xc0000a0fa8 pc=0x44d2d3
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goexit()
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000a0fe8 sp=0xc0000a0fe0 pc=0x47bfc1
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr created by runtime.init.6 in goroutine 1
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:310 +0x1a
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr goroutine 3 [GC sweep wait]:
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0000a1778 sp=0xc0000a1758 pc=0x44d44e
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goparkunlock(...)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:404
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.bgsweep(0x0?)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/mgcsweep.go:280 +0x94 fp=0xc0000a17c8 sp=0xc0000a1778 pc=0x439354
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gcenable.func1()
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/mgc.go:200 +0x25 fp=0xc0000a17e0 sp=0xc0000a17c8 pc=0x42e4e5
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goexit()
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000a17e8 sp=0xc0000a17e0 pc=0x47bfc1
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr created by runtime.gcenable in goroutine 1
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/mgc.go:200 +0x66
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr goroutine 4 [GC scavenge wait]:
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gopark(0xc0000ca000?, 0xa88a70?, 0x1?, 0x0?, 0xc0000071e0?)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0000a1f70 sp=0xc0000a1f50 pc=0x44d44e
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goparkunlock(...)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:404
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.(*scavengerState).park(0x11d2900)
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc0000a1fa0 sp=0xc0000a1f70 pc=0x436be9
2023-09-27 07:04:22 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.bgscavenge(0x0?)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/mgcscavenge.go:653 +0x3c fp=0xc0000a1fc8 sp=0xc0000a1fa0 pc=0x43717c
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gcenable.func2()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/mgc.go:201 +0x25 fp=0xc0000a1fe0 sp=0xc0000a1fc8 pc=0x42e485
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goexit()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000a1fe8 sp=0xc0000a1fe0 pc=0x47bfc1
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr created by runtime.gcenable in goroutine 1
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/mgc.go:201 +0xa5
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr goroutine 5 [finalizer wait]:
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gopark(0x9d39e0?, 0x10044e501?, 0x0?, 0x0?, 0x455605?)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0000a0628 sp=0xc0000a0608 pc=0x44d44e
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.runfinq()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/mfinal.go:193 +0x107 fp=0xc0000a07e0 sp=0xc0000a0628 pc=0x42d567
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goexit()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000a07e8 sp=0xc0000a07e0 pc=0x47bfc1
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr created by runtime.createfing in goroutine 1
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/mfinal.go:163 +0x3d
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr goroutine 19 [select]:
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gopark(0xc00014ff00?, 0x2?, 0x0?, 0x0?, 0xc00014fecc?)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc00014fd78 sp=0xc00014fd58 pc=0x44d44e
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.selectgo(0xc00014ff00, 0xc00014fec8, 0xc00014fee8?, 0x0, 0x96f980?, 0x1)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/select.go:327 +0x725 fp=0xc00014fe98 sp=0xc00014fd78 pc=0x45cea5
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0001141e0, 0x1)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:418 +0x113 fp=0xc00014ff30 sp=0xc00014fe98 pc=0x76c193
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc00011e070)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:552 +0x86 fp=0xc00014ff90 sp=0xc00014ff30 pc=0x76c8c6
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func2()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:341 +0xd5 fp=0xc00014ffe0 sp=0xc00014ff90 pc=0x783835
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goexit()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00014ffe8 sp=0xc00014ffe0 pc=0x47bfc1
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 18
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:338 +0x1b0c
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr goroutine 20 [select]:
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gopark(0xc00009c770?, 0x4?, 0xe0?, 0x6?, 0xc00009c6c0?)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc00009c528 sp=0xc00009c508 pc=0x44d44e
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.selectgo(0xc00009c770, 0xc00009c6b8, 0x0?, 0x0, 0x0?, 0x1)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/select.go:327 +0x725 fp=0xc00009c648 sp=0xc00009c528 pc=0x45cea5
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc/internal/transport.(*http2Server).keepalive(0xc000102340)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:1155 +0x225 fp=0xc00009c7c8 sp=0xc00009c648 pc=0x78ac85
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func4()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:344 +0x25 fp=0xc00009c7e0 sp=0xc00009c7c8 pc=0x783725
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goexit()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00009c7e8 sp=0xc00009c7e0 pc=0x47bfc1
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 18
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:344 +0x1b4e
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr goroutine 21 [IO wait]:
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.gopark(0x11eaa60?, 0xb?, 0x0?, 0x0?, 0x6?)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0000b1aa8 sp=0xc0000b1a88 pc=0x44d44e
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.netpollblock(0x4c6378?, 0x418426?, 0x0?)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/netpoll.go:564 +0xf7 fp=0xc0000b1ae0 sp=0xc0000b1aa8 pc=0x445ed7
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr internal/poll.runtime_pollWait(0x7f343829fdb8, 0x72)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/netpoll.go:343 +0x85 fp=0xc0000b1b00 sp=0xc0000b1ae0 pc=0x476ee5
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr internal/poll.(*pollDesc).wait(0xc00012e000?, 0xc000130000?, 0x0)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0000b1b28 sp=0xc0000b1b00 pc=0x4e10e7
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr internal/poll.(*pollDesc).waitRead(...)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr internal/poll.(*FD).Read(0xc00012e000, {0xc000130000, 0x8000, 0x8000})
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/internal/poll/fd_unix.go:164 +0x27a fp=0xc0000b1bc0 sp=0xc0000b1b28 pc=0x4e23da
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr net.(*netFD).Read(0xc00012e000, {0xc000130000?, 0x1060100000000?, 0x8?})
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/net/fd_posix.go:55 +0x25 fp=0xc0000b1c08 sp=0xc0000b1bc0 pc=0x642a45
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr net.(*conn).Read(0xc000116008, {0xc000130000?, 0x0?, 0xc0000b1cd8?})
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/net/net.go:179 +0x45 fp=0xc0000b1c50 sp=0xc0000b1c08 pc=0x653145
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr net.(*TCPConn).Read(0x0?, {0xc000130000?, 0xc0000b1ca8?, 0x46b32d?})
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr :1 +0x25 fp=0xc0000b1c80 sp=0xc0000b1c50 pc=0x6658e5
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr bufio.(*Reader).Read(0xc0001102a0, {0xc000140040, 0x9, 0xc13cfa41baf16b48?})
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/bufio/bufio.go:244 +0x197 fp=0xc0000b1cb8 sp=0xc0000b1c80 pc=0x5bdf17
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr io.ReadAtLeast({0xa8d2e0, 0xc0001102a0}, {0xc000140040, 0x9, 0x9}, 0x9)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/io/io.go:335 +0x90 fp=0xc0000b1d00 sp=0xc0000b1cb8 pc=0x4c0570
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr io.ReadFull(...)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/io/io.go:354
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr golang.org/x/net/http2.readFrameHeader({0xc000140040, 0x9, 0xc00028e000?}, {0xa8d2e0?, 0xc0001102a0?})
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/golang.org/x/[email protected]/http2/frame.go:237 +0x65 fp=0xc0000b1d50 sp=0xc0000b1d00 pc=0x758f25
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr golang.org/x/net/http2.(*Framer).ReadFrame(0xc000140000)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/golang.org/x/[email protected]/http2/frame.go:498 +0x85 fp=0xc0000b1df8 sp=0xc0000b1d50 pc=0x759665
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams(0xc000102340, 0x0?, 0x0?)
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:642 +0x165 fp=0xc0000b1f10 sp=0xc0000b1df8 pc=0x786aa5
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc.(*Server).serveStreams(0xc0001fc1e0, {0xa933f8?, 0xc000102340})
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:985 +0x149 fp=0xc0000b1f80 sp=0xc0000b1f10 pc=0x7f4289
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr google.golang.org/grpc.(*Server).handleRawConn.func1()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:927 +0x45 fp=0xc0000b1fe0 sp=0xc0000b1f80 pc=0x7f3b65
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr runtime.goexit()
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000b1fe8 sp=0xc0000b1fe0 pc=0x47bfc1
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr created by google.golang.org/grpc.(*Server).handleRawConn in goroutine 18
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr /go/pkg/mod/google.golang.org/[email protected]/server.go:926 +0x185
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rax 0x0
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rbx 0xab7620
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rcx 0x7f3438b111a0
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rdx 0x7f34a911a6d8
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rdi 0x7f34a911a6c8
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rsi 0x7f34a9112e38
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rbp 0x7f3438b112c0
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rsp 0x7f3438b10f40
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr r8 0x0
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr r9 0x7f3428000080
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr r10 0xfffffffffffffaac
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr r11 0x7f34a901d990
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr r12 0x1
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr r13 0x7f3438b11060
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr r14 0x7f3438b10ff0
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr r15 0x7f3438b11160
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rip 0x89fedc
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr rflags 0x10246
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr cs 0x33
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr fs 0x0
2023-09-27 07:04:23 12:04AM DBG GRPC(open-llama-3b-q4_0.bin-127.0.0.1:34155): stderr gs 0x0
2023-09-27 07:04:23 [172.18.0.1]:59482 500 - POST /v1/chat/completions
For gpt4all-j-groovy, when changing backend to llama-stable, get similar problem above.
THank you.
When i tried to set gpu-layer for gpt4all-j-groovy with gpt4all backend.

LocalAi use CPU instead of GPU.

I am new to this project, too. It looks like you need to set up gpu_layer in the config somewhere, but I don't know how.
Hi Lunamidori5,yhyu13,
I tried with the following steps below and working now for lunademo.
CUDA Not working
(Note: when docker compose again, all previous install will be erased)
Follow the link: https://localai.io/howtos/easy-model-import-downloaded/
1. Download model: https://huggingface.co/TheBloke/Luna-AI-Llama2-Uncensored-GGML/blob/main/luna-ai-llama2-uncensored.ggmlv3.q5_K_M.bin
2. Create 3 file
In the "lunademo-chat.tmpl" file add
{{.Input}}
ASSISTANT:
In the "lunademo-completion.tmpl" file add
Complete the following sentence: {{.Input}}
In the "lunademo.yaml" file (If you want to see advanced yaml configs - Link)
backend: llama-stable
context_size: 2000
f16: true ## If you are using cpu set this to false
gpu_layers: 30
batch: 512
name: lunademo
parameters:
model: luna-ai-llama2-uncensored.ggmlv3.q5_K_M.bin
temperature: 0.2
top_k: 40
top_p: 0.65
roles:
assistant: 'ASSISTANT:'
system: 'SYSTEM:'
user: 'USER:'
template:
chat: lunademo-chat
completion: lunademo-completion
3. Edit .env file: Notice to remove -DLLAMA_AVX=OFF out of the string CMAKE. Because our CPU support AVX (not support AVX2, AVX512)
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF"
CUDA_VISIBLE_DEVICES=0-1
CUDA_DEVICE_POOL_GPU_OVERRIDE=1
## Set number of threads.
## Note: prefer the number of physical cores. Overbooking the CPU degrades performance notably.
THREADS=32
## Specify a different bind address (defaults to ":8080")
# ADDRESS=127.0.0.1:8080
## Default models context size
# CONTEXT_SIZE=512
#
## Define galleries.
## models will to install will be visible in `/models/available`
GALLERIES=[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]
## CORS settings
# CORS=true
# CORS_ALLOW_ORIGINS=*
## Default path for models
#
MODELS_PATH=/models
## Enable debug mode
DEBUG=true
## Disables COMPEL (Diffusers)
# COMPEL=0
## Enable/Disable single backend (useful if only one GPU is available)
# SINGLE_ACTIVE_BACKEND=true
## Specify a build type. Available: cublas, openblas, clblas.
## cuBLAS: This is a GPU-accelerated version of the complete standard BLAS (Basic Linear Algebra Subprograms) library. It's provided by Nvidia and is part of their CUDA toolkit.
## OpenBLAS: This is an open-source implementation of the BLAS library that aims to provide highly optimized code for various platforms. It includes support for multi-threading and can be compiled to use hardware-specific features for additional performance. OpenBLAS can run on many kinds of hardware, including CPUs from Intel, AMD, and ARM.
## clBLAS: This is an open-source implementation of the BLAS library that uses OpenCL, a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors. clBLAS is designed to take advantage of the parallel computing power of GPUs but can also run on any hardware that supports OpenCL. This includes hardware from different vendors like Nvidia, AMD, and Intel.
BUILD_TYPE=cublas
## Uncomment and set to true to enable rebuilding from source
REBUILD=true
## Enable go tags, available: stablediffusion, tts
## stablediffusion: image generation with stablediffusion
## tts: enables text-to-speech with go-piper
## (requires REBUILD=true)
#
# GO_TAGS=stablediffusion
## Path where to store generated images
# IMAGE_PATH=/tmp
## Specify a default upload limit in MB (whisper)
# UPLOAD_LIMIT
## List of external GRPC backends (note on the container image this variable is already set to use extra backends available in extra/)
# EXTERNAL_GRPC_BACKENDS=my-backend:127.0.0.1:9000,my-backend2:/usr/bin/backend.py
### Advanced settings ###
### Those are not really used by LocalAI, but from components in the stack ###
##
### Preload libraries
# LD_PRELOAD=
### Huggingface cache for models
# HUGGINGFACE_HUB_CACHE=/usr/local/huggingface
### Python backends GRPC max workers
### Default number of workers for GRPC Python backends.
### This actually controls wether a backend can process multiple requests or not.
# PYTHON_GRPC_MAX_WORKERS=1
4. Edit dockercompose file
version: '3.6'
services:
api:
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
image: quay.io/go-skynet/local-ai:master-cublas-cuda12
tty: true # enable colorized logs
restart: always # should this be on-failure ?
ports:
- 8080:8080
env_file:
- .env
volumes:
- ./models:/models
command: ["/usr/bin/local-ai" ]
- Run command: docker-compose up -d --pull always
Hope that helpfull.
Thank you and have a nice day.
Hello, I also had a problem when using gpu version. Have you solved your problem?
I am having the same issue, despite having gpu set in docker-compose and setting gpu_layers in the yaml. Could it be a docker issue?
Hello, I also had a problem when using gpu version. Have you solved your problem?
yes. The comment above show the way fixing my problem.
