
Artifacts during text2img generation

EvgeniyWis opened this issue 8 months ago

I have an endpoint with a custom Docker image that I use to generate images through the Fooocus API. LoRAs, checkpoints, and so on are installed in this Docker image. My workflow sends many requests to the endpoint queue at once. The first requests are executed correctly, but over time, by roughly the 20th request, I start to receive artifacts like the ones in the images below:

[5 attached screenshots]

P.S. I censored the images with the girls; the point is that the images usually turn out realistic, but here they are more cartoonish.

So the problem is that either the graphics stop being realistic, or something else happens altogether, whether it's some kind of monsters or a picture of white noise. I didn't find any errors in the logs, and I tried increasing the number of GPUs, but it didn't help. I'd be glad of any help.

Worth knowing: this never happens during the first requests, so it feels as if the Fooocus API simply "gets tired" and starts generating images of this quality.

EvgeniyWis, May 29 '25 15:05

Didn't you add the ENV PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True you mentioned in the memory issue? That could definitely worsen memory leaks, fragmentation, and the kind of "ghosting" or quality degradation over time seen in diffusion models. Using many LoRAs, or switching between them and different models, increases the chance of this happening. AI frameworks (both image and LLM) usually try to cache as much as they can so they don't have to load everything from scratch on each job. That improves the experience if you reuse the same things over and over, but it can hurt the system if you don't, and instead generate a lot of diverse outputs one after another.
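If you want to confirm whether allocations actually build up across jobs, a minimal sketch, assuming a standard PyTorch runtime inside the worker (the log_cuda_memory name is just a placeholder, not part of this repo), is to log allocator stats at the end of each handler invocation and watch whether the numbers climb:

```python
import torch

def log_cuda_memory(tag: str) -> None:
    """Print allocator stats so memory growth across jobs shows up in the worker logs."""
    if not torch.cuda.is_available():
        return
    allocated = torch.cuda.memory_allocated() / 1024**2  # memory used by live tensors
    reserved = torch.cuda.memory_reserved() / 1024**2    # memory held by the caching allocator
    print(f"[{tag}] allocated={allocated:.0f} MiB, reserved={reserved:.0f} MiB")
```

If `reserved` keeps growing job after job while `allocated` stays flat, you're looking at fragmentation or cache build-up rather than a plain leak.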

With the issues you've posted so far, I've noticed you're probably trying to push the container in terms of size and what it can do, right? You're using 40 and 80GB cards yet still running out of memory. And now you mention your image has checkpointS, plural, meaning you're switching between multiple ones? If that's the case, you'd want to dive deeper into Fooocus and handler.py and turn down the caching and reuse as much as possible, handle warm/cold workers yourself, or even turn off FlashBoot completely, though that will also hurt performance. What I'd recommend instead is to keep the image lean and create multiple endpoints for different models/workflows rather than a single huge one for everything.
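As a rough illustration of "turning down the caching", a hedged sketch (the free_between_jobs name and the cached_pipeline variable are placeholders, not actual RunPod-Fooocus-API code): release Python references and empty the CUDA cache after each job, and optionally drop the loaded pipeline so the next request reloads the checkpoint from disk.

```python
import gc
import torch

# Placeholder for whatever object the handler keeps the loaded checkpoint in.
cached_pipeline = None

def free_between_jobs(drop_models: bool = False) -> None:
    """Release as much GPU memory as possible between queued jobs."""
    global cached_pipeline
    if drop_models:
        # Trades speed for stability: the next job reloads weights from disk.
        cached_pipeline = None
    gc.collect()              # drop Python-side references first
    torch.cuda.empty_cache()  # return cached blocks to the CUDA driver
    torch.cuda.ipc_collect()  # clean up memory held by dead CUDA IPC handles
```

Calling something like this at the end of every handler invocation is the blunt version of the trade-off described above: more stable memory, slower per-job startup.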

We've been using such images on a 4090 for a long time, and while I've sometimes observed a bit of style drift between requests, both there and locally, I never had OOMs or this kind of degradation. Even now I tested it again with all the API endpoints Fooocus has, trying to load as many control models as possible into the memory of a single worker, and the quality was the same after 20+ requests.
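For anyone trying to reproduce this, a minimal sketch of that kind of test loop, assuming a RunPod serverless endpoint and the /runsync route; the payload fields here are purely illustrative and should be replaced with your actual Fooocus-API request body:

```python
import requests

ENDPOINT_ID = "your-endpoint-id"  # placeholder
API_KEY = "your-runpod-api-key"   # placeholder
URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"

# Illustrative body; use your real Fooocus-API text2img parameters here.
payload = {"input": {"prompt": "photo of a forest, realistic", "seed": -1}}

for i in range(25):
    # Send the same request repeatedly; quality drift after ~20 jobs should show up.
    resp = requests.post(
        URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=600,
    )
    resp.raise_for_status()
    print(f"job {i + 1}: status={resp.json().get('status')}")
```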

This is also why it's hard for me to reproduce it and help you if your image looks very different from the default one. I'd need to know a lot more about what you're trying to achieve and everything you've changed, and even then I still wouldn't be as qualified to solve it as the developers of the original projects I just made RunPod implementations for.

davefojtik, May 29 '25 22:05

Distributing the LoRAs and checkpoints across several endpoints helped, but what helped most was giving each endpoint a 24GB RTX 4090. Thank you very much!

EvgeniyWis, Jun 07 '25 11:06

Still, from time to time I get similar images that don't match the prompt at all:

[attached image]

Moreover, I even created my own repository in which I removed all caching of LoRAs and checkpoints, but even that didn't help. So I'm not sure, but maybe it makes sense to reopen this issue? Or maybe I can influence it somehow? Or is this actually FooocusAPI's area of responsibility? It's not a very common problem, but I thought it was worth reporting.

EvgeniyWis, Jul 25 '25 18:07