[Bug]: Generate Captions does not save files

Open tk421storm opened this issue 10 months ago • 0 comments

What happened?

I'm using OneTrainer with a collection of PNG files named source.####.png and trying to run the "Generate Captions" tool. No matter which settings I choose, no text files are written to the directory. In the terminal I can see the models download and the captioning process complete (the GUI reports 21/21 complete), but no caption files appear in the directory alongside the images.

If I manually enter a caption under the window and press Enter, it successfully creates a text file alongside the image.

What did you expect would happen?

After automatic caption generation, a caption file should appear alongside each image.
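For anyone trying to reproduce this, a small check like the one below can confirm which images ended up without a caption. It is only a sketch: it assumes the caption for an image is a `.txt` file with the same name in the same folder (so `source.0001.png` would pair with `source.0001.txt`), which matches what I see when I save a caption manually. The function names and the set of image extensions are my own and are not taken from OneTrainer.

```python
from pathlib import Path

# Assumed set of image types to check; adjust as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def caption_path_for(image_path: Path) -> Path:
    """Return the assumed caption path: same name with the last suffix swapped for .txt."""
    return image_path.with_suffix(".txt")


def find_missing_captions(directory: Path) -> list[Path]:
    """List images in `directory` that have no non-empty caption file next to them."""
    missing = []
    for image_path in sorted(directory.iterdir()):
        if not image_path.is_file() or image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = caption_path_for(image_path)
        # Treat an empty caption file the same as a missing one.
        if not caption_path.is_file() or caption_path.stat().st_size == 0:
            missing.append(image_path)
    return missing


if __name__ == "__main__":
    import sys

    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    missing = find_missing_captions(target)
    print(f"{len(missing)} image(s) without a caption file in {target}")
    for image_path in missing:
        print(f"  {image_path.name} -> expected {caption_path_for(image_path).name}")
```

Running it with the image folder as the only argument, before and after "Generate Captions", should show whether any caption files were written at all.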

Relevant log output

loading Blip2 model, this may take a while
processor_config.json: 100%|████████████████████████████████████████████████████████████████| 68.0/68.0 [00:00<?, ?B/s]
preprocessor_config.json: 100%|███████████████████████████████████████████████████████████████| 432/432 [00:00<?, ?B/s]
tokenizer_config.json: 100%|██████████████████████████████████████████████████████████████████| 882/882 [00:00<?, ?B/s]
vocab.json: 798kB [00:00, 8.15MB/s]
merges.txt: 456kB [00:00, 7.87MB/s]
tokenizer.json: 3.56MB [00:00, 19.9MB/s]
added_tokens.json: 100%|████████████████████████████████████████████████████████████████████| 23.0/23.0 [00:00<?, ?B/s]
special_tokens_map.json: 100%|████████████████████████████████████████████████████████████████| 548/548 [00:00<?, ?B/s]
config.json: 1.03kB [00:00, 1.03MB/s]
model.safetensors.index.json: 122kB [00:00, 24.3MB/s]
model-00002-of-00002.safetensors: 100%|███████████████████████████████████████████| 4.98G/4.98G [01:06<00:00, 74.8MB/s]
model-00001-of-00002.safetensors: 100%|███████████████████████████████████████████| 10.0G/10.0G [02:27<00:00, 67.7MB/s]
Fetching 2 files: 100%|██████████████████████████████████████████████████████████████████| 2/2 [02:28<00:00, 74.18s/it]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 2/2 [00:03<00:00,  1.76s/it]
generation_config.json: 100%|█████████████████████████████████████████████████████████████████| 141/141 [00:00<?, ?B/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 21/21 [00:30<00:00,  1.44s/it]
loading Blip model, this may take a while
100%|██████████████████████████████████████████████████████████████████████████████████| 21/21 [00:30<00:00,  1.44s/it]
loading Blip model, this may take a while
100%|██████████████████████████████████████████████████████████████████████████████████| 21/21 [00:30<00:00,  1.45s/it]

Generate and upload debug_report.log

=== System Information ===
OS: Windows 10
Version: 10.0.19045

=== Hardware Information ===
CPU: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz (Cores: 8)
Total RAM: 63.91 GB

=== GPU Information ===
AMD Radeon RX 7900 XTX: AMD Radeon RX 7900 XTX [Advanced Micro Devices, Inc.]
Driver Version: 32.0.21001.9024

=== Python Environment ===
Global Python Version: 3.11.9
Python Executable Path: G:\AI\OneTrainer\venv\Scripts\python.exe
PyTorch Info: torch==2.7.1+cu128
pip freeze output:
absl-py==2.3.0 accelerate==1.7.0 aiodns==3.5.0 aiohappyeyeballs==2.6.1 aiohttp==3.12.13 aiohttp-retry==2.9.1 aiosignal==1.3.2 annotated-types==0.7.0 antlr4-python3-runtime==4.9.3 anyio==4.9.0 attrs==25.3.0 av==14.4.0 backoff==2.2.1 bcrypt==4.3.0 bitsandbytes==0.46.0 boto3==1.38.46 botocore==1.38.46 Brotli==1.1.0 certifi==2025.6.15 cffi==1.17.1 charset-normalizer==3.4.2 click==8.2.1 cloudpickle==3.1.1 colorama==0.4.6 coloredlogs==15.0.1 contourpy==1.3.2 cryptography==45.0.4 customtkinter==5.2.2 cycler==0.12.1 dadaptation==3.2 darkdetect==0.8.0 decorator==5.2.1 Deprecated==1.2.18 -e git+https://github.com/huggingface/diffusers.git@73a9d5856f2d7ae3637c484d83cd697284ad3962#egg=diffusers dnspython==2.7.0 email_validator==2.2.0 fabric==3.2.2 fastapi==0.115.14 fastapi-cli==0.0.7 filelock==3.18.0 flatbuffers==25.2.10 fonttools==4.58.4 frozenlist==1.7.0 fsspec==2025.5.1 ftfy==6.3.1 grpcio==1.73.1 h11==0.16.0 httpcore==1.0.9 httptools==0.6.4 httpx==0.28.1 huggingface-hub==0.32.4 humanfriendly==10.0 idna==3.10 imagesize==1.4.1 importlib_metadata==8.7.0 inquirerpy==0.3.4 invisible-watermark==0.2.0 invoke==2.2.0 itsdangerous==2.2.0 Jinja2==3.1.6 jmespath==1.0.1 kiwisolver==1.4.8 lightning-utilities==0.14.3 lion-pytorch==0.2.3 Markdown==3.8.2 markdown-it-py==3.0.0 MarkupSafe==3.0.2 matplotlib==3.10.3 mdurl==0.1.2 -e git+https://github.com/Nerogar/mgds.git@11ff4aa33fff4614bfd835076eabed753a845d83#egg=mgds mpmath==1.3.0 multidict==6.6.3 networkx==3.5 numpy==2.2.6 nvidia-ml-py==12.575.51 omegaconf==2.3.0 -e git+https://github.com/Open-Model-Initiative/OMI-Model-Standards.git@4ad235ceba6b42a97942834b7664379e4ec2d93c#egg=omi_model_standards onnxruntime-gpu==1.22.0 open_clip_torch==2.32.0 opencv-python==4.11.0.86 orjson==3.10.18 packaging==25.0 paramiko==3.5.1 pfzy==0.3.4 pillow==11.2.1 platformdirs==4.3.8
pooch==1.8.2 prettytable==3.16.0 prodigy-plus-schedule-free==1.9.2 prodigyopt==1.1.2 prompt_toolkit==3.0.51 propcache==0.3.2 protobuf==6.31.1 psutil==7.0.0 py-cpuinfo==9.0.0 pycares==4.9.0 pycparser==2.22 pydantic==2.11.7 pydantic-extra-types==2.10.5 pydantic-settings==2.10.1 pydantic_core==2.33.2 Pygments==2.19.2 PyNaCl==1.5.0 pyparsing==3.2.3 pyreadline3==3.5.4 python-dateutil==2.9.0.post0 python-dotenv==1.1.1 python-multipart==0.0.20 pytorch-lightning==2.5.1.post0 pytorch_optimizer==3.6.0 PyWavelets==1.8.0 PyYAML==6.0.2 regex==2024.11.6 requests==2.32.3 rich==14.0.0 rich-toolkit==0.14.7 runpod==1.7.10 s3transfer==0.13.0 safetensors==0.5.3 scalene==1.5.51 scenedetect==0.6.6 schedulefree==1.4.1 scipy==1.15.3 sentencepiece==0.2.0 shellingham==1.5.4 six==1.17.0 sniffio==1.3.1 starlette==0.46.2 sympy==1.14.0 tensorboard==2.19.0 tensorboard-data-server==0.7.2 timm==1.0.16 tokenizers==0.21.2 tomli==2.2.1 tomlkit==0.13.3 torch==2.7.1+cu128 torchmetrics==1.7.3 torchvision==0.22.1+cu128 tqdm==4.67.1 tqdm-loggable==0.2 transformers==4.52.4 typer==0.16.0 typing-inspection==0.4.1 typing_extensions==4.14.0 ujson==5.10.0 urllib3==2.5.0 uvicorn==0.35.0 watchdog==6.0.0 watchfiles==1.1.0 wcwidth==0.2.13 websockets==15.0.1 Werkzeug==3.1.3 wrapt==1.17.2 yarl==1.20.1 yt-dlp==2025.6.25 zipp==3.23.0

=== Git Information ===
Repo: Nerogar/OneTrainer
Branch: master
Commit: 4ebc62d762d464729bc779219a9c8d39d73ecd56
Untracked Files:
.zluda/cublas.dll
.zluda/cublas64_11.dll
.zluda/cudart.dll
.zluda/cufft.dll
.zluda/cusparse.dll
.zluda/cusparse64_11.dll
.zluda/nccl.dll
.zluda/nvcuda.dll
.zluda/nvml.dll
.zluda/nvrtc.dll
.zluda/nvrtc64_112_0.dll
.zluda/zluda.exe
.zluda/zluda_dump.dll
.zluda/zluda_redirect.dll
No modifications relative to upstream (origin/master).

=== Network Connectivity ===
PyPI (https://pypi.org/): Ping to pypi.org successful: Packet Loss: 0%
HuggingFace (https://huggingface.co): Ping to huggingface.co successful: Packet Loss: 0%
Google (https://www.google.com): Ping to www.google.com successful: Packet Loss: 0%

=== Intel Microcode Information ===
CPU is not detected as 13th or 14th Gen Intel - microcode info not applicable.

Jul 01 '25 16:07 tk421storm