
environment

ghost opened this issue 8 months ago · 12 comments

1) Some of the packages in the environment setup require numpy greater than 2.0, but python==3.8 can only install numpy==1.24. Is it OK to upgrade to python==3.10? 2) Does the model require an A100 to run? Some of the code fails to run on the 4090 architecture.

ghost avatar May 05 '25 11:05 ghost

Hi! 1. That should be fine; the environment is basically the same as LAVIS's. 2. A 4090 can run it, but with the default config, training will likely not fit in GPU memory, so you need to lower the batch size.

callsys avatar May 05 '25 11:05 callsys
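To act on the batch-size advice above, the training config can be edited before launching. The excerpt below is a sketch only: the field names follow the LAVIS config convention (`batch_size_train` and friends under `run`), and the exact keys and values in ControlCap's actual config file may differ.

```yaml
# Hypothetical excerpt of a ControlCap/LAVIS-style training config.
# Key names are assumptions based on LAVIS; check the repo's config
# files under the project for the real ones.
run:
  batch_size_train: 8    # lowered from the default to fit 24 GB GPUs
  batch_size_eval: 8
  accum_grad_iters: 4    # optional: gradient accumulation to keep the
                         # effective batch size close to the default
```

Halving the per-GPU batch while doubling `accum_grad_iters` keeps the effective batch size roughly constant at the cost of slower steps.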

> Hi! 1. That should be fine; the environment is basically the same as LAVIS's. 2. A 4090 can run it, but with the default config, training will likely not fit in GPU memory, so you need to lower the batch size.

Thanks for the reply. So do you still recommend the python==3.8 environment?

ghost avatar May 05 '25 11:05 ghost

Yes, LAVIS also uses a python 3.8 environment, and that is what we tested with. But a higher Python version should not make much of a difference.

callsys avatar May 05 '25 11:05 callsys

> Yes, LAVIS also uses a python 3.8 environment, and that is what we tested with. But a higher Python version should not make much of a difference.

Thanks for your reply! Could you share a `conda list` with the versions of all the packages? Much appreciated!

ghost avatar May 05 '25 12:05 ghost

```
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
_libgcc_mutex=0.1=main _openmp_mutex=5.1=1_gnu accelerate=0.30.1=pypi_0 altair=5.3.0=pypi_0 annotated-types=0.7.0=pypi_0 antlr4-python3-runtime=4.9.3=pypi_0 asttokens=2.4.1=pypi_0 attrs=23.2.0=pypi_0 backcall=0.2.0=pypi_0 bleach=6.1.0=pypi_0 blinker=1.8.2=pypi_0 blis=0.7.11=pypi_0 braceexpand=0.1.7=pypi_0 ca-certificates=2024.3.11=h06a4308_0 cachetools=5.3.3=pypi_0 catalogue=2.0.10=pypi_0 certifi=2024.2.2=pypi_0 cfgv=3.4.0=pypi_0 charset-normalizer=3.3.2=pypi_0 click=8.1.7=pypi_0 cloudpathlib=0.16.0=pypi_0 confection=0.1.4=pypi_0 contexttimer=0.3.3=pypi_0 contourpy=1.1.1=pypi_0 cycler=0.12.1=pypi_0 cymem=2.0.8=pypi_0 dcnv4=1.0.0.post2=pypi_0 decorator=5.1.1=pypi_0 decord=0.6.0=pypi_0 distlib=0.3.8=pypi_0 einops=0.8.0=pypi_0 en-core-web-sm=3.7.1=pypi_0 executing=2.0.1=pypi_0 fairscale=0.4.4=pypi_0 filelock=3.14.0=pypi_0 fonttools=4.52.1=pypi_0 fsspec=2024.5.0=pypi_0 ftfy=6.2.0=pypi_0 gitdb=4.0.11=pypi_0 gitpython=3.1.43=pypi_0 huggingface-hub=0.23.1=pypi_0 identify=2.5.36=pypi_0 idna=3.7=pypi_0 imageio=2.34.1=pypi_0 importlib-resources=6.4.0=pypi_0 iopath=0.1.10=pypi_0 ipython=8.12.3=pypi_0 jedi=0.19.1=pypi_0 jinja2=3.1.4=pypi_0 joblib=1.4.2=pypi_0 jsonschema=4.22.0=pypi_0 jsonschema-specifications=2023.12.1=pypi_0 kaggle=1.6.14=pypi_0 kiwisolver=1.4.5=pypi_0 langcodes=3.4.0=pypi_0 language-data=1.2.0=pypi_0 lazy-loader=0.4=pypi_0 ld_impl_linux-64=2.38=h1181459_1 libffi=3.4.4=h6a678d5_1 libgcc-ng=11.2.0=h1234567_1 libgomp=11.2.0=h1234567_1 libstdcxx-ng=11.2.0=h1234567_1 marisa-trie=1.1.1=pypi_0 markdown-it-py=3.0.0=pypi_0 markupsafe=2.1.5=pypi_0 matplotlib=3.7.5=pypi_0 matplotlib-inline=0.1.7=pypi_0 mdurl=0.1.2=pypi_0 mpmath=1.3.0=pypi_0 murmurhash=1.0.10=pypi_0 ncurses=6.4=h6a678d5_0 networkx=3.1=pypi_0 nltk=3.8.1=pypi_0 nodeenv=1.8.0=pypi_0 numpy=1.24.4=pypi_0 nvidia-cublas-cu12=12.1.3.1=pypi_0 nvidia-cuda-cupti-cu12=12.1.105=pypi_0 nvidia-cuda-nvrtc-cu12=12.1.105=pypi_0 nvidia-cuda-runtime-cu12=12.1.105=pypi_0 nvidia-cudnn-cu12=8.9.2.26=pypi_0
nvidia-cufft-cu12=11.0.2.54=pypi_0 nvidia-curand-cu12=10.3.2.106=pypi_0 nvidia-cusolver-cu12=11.4.5.107=pypi_0 nvidia-cusparse-cu12=12.1.0.106=pypi_0 nvidia-nccl-cu12=2.20.5=pypi_0 nvidia-nvjitlink-cu12=12.5.40=pypi_0 nvidia-nvtx-cu12=12.1.105=pypi_0 omegaconf=2.3.0=pypi_0 opencv-python-headless=4.5.5.64=pypi_0 opendatasets=0.1.22=pypi_0 openssl=3.0.13=h7f8727e_2 packaging=24.0=pypi_0 pandas=2.0.3=pypi_0 parso=0.8.4=pypi_0 peft=0.8.2=pypi_0 pexpect=4.9.0=pypi_0 pickleshare=0.7.5=pypi_0 pillow=10.3.0=pypi_0 pip=24.0=py38h06a4308_0 pkgutil-resolve-name=1.3.10=pypi_0 platformdirs=4.2.2=pypi_0 plotly=5.22.0=pypi_0 portalocker=2.8.2=pypi_0 pre-commit=3.5.0=pypi_0 preshed=3.0.9=pypi_0 prompt-toolkit=3.0.43=pypi_0 protobuf=4.25.3=pypi_0 psutil=5.9.8=pypi_0 ptyprocess=0.7.0=pypi_0 pure-eval=0.2.2=pypi_0 pyarrow=16.1.0=pypi_0 pycocoevalcap=1.2=pypi_0 pycocotools=2.0.7=pypi_0 pydantic=2.7.1=pypi_0 pydantic-core=2.18.2=pypi_0 pydeck=0.9.1=pypi_0 pygments=2.18.0=pypi_0 pyparsing=3.1.2=pypi_0 python=3.8.19=h955ad1f_0 python-dateutil=2.9.0.post0=pypi_0 python-magic=0.4.27=pypi_0 python-slugify=8.0.4=pypi_0 pytz=2024.1=pypi_0 pywavelets=1.4.1=pypi_0 pyyaml=6.0.1=pypi_0 readline=8.2=h5eee18b_0 referencing=0.35.1=pypi_0 regex=2024.5.15=pypi_0 requests=2.32.2=pypi_0 rich=13.7.1=pypi_0 rpds-py=0.18.1=pypi_0 safetensors=0.4.3=pypi_0 salesforce-lavis=1.0.2=pypi_0 scenegraphparser=0.1.0=pypi_0 scikit-image=0.21.0=pypi_0 scikit-learn=1.3.2=pypi_0 scipy=1.10.1=pypi_0 seaborn=0.13.2=pypi_0 sentencepiece=0.2.0=pypi_0 setuptools=69.5.1=py38h06a4308_0 six=1.16.0=pypi_0 smart-open=6.4.0=pypi_0 smmap=5.0.1=pypi_0 spacy=3.7.4=pypi_0 spacy-legacy=3.0.12=pypi_0 spacy-loggers=1.0.5=pypi_0 sqlite=3.45.3=h5eee18b_0 srsly=2.4.8=pypi_0 stack-data=0.6.3=pypi_0 streamlit=1.35.0=pypi_0 sympy=1.12=pypi_0 tabulate=0.9.0=pypi_0 tenacity=8.3.0=pypi_0 text-unidecode=1.3=pypi_0 textblob=0.17.1=pypi_0 thinc=8.2.3=pypi_0 threadpoolctl=3.5.0=pypi_0 tifffile=2023.7.10=pypi_0 timm=0.4.12=pypi_0 tk=8.6.14=h39e8969_0
tokenizers=0.13.3=pypi_0 toml=0.10.2=pypi_0 toolz=0.12.1=pypi_0 torch=2.1.2+cu121=pypi_0 torchaudio=2.1.2+cu121=pypi_0 torchvision=0.16.2+cu121=pypi_0 tornado=6.4=pypi_0 tqdm=4.66.4=pypi_0 traitlets=5.14.3=pypi_0 transformers=4.26.1=pypi_0 triton=2.1.0=pypi_0 typer=0.9.4=pypi_0 typing-extensions=4.12.0=pypi_0 tzdata=2024.1=pypi_0 urllib3=2.2.1=pypi_0 virtualenv=20.26.2=pypi_0 wasabi=1.1.2=pypi_0 watchdog=4.0.1=pypi_0 wcwidth=0.2.13=pypi_0 weasel=0.3.4=pypi_0 webdataset=0.2.86=pypi_0 webencodings=0.5.1=pypi_0 wheel=0.43.0=py38h06a4308_0 xz=5.4.6=h5eee18b_1 zipp=3.18.2=pypi_0 zlib=1.2.13=h5eee18b_1
```

callsys avatar May 05 '25 13:05 callsys

[image]

Hello, thanks for the earlier answers! Of the three files that running data.sh requires, the bottom two can be downloaded, but the missing ones at the top, data/vg/annotations/vg1.0/densecap_splits.json and data/vg/annotations/vg1.2/densecap_splits.json, no longer seem to be available anywhere?

ghost avatar May 07 '25 08:05 ghost

Hi, we have uploaded converted annotations. You can train directly from these converted annotations; there is no need to run data.sh.

callsys avatar May 07 '25 08:05 callsys

```
====== Model Attributes ======
2025-05-11 16:52:59,415 [INFO] {
    "apply_lemmatizer": false,
    "arch": "controlcap_t5",
    "do_sample": false,
    "drop_path_rate": 0,
    "finetune_llm": false,
    "first_word_control": false,
    "freeze_vit": true,
    "img_size": 224,
    "length_penalty": 0,
    "max_new_tokens": 20,
    "max_txt_len": 32,
    "min_length": 1,
    "num_beams": 2,
    "num_query_token": 32,
    "num_return_sequences": 1,
    "pretrained": "https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/blip2_pretrained_flant5xl.pth",
    "repetition_penalty": 1.5,
    "t5_model": "google/flan-t5-xl",
    "tag_bert_config": "controlcap/models/tagging_heads/tag_bert_config.json",
    "tag_list": "controlcap/common/tagging/ram_tag_list.txt",
    "tag_thr": 0.7,
    "temperature": 1,
    "top_p": 0.9,
    "use_grad_checkpoint": false,
    "vit_model": "eva_clip_g",
    "vit_precision": "fp16"
}
2025-05-11 16:52:59,416 [INFO] Building datasets...
loading annotations into memory...
[2025-05-11 16:59:08,802] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2243255 closing signal SIGTERM
[2025-05-11 16:59:08,808] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2243257 closing signal SIGTERM
[2025-05-11 16:59:08,808] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2243258 closing signal SIGTERM
[2025-05-11 16:59:09,876] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: -9) local_rank: 1 (pid: 2243256) of binary: /data/xsf/anaconda3/envs/controlcap/bin/python
Traceback (most recent call last):
  File "/data/xsf/anaconda3/envs/controlcap/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/xsf/anaconda3/envs/controlcap/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/xsf/anaconda3/envs/controlcap/lib/python3.8/site-packages/torch/distributed/run.py", line 810, in <module>
    main()
  File "/data/xsf/anaconda3/envs/controlcap/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/data/xsf/anaconda3/envs/controlcap/lib/python3.8/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/data/xsf/anaconda3/envs/controlcap/lib/python3.8/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/data/xsf/anaconda3/envs/controlcap/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/data/xsf/anaconda3/envs/controlcap/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
train.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time       : 2025-05-11_16:59:08
  host       : user-Super-Server
  rank       : 1 (local_rank: 1)
  exitcode   : -9 (pid: 2243256)
  error_file : <N/A>
  traceback  : Signal 9 (SIGKILL) received by PID 2243256
============================================================
```

Hello! My experimental setup is four 4090s (24G). When the run reaches "loading annotations into memory...", the server often disconnects or cannot proceed any further. What could the problem be? The terminal output is above. I hope you can help; thanks!

ghost avatar May 11 '25 09:05 ghost

"loading annotations into memory..." is the message that pycocotools prints when it loads a COCO-format annotation file. It looks like the COCO(ann_file) call at line 46 of controlcap/datasets/dataset.py is getting stuck. You could first debug that part in isolation to see what the cause is.

callsys avatar May 11 '25 10:05 callsys
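One detail worth noting alongside the debugging advice above: an exit code of -9 means the child process received SIGKILL, which on Linux is commonly the kernel OOM killer reaping a process that exhausted host RAM. That would fit a hang at "loading annotations into memory..." when each of the four ranks loads its own copy of a large VG annotation JSON. A generic way to check, using standard Linux tools rather than anything project-specific:

```shell
# Did the kernel OOM killer terminate the training processes?
# (dmesg may require root; the `|| true` keeps the check non-fatal.)
dmesg -T 2>/dev/null | grep -i -E "killed process|out of memory" | tail -n 5 || true

# Watch host memory while the annotations load; if available memory
# drops to near zero just before the crash, it is an OOM, not a hang.
free -h
```

If it is host-memory exhaustion, reducing the number of dataloader workers or loading the annotations once and sharing them across ranks are the usual mitigations.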

The converted annotation files are all saved in COCO format; you can check whether these annotation files can be read normally by COCO.

callsys avatar May 11 '25 10:05 callsys
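As a quick way to act on this suggestion, the sketch below checks that a converted annotation file at least parses as JSON with the standard COCO top-level keys before handing it to pycocotools' COCO(). The helper name and the minimal-key check are illustrative assumptions, not part of the ControlCap code:

```python
import json

# Standard top-level keys of a COCO-format annotation file.
REQUIRED_KEYS = {"images", "annotations", "categories"}

def check_coco_file(path):
    """Return True if `path` parses as JSON and has the COCO top-level keys."""
    with open(path) as f:
        data = json.load(f)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        print(f"{path}: missing keys {sorted(missing)}")
        return False
    print(f"{path}: {len(data['images'])} images, "
          f"{len(data['annotations'])} annotations")
    return True
```

If this passes, the next step is timing `from pycocotools.coco import COCO; COCO(path)` itself, since a very large file can also simply be slow to index or exceed host memory.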

[image]

Hello! The file assets/groundingdino_swint_ogc.pth does not seem to be in the project listing. Where can I download it?

ghost avatar Jun 05 '25 03:06 ghost

Hi, that checkpoint can be found in the groundingdino repo. But this part of the code does not appear to be used in the project, so there should be no need to download it.

callsys avatar Jun 06 '25 02:06 callsys