[Bug] LoRA finetune doesn't generate any output text
Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
I finetuned InternVL-2B with LoRA and uploaded the model to Hugging Face after copying the missing scripts over from the original model repo: https://huggingface.co/shivavardhineedi/my_mini_internVL_lora
When I try to run inference on it, I get no output text.
As far as I can tell the checkpoint is already a merged model anyway: there is no adapter_config.json, so there should be no need to load it through the `peft` PeftModel module. Why is it not generating any response?
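For reference, one quick way to tell whether a checkpoint directory is a raw PEFT adapter or an already-merged model is to look for `adapter_config.json`, the file PEFT writes next to the adapter weights. A minimal sketch (the helper name `checkpoint_kind` is just illustrative, not part of any library):

```python
import json
import os

def checkpoint_kind(model_dir: str) -> str:
    """Heuristic check: PEFT writes an adapter_config.json next to adapter
    weights, while a merged/standalone checkpoint has none."""
    adapter_cfg = os.path.join(model_dir, "adapter_config.json")
    if os.path.exists(adapter_cfg):
        with open(adapter_cfg) as f:
            base = json.load(f).get("base_model_name_or_path")
        return f"peft adapter (base model: {base})"
    return "merged/standalone model"
```

If this reports a merged model, loading it with `AutoModel.from_pretrained(..., trust_remote_code=True)` directly (no PeftModel wrapper) should be the right path.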
Reproduction
I finetuned following the 2nd_finetune documentation and uploaded the contents of the output work_dir to Hugging Face, but the uploaded model does not work.
Environment
```dockerfile
# Use an official NVIDIA CUDA image as a base
FROM nvidia/cuda:11.8.0-devel-ubuntu20.04

# Set environment variables
ENV CONDA_ENV=internvl
ENV PATH /opt/conda/envs/$CONDA_ENV/bin:$PATH
ENV TZ=America/New_York

# Set the timezone
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone

# Install dependencies
RUN apt-get update && apt-get install -y \
    git \
    wget \
    build-essential \
    libgl1-mesa-glx \
    libglib2.0-0 \
    && apt-get clean

# Install Miniconda
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh && \
    /bin/bash /tmp/miniconda.sh -b -p /opt/conda && \
    rm /tmp/miniconda.sh && \
    /opt/conda/bin/conda clean -a && \
    ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh

# Clone the InternVL repository
RUN git clone https://github.com/OpenGVLab/InternVL.git /workspace/InternVL

# Create conda environment
RUN /opt/conda/bin/conda create -n $CONDA_ENV python=3.9 -y && \
    /opt/conda/bin/conda clean -a -y

# Set the working directory
WORKDIR /workspace/InternVL/internvl_chat

# RUN pip install flash-attn==2.3.6 --no-build-isolation
# Install Python packages using pip
RUN /bin/bash -c "source /opt/conda/etc/profile.d/conda.sh && conda activate $CONDA_ENV && \
    pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118 && \
    pip install packaging && \
    pip install flash-attn==2.3.6 --no-build-isolation && \
    pip install timm==0.9.12 && \
    pip install -U openmim && \
    pip install transformers==4.40.0 && \
    pip install -U "huggingface_hub[cli]" && \
    pip install opencv-python termcolor yacs pyyaml scipy deepspeed==0.13.5 pycocoevalcap tqdm pillow tensorboardX datasets mlfoundry truefoundry orjson peft sentencepiece"

RUN pip install -r /workspace/InternVL/requirements.txt
```
Error traceback
No response
I experienced the same issue finetuning the 40B variant: loading the resulting model produced no output.
I had to run `python tools/merge_lora.py` on the finetuned model first. The docs do not make this step clear, and I assume many other engineers familiar with LoRA/PEFT expect LoRA adapters to be small external artifacts rather than being contained (but not merged?) in the finetuning output.
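For context, merging a LoRA adapter folds the low-rank update back into the base weight, W' = W + (alpha / r) * B A; this is conceptually what a merge script (or peft's `merge_and_unload()`) does. A minimal NumPy sketch of the math only; the shapes and scaling here are illustrative, not InternVL's actual code:

```python
import numpy as np

def merge_lora_weight(W, A, B, alpha):
    """Fold a LoRA update into the base weight: W' = W + (alpha / r) * (B @ A)."""
    r = A.shape[0]            # LoRA rank (rows of the down-projection)
    scaling = alpha / r
    return W + scaling * (B @ A)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))   # base weight, shape (out_features, in_features)
A = rng.standard_normal((4, 8))   # LoRA down-projection, shape (r, in_features)
B = np.zeros((8, 4))              # LoRA up-projection, zero-initialised as in LoRA training
# With B still zero, merging is a no-op, matching a freshly initialised adapter:
assert np.allclose(merge_lora_weight(W, A, B, alpha=8.0), W)
```

After training, B is no longer zero, so skipping the merge (or loading the checkpoint as if it were plain weights) silently discards the learned update, which matches the "no output / wrong output" symptom.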
I then copied the *.py files to the merged folder (as per the instructions); config.json was already present, but I overrode it anyway.
At that point I was able to generate captions aligned with my training data, but they were prefixed with `<s>`. I fixed that by overwriting the remaining *.json configs in the merged folder with the ones from the original 40B model.
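A side note on the stray `<s>`: it is the BOS special token leaking into the decoded text (with `transformers`, passing `skip_special_tokens=True` to `tokenizer.decode` suppresses it). If overwriting the configs is not an option, a tokenizer-independent post-processing sketch like the following can strip it; the helper name and token list are illustrative assumptions:

```python
def strip_special_tokens(text, special_tokens=("<s>", "</s>", "<unk>")):
    """Remove leading/trailing special-token markers left in decoded output."""
    text = text.strip()
    changed = True
    while changed:
        changed = False
        for tok in special_tokens:
            if text.startswith(tok):
                text = text[len(tok):].lstrip()
                changed = True
            if text.endswith(tok):
                text = text[: -len(tok)].rstrip()
                changed = True
    return text

print(strip_special_tokens("<s> A photo of a cat"))  # -> A photo of a cat
```

Fixing the tokenizer/generation configs, as described above, is still the cleaner solution.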
Have you solved it? I'm experiencing the same issue. Could you share the details of how you resolved it?