
Mismatch between Hugging Face Pretrained Model Description and GitHub Implementation

Open rajaram6052150 opened this issue 1 year ago • 7 comments

I have noticed a discrepancy between the pretrained model mentioned on the Hugging Face website and its actual implementation available in the GitHub repository. Specifically:

Hugging Face Model Page: https://huggingface.co/IGNF/FLAIR-INC_rgbie_15cl_resnet34-unet/blob/main/FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth

Error: You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Traceback (most recent call last):
  File "\\?\C:\Users\Rajaram\anaconda3\envs\GPU\Scripts\flair-detect-script.py", line 33, in <module>
    sys.exit(load_entry_point('flair', 'console_scripts', 'flair-detect')())
  File "e:\gpu\unet\flair-1-main\src\zone_detect\main.py", line 143, in main
    sliced_dataframe, profile, resolution, model = prepare(config, device)
  File "e:\gpu\unet\flair-1-main\src\zone_detect\main.py", line 116, in prepare
    model = load_model(config)
  File "e:\gpu\unet\flair-1-main\src\zone_detect\model.py", line 84, in load_model
    model.load_state_dict(state_dict=state_dict, strict=True)
  File "C:\Users\Rajaram\anaconda3\envs\GPU\Lib\site-packages\torch\nn\modules\module.py", line 2215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UperNetForSemanticSegmentation:
  Missing key(s) in state_dict: "backbone.embeddings.patch_embeddings.projection.weight", "backbone.embeddings.patch_embeddings.projection.bias", "backbone.embeddings.norm.weight", "backbone.embeddings.norm.bias", "backbone.encoder.layers.0.blocks.0.layernorm_before.weight", "backbone.encoder.layers.0.blocks.0.layernorm_before.bias", "backbone.encoder.layers.0.blocks.0.attention.self.relative_position_bias_table", "backbone.encoder.layers.0.blocks.0.attention.self.relative_position_index", "backbone.encoder.layers.0.blocks.0.attention.self.query.weight", "backbone.encoder.layers.0.blocks.0.attention.self.query.bias", "backbone.encoder.layers.0.blocks.0.attention.self.key.weight", "backbone.encoder.layers.0.blocks.0.attention.self.key.bias", "backbone.encoder.layers.0.blocks.0.attention.self.value.weight", "backbone.encoder.layers.0.blocks.0.attention.self.value.bias", "backbone.encoder.layers.0.blocks.0.attention.output.dense.weight", "backbone.encoder.layers.0.blocks.0.attention.output.dense.bias", "backbone.encoder.layers.0.blocks.0.layernorm_after.weight", "backbone.encoder.layers.0.blocks.0.layernorm_after.bias", "backbone.encoder.layers.0.blocks.0.intermediate.dense.weight", "backbone.encoder.layers.0.blocks.0.intermediate.dense.bias", "backbone.encoder.layers.0.blocks.0.output.dense.weight", "backbone.encoder.layers.0.blocks.0.output.dense.bias", "backbone.encoder.layers.0.blocks.1.layernorm_before.weight", "backbone.encoder.layers.0.blocks.1.layernorm_before.bias", "backbone.encoder.layers.0.blocks.1.attention.self.relative_position_bias_table", "backbone.encoder.layers.0.blocks.1.attention.self.relative_position_index", "backbone.encoder.layers.0.blocks.1.attention.self.query.weight",

rajaram6052150 avatar Aug 09 '24 08:08 rajaram6052150

Hello @rajaram6052150 The pre-trained models currently available on our HF page were trained with segmentation-models-pytorch. This might indeed not be clear for now; we will update the model cards and model names when we release pre-trained models trained with HF. So for now your config file should use something like:

model_weights: ../FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth
model_framework: 
    model_provider: SegmentationModelsPytorch
    HuggingFace:
        org_model: 
    SegmentationModelsPytorch:
        encoder_decoder: resnet34_unet
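To illustrate what the `model_provider` field above controls, here is a hedged sketch of dispatching on it; the field names come from the config snippet above, but the dispatch logic itself is an assumption for illustration, not the repository's actual `load_model` code:

```python
# Config mirroring the YAML snippet above, as a plain dict.
config = {
    "model_weights": "../FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth",
    "model_framework": {
        "model_provider": "SegmentationModelsPytorch",
        "HuggingFace": {"org_model": None},
        "SegmentationModelsPytorch": {"encoder_decoder": "resnet34_unet"},
    },
}

def pick_loader(cfg):
    """Choose a loader path based on model_provider (illustrative only)."""
    framework = cfg["model_framework"]
    provider = framework["model_provider"]
    if provider == "SegmentationModelsPytorch":
        return f"smp:{framework[provider]['encoder_decoder']}"
    if provider == "HuggingFace":
        return f"hf:{framework[provider]['org_model']}"
    raise ValueError(f"unknown provider: {provider}")

print(pick_loader(config))  # smp:resnet34_unet
```

With `model_provider: SegmentationModelsPytorch`, the weights are loaded into an smp resnet34 U-Net rather than an HF UperNet, which avoids the key-mismatch error.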

agarioud avatar Aug 09 '24 08:08 agarioud

Hello @agarioud ,

Thank you for the clarification regarding the pre-trained models. However, I have checked the specified path in the GitHub repository, and it appears that the file FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth is not present there.

Could you please provide the correct path or a link to download the pre-trained model file? It would be very helpful for continuing with the setup.

Thank you!

Best regards, Rajaram

rajaram6052150 avatar Aug 09 '24 09:08 rajaram6052150

You had the right link: https://huggingface.co/IGNF/FLAIR-INC_rgbie_15cl_resnet34-unet/blob/main/FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth Once it is downloaded locally, adjust the 'model_weights' path in the config file to point to it.

agarioud avatar Aug 09 '24 09:08 agarioud

Thank you for the help, @agarioud. Sorry to bother you again. I have successfully set up and run inference with the FLAIR-INC_rgbie_15cl_resnet34-unet model after fixing the model_weights path. However, the output images all come out completely black.

Here are the details of my setup and problem:

Setup:
- Model weights path: correctly set to the weights file downloaded from Hugging Face.
- Input image: a georeferenced raster with the following metadata:
    - Driver: GTiff
    - Dtype: uint8
    - Number of bands: 5
    - Image shape: (5, 512, 512)
    - CRS: EPSG:2154
- Configuration parameters:
    - Output type: argmax
    - Normalization: custom
    - Image size for detection: 512
    - Overlap margin: 128

Issue:
- Observation: the output raster images are completely black for all processed images.
- Error message: none. The model loads successfully and performs inference without issues.

rajaram6052150 avatar Aug 09 '24 09:08 rajaram6052150

@rajaram6052150, the flair-detect command is meant to infer over a 'large' area with overlapping inferences. If your image is 512*512 and img_pixels_detection is also 512 but with a 128-pixel overlap, it may yield some inconsistencies. Also, the output rasters are 'black', but do they contain values?
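A quick way to check whether an "all black" argmax output actually carries labels is to inspect its distinct values: with 15 classes, the indices 0-14 are nearly invisible when a uint8 raster is displayed on a 0-255 scale. A stdlib sketch, with a toy nested list standing in for the raster band:

```python
from collections import Counter

# Toy 4x4 "argmax" band: class indices this small render near-black in uint8.
band = [
    [0, 0, 3, 3],
    [0, 1, 3, 7],
    [1, 1, 7, 7],
    [1, 2, 2, 7],
]

counts = Counter(v for row in band for v in row)
print(sorted(counts))  # distinct class indices present in the raster
print(max(counts))     # maximum value is far below 255, hence "black" on screen
```

In practice the same check can be done on the real GeoTIFF band (e.g. reading it with rasterio and inspecting its unique values).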

agarioud avatar Aug 09 '24 10:08 agarioud

Big thanks for the guidance, sir @agarioud. It turns out that the black images do indeed contain values representing the different classes, and the segmentation results are accurate. Your suggestion helped us identify this.

I also wanted to ask: Will this implementation work with JPG or PNG images as well, or do I need to modify the code for those formats?
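For anyone hitting the same "black image" confusion: the class-index raster can be stretched to the full 8-bit range for a quick visual check. This is purely a display trick, not part of flair-detect; the helper name and the 15-class count are taken from the model name above:

```python
NUM_CLASSES = 15  # FLAIR-INC 15-class nomenclature

def stretch_for_display(band, num_classes=NUM_CLASSES):
    """Map class indices 0..num_classes-1 onto 0..255 so they become visible."""
    scale = 255 // (num_classes - 1)
    return [[v * scale for v in row] for row in band]

band = [[0, 7], [14, 3]]
print(stretch_for_display(band))  # [[0, 126], [252, 54]]
```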

rajaram6052150 avatar Aug 09 '24 12:08 rajaram6052150

Glad it worked out @rajaram6052150, happy to help. flair-detect won't work with JPG/PNG images, as it is meant to work with georeferenced inputs. I haven't planned to add support for this in the near future, so you might want to either convert your inputs or indeed modify the code. Best,

agarioud avatar Aug 09 '24 13:08 agarioud