
Loading a pre-trained model core to get responses to images

Open dp4846 opened this issue 1 year ago • 6 comments

Is there a straightforward way to get a pre-trained model core from which I can put in images (numpy arrays) and get responses from the feature map?

I have yet to figure out how to do so. I have managed to install the code. I looked at 0_baseline_CNN.ipynb, and it appears that getting a baseline model requires training it. To save electricity and time, I was hoping to get pre-trained weights and load them into the baseline model.

Ideally, I would bypass the custom data loaders and the shifter network. If I could do this outside of the Docker environment, that would be the cherry on top. Sorry in advance if the answer is obvious!

dp4846 avatar Jan 17 '25 17:01 dp4846

Hi there! Thank you for your questions.

Yes, it is possible to load the pretrained model. Have a look here: https://github.com/sinzlab/sensorium/blob/main/notebooks/model_tutorial/2_model_evaluation_and_inspection.ipynb

You can just run this:

model.load_state_dict(torch.load("./model_checkpoints/pretrained/generalization_model.pth"));
model.cuda().eval();

# to get the embedded features in response to some images
some_input_image = torch.ones(10, 1, 36, 64).cuda()
model_features = model.core(some_input_image) # these are the features that the core outputs

# to get the neuronal responses of the trained model
responses = model(some_input_image, data_key='23964-4-22')

# the data key corresponds to the recording session. there are 7 sessions. to see the different data_keys, just print(model.readout)
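
Since the question was about NumPy arrays, here is a minimal sketch of turning a batch of grayscale images into the (batch, 1, 36, 64) float tensor used above. The helper name images_to_tensor is hypothetical, and the sketch only handles shape and dtype; it assumes the images are already 36 x 64 and does not cover whatever intensity normalization the pretrained model expects.

```python
import numpy as np
import torch


def images_to_tensor(images: np.ndarray) -> torch.Tensor:
    """Convert (N, H, W) or (H, W) grayscale arrays to a (N, 1, H, W) float32 tensor.

    Hypothetical helper: it only fixes shape and dtype, no resizing or normalization.
    """
    arr = np.asarray(images, dtype=np.float32)
    if arr.ndim == 2:  # single image -> batch of one
        arr = arr[None, ...]
    if arr.ndim != 3:
        raise ValueError(f"expected (N, H, W) or (H, W), got shape {arr.shape}")
    return torch.from_numpy(arr).unsqueeze(1)  # add the channel axis


batch = images_to_tensor(np.zeros((10, 36, 64)))
print(batch.shape)  # torch.Size([10, 1, 36, 64])
```

The resulting tensor can be passed to model.core(...) in place of some_input_image; call .cuda() on it first only if the model itself lives on the GPU.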

Unfortunately, you'd more or less need the Docker environment: it is easiest to use the Docker image and to download the data. That said, other environments with Python <=3.9 (maybe <=3.8) should also work, as long as they have PyTorch and these requirements from the setup.py:

install_requires=[
        "neuralpredictors==0.3.0",
        "nnfabrik==0.2.1",
        "scikit-image>=0.19.1",
        "lipstick",
        "numpy>=1.22.0",
    ],

KonstantinWilleke avatar Jan 17 '25 18:01 KonstantinWilleke

Thanks for the quick response! I followed the code in the notebook you recommended, but I was unable to load the initial model:

model = get_model(model_fn=model_fn,
                  model_config=model_config,
                  dataloaders=dataloaders,
                  seed=42,)

I get the error:

    170         # This function throws if there's a driver initialization error, no GPUs
    171         # are found or any other error occurs
--> 172         torch._C._cuda_init()
    173         # Some of the queued calls may reentrantly call _lazy_init();
    174         # we need to just return without initializing in that case.

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx 

Do I need an NVIDIA GPU in order to run the model? Is there a switch I can throw to just run on the CPU?

dp4846 avatar Jan 18 '25 03:01 dp4846

Oh yes, by default all parts of the model and dataloaders are transferred to a GPU. But you can disable it by adding "cuda": False to the dataset_config.

So this would be your whole dataset_config:

dataset_config = {'paths': filenames,
                 'normalize': True,
                 'include_behavior': False,
                 'include_eye_position': False,
                 'batch_size': 128,
                 'cuda': False,
                 'scale':.25,
                 }

That should do the trick. Everything else will be on the CPU from that point on, and you can run model inference on the CPU.

In the model evaluation notebook, if you want to recreate the plots, you have to set all the device arguments to cpu instead of cuda.
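
Rather than editing every device argument by hand, one option is to pick the device once and reuse it. The following is only a sketch with a stand-in nn.Conv2d in place of the real model and an in-memory checkpoint in place of the .pth file; the variable names are illustrative, and the notebook's own functions may expect the device as a string rather than a torch.device.

```python
import io

import torch
from torch import nn

# Pick the GPU only when one is actually available; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the real model; any nn.Module behaves the same way here.
toy_model = nn.Conv2d(1, 4, kernel_size=3)

# Round-trip a checkpoint through memory instead of a file on disk.
buffer = io.BytesIO()
torch.save(toy_model.state_dict(), buffer)
buffer.seek(0)

# map_location remaps the stored tensors onto the chosen device while loading.
state = torch.load(buffer, map_location=device)
toy_model.load_state_dict(state)
toy_model.to(device).eval()

with torch.no_grad():
    out = toy_model(torch.ones(2, 1, 36, 64, device=device))
print(out.shape, out.device.type)
```

On a machine without an NVIDIA driver this runs entirely on the CPU, and the same script uses the GPU where one exists.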

KonstantinWilleke avatar Jan 19 '25 11:01 KonstantinWilleke

Thanks for all your help. I believe I am close now. I am working through https://github.com/sinzlab/sensorium/blob/main/notebooks/model_tutorial/2_model_evaluation_and_inspection.ipynb

Including 'cuda': False in model_config allowed me to run model = get_model(model_fn=model_fn, model_config=model_config, dataloaders=dataloaders, seed=42) without error.

But when I tried to load the pre-trained weight:

model.load_state_dict(torch.load("./model_checkpoints/pretrained/generalization_model.pth", map_location=torch.device('cpu')));

I get an error about missing keys when loading the state_dict:

RuntimeError                              Traceback (most recent call last)
<ipython-input-5-21f9d8165062> in <module>
----> 1 model.load_state_dict(torch.load("./model_checkpoints/pretrained/generalization_model.pth", map_location=torch.device('cpu')));
      2 model.eval();
      3
      4 # to get the embedded features in response to some images
      5 some_input_image = torch.ones(10, 1, 36, 64).cuda()

/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   1049
   1050         if len(error_msgs) > 0:
-> 1051             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   1052                                self.__class__.__name__, "\n\t".join(error_msgs)))
   1053         return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for FiringRateEncoder:
    Missing key(s) in state_dict: "core.features.layer1.ds_conv.in_depth_conv.bias", "core.features.layer1.ds_conv.spatial_conv.bias", "core.features.layer1.ds_conv.out_depth_conv.bias", "core.features.layer2.ds_conv.in_depth_conv.bias", "core.features.layer2.ds_conv.spatial_conv.bias", "core.features.layer2.ds_conv.out_depth_conv.bias", "core.features.layer3.ds_conv.in_depth_conv.bias", "core.features.layer3.ds_conv.spatial_conv.bias", "core.features.layer3.ds_conv.out_depth_conv.bias".
    Unexpected key(s) in state_dict: "readout.23964-4-22.sigma", "readout.23964-4-22._features", "readout.23964-4-22.bias", "readout.23964-4-22.source_grid", "readout.23964-4-22.mu_transform.0.weight", "readout.23964-4-22.mu_transform.0.bias", "readout.23964-4-22.mu_transform.2.weight", "readout.23964-4-22.mu_transform.2.bias", "readout.22846-10-16.sigma", "readout.22846-10-16._features", "readout.22846-10-16.bias", "readout.22846-10-16.source_grid", "readout.22846-10-16.mu_transform.0.weight", "readout.22846-10-16.mu_transform.0.bias", "readout.22846-10-16.mu_transform.2.weight", "readout.22846-10-16.mu_transform.2.bias", "readout.26872-17-20.sigma", "readout.26872-17-20._features", "readout.26872-17-20.bias", "readout.26872-17-20.source_grid", "readout.26872-17-20.mu_transform.0.weight", "readout.26872-17-20.mu_transform.0.bias", "readout.26872-17-20.mu_transform.2.weight", "readout.26872-17-20.mu_transform.2.bias", "readout.23343-5-17.sigma", "readout.23343-5-17._features", "readout.23343-5-17.bias", "readout.23343-5-17.source_grid", "readout.23343-5-17.mu_transform.0.weight", "readout.23343-5-17.mu_transform.0.bias", "readout.23343-5-17.mu_transform.2.weight", "readout.23343-5-17.mu_transform.2.bias", "readout.27204-5-13.sigma", "readout.27204-5-13._features", "readout.27204-5-13.bias", "readout.27204-5-13.source_grid", "readout.27204-5-13.mu_transform.0.weight", "readout.27204-5-13.mu_transform.0.bias", "readout.27204-5-13.mu_transform.2.weight", "readout.27204-5-13.mu_transform.2.bias", "readout.23656-14-22.sigma", "readout.23656-14-22._features", "readout.23656-14-22.bias", "readout.23656-14-22.source_grid", "readout.23656-14-22.mu_transform.0.weight", "readout.23656-14-22.mu_transform.0.bias", "readout.23656-14-22.mu_transform.2.weight", "readout.23656-14-22.mu_transform.2.bias".

Should I be loading different model weights?

If it is helpful, here is the model description that got printed:

FiringRateEncoder(
  (core): Stacked2dCore(
    (_input_weights_regularizer): LaplaceL2norm(
      (laplace): Laplace()
    )
    (features): Sequential(
      (layer0): Sequential(
        (conv): Conv2d(1, 64, kernel_size=(9, 9), stride=(1, 1), bias=False)
        (norm): BatchNorm2d(64, eps=1e-05, momentum=0.9, affine=True, track_running_stats=True)
        (nonlin): AdaptiveELU()
      )
      (layer1): Sequential(
        (ds_conv): DepthSeparableConv2d(
          (in_depth_conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
          (spatial_conv): Conv2d(64, 64, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=64)
          (out_depth_conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
        )
        (norm): BatchNorm2d(64, eps=1e-05, momentum=0.9, affine=True, track_running_stats=True)
        (nonlin): AdaptiveELU()
      )
      (layer2): Sequential(
        (ds_conv): DepthSeparableConv2d(
          (in_depth_conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
          (spatial_conv): Conv2d(64, 64, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=64)
          (out_depth_conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
        )
        (norm): BatchNorm2d(64, eps=1e-05, momentum=0.9, affine=True, track_running_stats=True)
        (nonlin): AdaptiveELU()
      )
      (layer3): Sequential(
        (ds_conv): DepthSeparableConv2d(
          (in_depth_conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
          (spatial_conv): Conv2d(64, 64, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=64)
          (out_depth_conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
        )
        (norm): BatchNorm2d(64, eps=1e-05, momentum=0.9, affine=True, track_running_stats=True)
        (nonlin): AdaptiveELU()
      )
    )
  ) [Stacked2dCore regularizers: gamma_hidden = 0|gamma_input = 6.3831|skip = 0]
  (readout): MultipleFullGaussian2d(
    (21067-10-18): full FullGaussian2d (64 x 28 x 56 -> 8372) with bias, with predicted grid -> Sequential(
      (0): Linear(in_features=2, out_features=30, bias=True)
      (1): ELU(alpha=1.0)
      (2): Linear(in_features=30, out_features=2, bias=True)
      (3): Tanh()
    )
  )
)

dp4846 avatar Jan 19 '25 15:01 dp4846

I'm sorry for all of these issues! Your dataloader is missing some data_keys. Could it be that you only added one dataset to the dataset config, and not all of them?

There have also been a few very subtle changes to the core architecture, so the saved weights no longer match it exactly, but the impact of this should be negligible.
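
One way to tell these two causes apart is to compare the key names directly. The sketch below uses plain string lists standing in for model.state_dict().keys() and the checkpoint's keys; the helper names diff_state_dict_keys and readout_sessions are made up for illustration, and the toy keys only imitate the ones in the error message.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) the way a strict load would report them."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)     # model expects, checkpoint lacks
    unexpected = sorted(checkpoint_keys - model_keys)  # checkpoint has, model lacks
    return missing, unexpected


def readout_sessions(keys):
    """Collect session names from keys shaped like 'readout.<session>.<param>'."""
    return sorted({k.split(".")[1] for k in keys if k.startswith("readout.")})


# Toy key lists; real ones would come from model.state_dict().keys()
# and from the keys of the loaded checkpoint.
model_keys = [
    "core.features.layer1.ds_conv.spatial_conv.weight",
    "core.features.layer1.ds_conv.spatial_conv.bias",
    "readout.21067-10-18.bias",
]
checkpoint_keys = [
    "core.features.layer1.ds_conv.spatial_conv.weight",
    "readout.21067-10-18.bias",
    "readout.23964-4-22.bias",
    "readout.22846-10-16.bias",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print(missing)                       # ['core.features.layer1.ds_conv.spatial_conv.bias']
print(readout_sessions(unexpected))  # ['22846-10-16', '23964-4-22']
```

Missing core.* keys point at the architecture change, while unexpected readout.* keys name the sessions that are in the checkpoint but not in the dataloaders.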

You can solve both issues, though, simply by running:

model.load_state_dict(torch.load("./model_checkpoints/pretrained/generalization_model.pth", 
                                 map_location=torch.device('cpu'),
                                 strict=False,
                                )
                     );

Hope it'll work now! You should then also be able to compute the accuracy metrics from that notebook. Let me know if the values that you are getting for one or more datasets are comparable.

KonstantinWilleke avatar Jan 19 '25 19:01 KonstantinWilleke

No worries at all! Thanks for helping me work through them!

Woohoo! It works!

I did have to change your recommended code a little: strict=False needed to be an argument to the load_state_dict function (instead of an argument to torch.load):

model.load_state_dict(torch.load("./model_checkpoints/pretrained/generalization_model.pth", map_location=torch.device('cpu'),), strict=False,);

Then I needed to change the data_key to the specific data_key of the dataset I had downloaded. I found this from print(model.readout).
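
For anyone who wants to see the effect of strict=False in isolation, here is a small self-contained sketch. The Toy module and the extra readout_other.weight entry are invented stand-ins, not the sensorium model; only load_state_dict and its strict argument are the real PyTorch API.

```python
import torch
from torch import nn


class Toy(nn.Module):
    """Stand-in with a 'core' and one 'readout', loosely mimicking the layout above."""

    def __init__(self, bias: bool):
        super().__init__()
        self.core = nn.Conv2d(1, 2, kernel_size=3, bias=bias)
        self.readout = nn.Linear(2, 3)


# "Checkpoint" from a variant without a conv bias, plus one extra entry.
checkpoint = Toy(bias=False).state_dict()
checkpoint["readout_other.weight"] = torch.zeros(3, 2)

model = Toy(bias=True)

# strict=False belongs to load_state_dict, not torch.load; mismatched keys are
# skipped and reported instead of raising a RuntimeError.
result = model.load_state_dict(checkpoint, strict=False)
print(result.missing_keys)     # ['core.bias']
print(result.unexpected_keys)  # ['readout_other.weight']
```

Parameters listed under missing_keys keep their freshly initialized values, so it is worth checking that list before trusting the loaded model.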

You were correct, I had only loaded one dataset. Does that matter for using the pre-trained SOTA model? It does seem odd to require a dataset to run a model ...

Otherwise, thanks for all the help! I am happy to summarize what I needed to do to load and run the SOTA model without a GPU and with only a single dataset, if that would be helpful.

dp4846 avatar Jan 19 '25 20:01 dp4846