AttributeError: 'MegatronGPTModel' object has no attribute 'decoder'
Description
I am retraining a LLaMA3 model. Because my dataset is small, I tried the freeze_updates option referenced in the NVIDIA NeMo documentation. My configuration is as follows:
    freeze_updates:
      enabled: true  # set to false if you want to disable freezing
      modules:       # list all of the modules you want to have freezing logic for
        decoder: 100
However, I encountered the following error:
AttributeError: 'MegatronGPTModel' object has no attribute 'decoder'
I also tried changing decoder to encoder or joint, but both still raised errors. How should this option be configured correctly?
Additionally, does the NeMo framework support freezing specific sublayers, such as only the attention layers? If so, how can I achieve this? Thanks!
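If freeze_updates does not reach the granularity you need, a common fallback is to set requires_grad = False on parameters selected by name. A minimal PyTorch sketch, using a toy ModuleDict as a stand-in for the real model (the module names "attention" and "mlp" here are hypothetical, not NeMo's actual names):

```python
import torch.nn as nn

# Toy stand-in for the transformer stack; with NeMo you would use the
# loaded MegatronGPTModel instance instead.
model = nn.ModuleDict({
    "attention": nn.Linear(8, 8),
    "mlp": nn.Linear(8, 8),
})

# Freeze only the attention parameters by name, leaving the MLP trainable.
for name, param in model.named_parameters():
    if name.startswith("attention"):
        param.requires_grad = False

frozen = [n for n, p in model.named_parameters() if not p.requires_grad]
```

The optimizer will then skip the frozen parameters as long as you pass it only parameters with requires_grad set, e.g. filter(lambda p: p.requires_grad, model.parameters()).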
I'm curious if by retraining you might mean continued training? https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/allmodels/continuetraining.html?highlight=continued%2520training#configure-continual-learning
Regarding your error, I believe the decoder module will be MegatronGPTModel.model.decoder, but you can confirm by inspecting the MegatronGPTModel instance.
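One way to inspect it is to walk named_modules() and list every submodule path, then use the matching path in the freeze_updates config. A minimal sketch with a stand-in nn.Sequential (with NeMo you would iterate the loaded MegatronGPTModel instead):

```python
import torch.nn as nn

# Stand-in model; replace with the actual MegatronGPTModel instance.
model = nn.Sequential(nn.Linear(4, 4), nn.ReLU())

# named_modules() yields (path, module) pairs; the root has an empty path,
# so we skip it and keep only addressable submodule names.
names = [name for name, _ in model.named_modules() if name]
```

Printing names (or type(module).__name__ alongside each path) shows exactly which keys, such as model.decoder, exist on your checkpoint.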
This issue was closed because it has been inactive for 7 days since being marked as stale.