AttributeError: 'MegatronGPTModel' object has no attribute 'decoder'
Description
I am retraining a LLaMA3 model. Because my dataset is small, I tried the freeze_updates option referenced in the NVIDIA NeMo documentation. My configuration is as follows:
    freeze_updates:
      enabled: true  # set to false if you want to disable freezing
      modules:       # list all of the modules you want to have freezing logic for
        decoder: 100
However, I encountered the following error:
AttributeError: 'MegatronGPTModel' object has no attribute 'decoder'
I also tried changing decoder to encoder or joint, but both still raised errors. How should this option be configured correctly?
Additionally, does the NeMo framework support freezing specific sublayers, such as only the attention layers? If so, how can I achieve this? Thanks!
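If freeze_updates does not reach the granularity you need, a common fallback is to set requires_grad = False on parameters selected by name. A minimal PyTorch sketch, using a toy ModuleDict as a stand-in for the real model (the module names "attention" and "mlp" here are hypothetical, not NeMo's actual names):

```python
import torch.nn as nn

# Toy stand-in for the transformer stack; with NeMo you would use the
# loaded MegatronGPTModel instance instead.
model = nn.ModuleDict({
    "attention": nn.Linear(8, 8),
    "mlp": nn.Linear(8, 8),
})

# Freeze only the attention parameters by name, leaving the MLP trainable.
for name, param in model.named_parameters():
    if name.startswith("attention"):
        param.requires_grad = False

frozen = [n for n, p in model.named_parameters() if not p.requires_grad]
```

The optimizer will then skip the frozen parameters as long as you pass it only parameters with requires_grad set, e.g. filter(lambda p: p.requires_grad, model.parameters()).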
I'm curious if by retraining you might mean continued training? https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/allmodels/continuetraining.html?highlight=continued%2520training#configure-continual-learning
Regarding your error, I believe the decoder module will be MegatronGPTModel.model.decoder, but you can confirm by inspecting the MegatronGPTModel instance.
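One way to inspect it is to walk named_modules() and list every submodule path, then use the matching path in the freeze_updates config. A minimal sketch with a stand-in nn.Sequential (with NeMo you would iterate the loaded MegatronGPTModel instead):

```python
import torch.nn as nn

# Stand-in model; replace with the actual MegatronGPTModel instance.
model = nn.Sequential(nn.Linear(4, 4), nn.ReLU())

# named_modules() yields (path, module) pairs; the root has an empty path,
# so we skip it and keep only addressable submodule names.
names = [name for name, _ in model.named_modules() if name]
```

Printing names (or type(module).__name__ alongside each path) shows exactly which keys, such as model.decoder, exist on your checkpoint.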
This issue was closed because it has been inactive for 7 days since being marked as stale.