AttributeError: 'MegatronGPTModel' object has no attribute 'decoder'

Open lianghsun opened this issue 1 year ago • 1 comments

Description

I am retraining a LLaMA3 model. Due to the limited size of my dataset, I attempted to use freeze_updates as referenced in the NVIDIA NeMo documentation. My configuration is as follows:

freeze_updates:
  enabled: true  # set to false if you want to disable freezing
  modules:   # list all of the modules you want to have freezing logic for
    decoder: 100

However, I encountered the following error:

AttributeError: 'MegatronGPTModel' object has no attribute 'decoder'

I also tried changing decoder to encoder or joint, but I still got errors. How should I configure this setting correctly?

Additionally, within the NeMo framework, is it possible to freeze specific layers, such as only the attention layer? If so, how can I achieve this? Thanks!

lianghsun avatar Aug 03 '24 01:08 lianghsun

I'm curious if by retraining you might mean continued training? https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/allmodels/continuetraining.html?highlight=continued%2520training#configure-continual-learning

Regarding your error, I believe the decoder module is at MegatronGPTModel.model.decoder, but you can check by inspecting the MegatronGPTModel instance.

ericharper avatar Aug 23 '24 22:08 ericharper
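
A minimal sketch of the inspection suggested above, in plain Python so it runs without NeMo installed. The `Stub` class and the `resolve` and `has_path` helpers are hypothetical stand-ins, not NeMo or PyTorch APIs, and the nesting (`wrapper.model.decoder`) only mirrors the guess in the reply; it is not confirmed against a real MegatronGPTModel. It shows why looking up `decoder` directly on the wrapper fails while a dotted path can succeed.

```python
from functools import reduce


class Stub:
    """Hypothetical stand-in for a module; not a NeMo or PyTorch class."""

    def __init__(self, **children):
        for name, child in children.items():
            setattr(self, name, child)


def resolve(root, dotted_path):
    """Follow a dotted attribute path such as 'model.decoder' from root."""
    return reduce(getattr, dotted_path.split("."), root)


def has_path(root, dotted_path):
    """Return True if every attribute along dotted_path exists."""
    try:
        resolve(root, dotted_path)
    except AttributeError:
        return False
    return True


# Assumed layout: the wrapper holds the decoder under `.model`,
# not directly on itself.
decoder = Stub()
wrapper = Stub(model=Stub(decoder=decoder))

print(has_path(wrapper, "decoder"))        # False
print(has_path(wrapper, "model.decoder"))  # True
```

On a real model, which is a `torch.nn.Module`, iterating over `named_modules()` lists the actual dotted paths, so the same check can be made against the loaded checkpoint. Whether `freeze_updates` accepts a dotted path such as `model.decoder` is not confirmed in this thread.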

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Sep 23 '24 01:09 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Sep 30 '24 02:09 github-actions[bot]