ModelHparams isn't as general an interface as it implies
@dataclass
class ModelHparams(hp.Hparams, ABC):

    initializers: List[Initializer] = hp.optional(
        default_factory=lambda: [],
        doc="The initialization strategy for the model",
    )
    num_classes: Optional[int] = hp.optional(
        doc="The number of classes. Needed for classification tasks",
        default=None,
    )

    @abstractmethod
    def initialize_object(self) -> ComposerModel:
        pass
num_classes is (mostly) classification-specific, and using initializers with torchvision or other integrated models requires wrapping them in constructor functions, as Landan did for DeepLab. Right now many of our models subclass ModelHparams and either override or simply don't use the built-in properties.
Perhaps ModelHparams should instead be:
@dataclass
class ModelHparams(hp.Hparams, ABC):

    @abstractmethod
    def initialize_object(self) -> ComposerModel:
        pass
And we could have task-specific interfaces to go with it, for example:
@dataclass
class ClassifierHparams(ModelHparams):

    num_classes: Optional[int] = hp.optional(
        doc="The number of classes. Needed for classification tasks",
        default=None,
    )

    @abstractmethod
    def initialize_object(self) -> ComposerModel:
        pass
@ravi-mosaicml, @Landanjs
I agree it makes sense to remove num_classes from the base class. I would like to keep the initializers (for #472), but otherwise sounds good.
I'm in favor of removing num_classes and initializers, but would like to discuss further whether to keep initializers in the base ModelHparams.
Additionally, removing optional arguments from the base class allows us to use required arguments 🎉. To me, it would be great if task-specific interfaces like ClassifierHparams had only required hparams such as num_classes. Then, subclasses can be adjusted to use required fields where desired.
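A minimal sketch of what that could look like, assuming yahp exposes an hp.required field helper:

@dataclass
class ClassifierHparams(ModelHparams):

    # num_classes becomes required rather than Optional with a None default.
    num_classes: int = hp.required(doc="The number of classes for the classification task")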
Do you need help with this @A-Jacobson?
Re: initializers, would it make sense to allow the user to pass the path to a Python function (that can be dynamically imported), in addition to the enum values? This would more easily allow for custom initialization when using yahp.
I can see defining a model-level weight initialization scheme as a single callable, but what do you mean by "in addition to the enum values"? I'm having a hard time imagining what this interface would look like.
Currently, via the hparams path, you can provide an Initializer enum value to describe the initialization scheme. The hparams field would be a string -- it would first attempt to match against an enum value, and if that fails, it would attempt to interpret the string as an import path to a Python function.
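A rough sketch of that resolution logic (the helper name and the dotted-path format are assumptions for illustration, not an existing Composer/yahp API):

import importlib
from typing import Callable, Union

def resolve_initializer(value: str) -> Union[Initializer, Callable]:
    """Hypothetical helper: try the Initializer enum first, then fall back to
    importing a dotted path such as "my_project.init.xavier_all"."""
    try:
        return Initializer(value)  # matches an enum value, e.g. "kaiming_normal"
    except ValueError:
        module_name, _, attr = value.rpartition(".")  # split "my_project.init.xavier_all"
        module = importlib.import_module(module_name)
        return getattr(module, attr)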
Maybe? Perhaps with some kind of initializer registry? I'm still skeptical that these would be shared across models, but it may be good to cover the case where someone explicitly writes multiple initialization schemes for a specific model and wants to test the differences between them.
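For illustration, such a registry could be as simple as a name-to-callable mapping with a decorator to register model-specific schemes (purely hypothetical, not an existing Composer feature):

from typing import Callable, Dict

import torch

# Hypothetical registry mapping hparams strings to initialization functions.
INITIALIZER_REGISTRY: Dict[str, Callable[[torch.nn.Module], None]] = {}

def register_initializer(name: str):
    """Register a weight initialization scheme under a name usable from hparams."""
    def wrapper(fn: Callable[[torch.nn.Module], None]):
        INITIALIZER_REGISTRY[name] = fn
        return fn
    return wrapper

@register_initializer("deeplab_kaiming")
def deeplab_kaiming(module: torch.nn.Module) -> None:
    # Example model-specific scheme applied via module.apply(...).
    if isinstance(module, torch.nn.Conv2d):
        torch.nn.init.kaiming_normal_(module.weight)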
People would still have to wire that argument into the constructor of their ComposerModel and/or PyTorch model, though, which isn't very common externally.
Since there's still design needed for the initializers (see #472), I propose we move num_classes out of the base class, and then defer the initializers change until v0.5 to align with #472.
YAHP is gone