Add support for automated factory generation from a descriptive model

Open rbarrois opened this issue 5 years ago • 4 comments

This issue will be used as the discussion basis for the automated factory generation feature.

The problem

Many libraries (ORMs, API schema languages, dataclasses.dataclass) provide a way to describe the fields of a class and their types. In those cases, it is cumbersome to have to add all the declarations manually; it would be great if factory_boy could provide a set of default declarations by introspecting the class.

The typical example would be:

@dataclasses.dataclass
class User:
    id: int
    username: str
    fullname: str
    is_admin: bool

class UserFactory(factory.DataclassFactory):
    class Meta:
        model = User
        auto_declarations = True

>>> UserFactory()
User(id=42, username="john.doe", fullname="Jane Smith", is_admin=False)
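
As a rough illustration of what such introspection involves (a stdlib-only sketch, not the proposed factory_boy implementation), the snippet below derives one default generator per field from the dataclass type annotations. The names DEFAULT_GENERATORS, auto_declarations and build are hypothetical, and the placeholder values stand in for what a Faker-backed declaration would produce.

```python
import dataclasses
import itertools
import typing

# Hypothetical type -> generator mapping; not part of factory_boy.
_sequence = itertools.count(1)
DEFAULT_GENERATORS = {
    int: lambda: next(_sequence),
    str: lambda: "text",
    bool: lambda: False,
    float: lambda: 0.0,
}


def auto_declarations(model):
    """Return {field_name: zero-argument callable} for fields of a known type."""
    hints = typing.get_type_hints(model)
    declarations = {}
    for field in dataclasses.fields(model):
        generator = DEFAULT_GENERATORS.get(hints[field.name])
        if generator is not None:
            declarations[field.name] = generator
    return declarations


def build(model, **overrides):
    """Instantiate the model from generated values; keyword overrides win."""
    values = {name: make() for name, make in auto_declarations(model).items()}
    values.update(overrides)
    return model(**values)


@dataclasses.dataclass
class User:
    id: int
    username: str
    fullname: str
    is_admin: bool


user = build(User, username="john.doe")
```

Fields whose type is not in the mapping are simply skipped here, so the caller would have to supply them explicitly.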

Existing work

  • A branch was started 5 years ago: https://github.com/FactoryBoy/factory_boy/commit/4046c55710d5d7073018dcc76aa3e8e5a7f803eb
  • A pull request has been restarted recently: #820
  • A simple hack was written in #330
  • A discussion on that topic occurred in #347
  • Usage with marshmallow is covered in #277

Design constraints

Developer experience

  • The provided API must be explicit: reading the code, one must know that some declarations have been automatically generated;
  • Any explicit declaration on the factory must take precedence over the automated generation;
  • It must be possible to restrict the fields covered by the automated generation (either include only a subset, or exclude some fields), even if no explicit declaration exists (for instance to reuse a model-side default);
  • It must be possible to use this feature with make_factory;
  • Ideally, a bridge could be made with factory.Faker to use the field name as a hint (e.g. calling factory.Faker("user_name") for a field called username).
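
The precedence and restriction rules above could be expressed roughly as follows. This is a minimal sketch under assumptions: select_declarations and its include / exclude parameters are hypothetical names, and plain values stand in for real declarations.

```python
import dataclasses


def select_declarations(model, generated, explicit=None, include=None, exclude=()):
    """Merge generated and explicit declarations for a dataclass model.

    - include: if given, only these fields receive a generated declaration;
    - exclude: these fields never receive a generated declaration;
    - explicit: always applied last, so it overrides anything generated.
    """
    names = [field.name for field in dataclasses.fields(model)]
    if include is not None:
        names = [name for name in names if name in include]
    names = [name for name in names if name not in exclude]
    merged = {name: generated[name] for name in names if name in generated}
    merged.update(explicit or {})  # explicit declarations always win
    return merged


@dataclasses.dataclass
class Account:
    id: int
    email: str
    plan: str = "free"  # model-side default we want to keep


generated = {"id": 1, "email": "auto@example.com", "plan": "auto"}
declarations = select_declarations(
    Account,
    generated,
    explicit={"email": "me@example.com"},
    exclude={"plan"},
)
account = Account(**declarations)
```

Because plan is excluded and has no explicit declaration, the model-side default is used, while the explicit email replaces the generated one.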

Library integration

  • It should be easy to connect this feature with third party libraries in a project's code — either through an abstract Factory subclass, or through a custom FactoryOptions;
  • Introspection can be added on top of an existing abstract factory bridge to a third party library (e.g. project A provides DjangoModelFactory; project B should be able to leverage it into AutoDjangoModelFactory).
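
One possible shape for this layering, sketched with plain classes rather than real factory_boy internals: BridgeFactory is a stand-in for a third-party bridge, and AutoMixin, introspect and all_declarations are hypothetical names chosen for illustration.

```python
import dataclasses
import typing


class BridgeFactory:
    """Stand-in for an existing third-party bridge (hypothetical)."""

    model = None
    declarations = {}

    @classmethod
    def all_declarations(cls):
        return dict(cls.declarations)

    @classmethod
    def build(cls, **overrides):
        values = {name: make() for name, make in cls.all_declarations().items()}
        values.update(overrides)
        return cls.model(**values)


class AutoMixin:
    """Hypothetical mixin: introspected defaults sit beneath explicit ones."""

    type_defaults = {int: lambda: 0, str: lambda: "", bool: lambda: False}

    @classmethod
    def introspect(cls):
        # Hook to override for other libraries (ORM columns, schema fields...).
        return typing.get_type_hints(cls.model)

    @classmethod
    def all_declarations(cls):
        auto = {
            name: cls.type_defaults[tp]
            for name, tp in cls.introspect().items()
            if tp in cls.type_defaults
        }
        auto.update(super().all_declarations())  # explicit declarations win
        return auto


class AutoBridgeFactory(AutoMixin, BridgeFactory):
    """What 'project B' could provide on top of 'project A'."""


@dataclasses.dataclass
class Article:
    id: int
    title: str
    published: bool


class ArticleFactory(AutoBridgeFactory):
    model = Article
    declarations = {"title": lambda: "Hello"}


article = ArticleFactory.build(id=7)
```

The bridge class is left untouched; only the mixin knows about introspection, which is the property the second bullet asks for.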

Open questions

  • Should we integrate it directly into DjangoModelFactory / SQLAlchemyModelFactory, or provide it as extra classes?
  • How should "foreign keys" be handled? Should their factories be autogenerated, or do we require an explicit declaration there?
  • How would a developer enrich the field name / faker generator mapping with their own custom providers and field naming conventions?
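
For the last question, one option would be a small registry that maps field-name patterns to Faker provider names, with a type-based fallback. The sketch below is only an assumption about how that could look: ProviderHints, register and lookup are hypothetical names, and the returned strings would be what gets passed to factory.Faker(...).

```python
import re

# Fallback provider names by type, used when no name pattern matches.
TYPE_FALLBACKS = {int: "pyint", str: "pystr", bool: "pybool"}


class ProviderHints:
    """Hypothetical registry mapping field-name patterns to provider names."""

    def __init__(self):
        self._rules = []  # (compiled regex, provider), most recent first

    def register(self, pattern, provider):
        # Later registrations take priority, so projects can override defaults.
        self._rules.insert(0, (re.compile(pattern), provider))

    def lookup(self, field_name, field_type=None):
        for regex, provider in self._rules:
            if regex.fullmatch(field_name):
                return provider
        return TYPE_FALLBACKS.get(field_type)


hints = ProviderHints()
hints.register(r"username", "user_name")
hints.register(r"fullname", "name")
hints.register(r"email|.*_email", "email")

# A project-specific convention: "sku" is a hypothetical custom provider name.
hints.register(r".*_sku", "sku")
```

With this, hints.lookup("username", str) yields "user_name", while a field such as is_admin falls back to the type-based provider.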

rbarrois, Jan 26 '21 11:01

Oh, this feature would be awesome!

If I can add my two cents regarding foreign keys: my team and I are used to manually creating generators for our models (which we want to stop doing XD). What has worked well for us is to only generate the required fields (that applies to FKs too). That means that every optional/nullable field is set to None by default, while required foreign objects are created (until there are no more foreign models). All of that can be overridden at the call site. So, for example, if a foreign object is part of a test of mine and I want to use it in a factory, I can simply pass it in as an argument.
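
A minimal sketch of that "required fields only" strategy, using dataclasses as a stand-in for ORM models. The helper names (build_required, is_optional, SCALARS) are hypothetical, and nullable fields are assumed to be annotated with typing.Optional.

```python
import dataclasses
import typing

# Placeholder values for required scalar fields; unknown types raise KeyError.
SCALARS = {int: 0, str: "", bool: False}


def is_optional(tp):
    """True for typing.Optional[X], i.e. Union[X, None]."""
    return typing.get_origin(tp) is typing.Union and type(None) in typing.get_args(tp)


def build_required(model, **overrides):
    hints = typing.get_type_hints(model)
    values = {}
    for field in dataclasses.fields(model):
        tp = hints[field.name]
        if field.name in overrides:
            values[field.name] = overrides[field.name]  # call-site override
        elif is_optional(tp):
            values[field.name] = None  # nullable: left empty
        elif dataclasses.is_dataclass(tp):
            values[field.name] = build_required(tp)  # required foreign object
        else:
            values[field.name] = SCALARS[tp]
    return model(**values)


@dataclasses.dataclass
class Author:
    id: int
    name: str


@dataclasses.dataclass
class Book:
    id: int
    title: str
    author: Author
    editor: typing.Optional[Author]


book = build_required(Book)
custom_author = Author(id=5, name="Ann")
book_with_author = build_required(Book, author=custom_author)
```

Here book.author is created recursively, book.editor stays None, and passing author=... at the call site reuses an existing object instead of creating one.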

About the integration with the ORM factories, I would make it integrated, possibly with a Meta attribute to disable that.


Edit: the library below does something like this. It could be looked at as inspiration, an example, or previous experience in implementing this.

https://github.com/klen/mixer

ggabriel96, May 06 '21 16:05

Hi, I would like to share a very alpha and simple PoC to generate a factory for dataclasses.

https://gist.github.com/mgaitan/dcbe08bf44a5af696f2af752624ac11b

It respects defaults and supports builtin types, basic relationships, lists/tuples/sets, enums, and email as a particular case based on the attribute name.

mgaitan, May 21 '21 20:05

I'm brand new to Factory Boy, and I just want to share that as a Django user with dozens of models and hundreds of fields, I'm quite surprised that Factory Boy doesn't have a way of inspecting a model/dataclass/etc. and generating reasonable fakers for each field.

I've been reading docs for an hour or two, and it's only after I began working with the code that I realized I'm going to have to define Faker attributes for dozens of fields. I am pretty sad. Definitely not the turnkey kind of thing I was hoping for. Ah well.

Onwards with my explorations, but kinda bummed.

mlissner, Mar 08 '22 01:03

Don't have too much experience with it myself yet, but at first glance the Pydantic ecosystem seems to be good for this kind of stuff. Might need a little up front investment. Check out https://github.com/Goldziher/pydantic-factories

schlich, Mar 08 '22 01:03