`StopIteration` error when running `factory.build_batch` sequentially for multiple factories involving `factory.Iterator`
Description
When bootstrapping an SQLite database for unit testing, generating data fixtures requires running multiple factories in series to produce the necessary data. To facilitate this, I map the factories over a single function that builds and commits each batch.
When I call the model factories one by one using separate session instances, the factories produce the necessary data. Otherwise, a StopIteration error appears whenever a factory.Iterator field is invoked to generate a value.
To Reproduce
Either from JupyterLab or from the __main__ section of a Python file, the following block is invoked.
from itertools import repeat

list(
    map(
        generate_rows_to_db,
        [UserFactory, TagFactory, TagCaseFactory],
        repeat(session_inst),
    )
)
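For reference, the call above pairs each factory with the same session and invokes the function once per factory, in list order. A minimal sketch with a stand-in function shows that order; `record_call` and the string placeholders are hypothetical and not part of the project:

```python
from itertools import repeat

calls = []


def record_call(factory_name, session):
    # Stand-in for generate_rows_to_db: only records the order of invocation.
    calls.append((factory_name, session))


# map() is lazy; list() forces it to run, one call at a time, left to right.
list(
    map(
        record_call,
        ["UserFactory", "TagFactory", "TagCaseFactory"],
        repeat("session"),
    )
)

print(calls)
```

Each call finishes before the next one starts, so the second factory only runs after the first batch has been handled.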
Model / Factory code
# Imports needed by the code in this section.
import factory
from sqlalchemy import create_engine, select
from sqlalchemy.orm import Session, scoped_session, sessionmaker


def generate_rows_to_db(model_factory, session_obj: Session, batch_size: int = 25):
    """
    This function generates data from a provided factory_boy SQLAFactory
    class and writes the results to a database via a provided
    session instance. This helps to automate the data generation process to
    use randomized data and store the results in an RDBMS solution.

    Args:
        model_factory: factory class used to build the rows.
        session_obj: SQLAlchemy session used to write the rows.
        batch_size: number of rows to build.

    Returns: None
    """
    model_name = model_factory._meta.model.__tablename__
    print(f"Session Obj URL for {model_name} is: {session_obj.bind.url}")
    rows = model_factory.build_batch(batch_size)
    session_obj.add_all(rows)
    try:
        session_obj.commit()
    except Exception as e:
        print(type(e), e, e.args)
        session_obj.rollback()

engine = create_engine(
    SQLALCHEMY_DATABASE_URL, connect_args={"check_same_thread": False}
)
session_inst = scoped_session(
    sessionmaker(autocommit=False, autoflush=False, bind=engine)
)

class BaseFactory(factory.alchemy.SQLAlchemyModelFactory):
    IsActive = factory.Faker("pybool")
    CreateDate = factory.Faker("date_time")
    CreateBy = factory.Faker("name")
    UpdateDate = factory.LazyAttribute(lambda o: get_random_later_date(o.CreateDate))
    UpdateBy = factory.LazyAttribute(lambda o: o.CreateBy)

class UserFactory(BaseFactory):
    class Meta:
        model = User
        sqlalchemy_session = session_inst
        # sqlalchemy_session_persistence = "commit"

    UserId = factory.Sequence(lambda n: n)
    NUID = factory.LazyAttribute(
        lambda o: f"{o.FirstName.lower()}." f"{o.LastName.lower()}@foo_bar.oolala"
    )
    FirstName = factory.Faker("first_name")
    LastName = factory.Faker("last_name")
    UserRole = factory.RelatedFactory(
        "tests.factories.UserRoleFactory",
        factory_related_name="User",
    )

class RoleFactory(BaseFactory):
    class Meta:
        model = Role
        sqlalchemy_session = session_inst
        # sqlalchemy_session_persistence = "commit"

    RoleId = factory.Sequence(lambda n: n)
    ShortName = factory.Faker("word")
    Code = factory.Faker("pystr", max_chars=4)
    Description = factory.Faker("text", max_nb_chars=200)

class UserRoleFactory(BaseFactory):
    class Meta:
        model = UserRole
        sqlalchemy_session = session_inst
        # sqlalchemy_session_persistence = "commit"

    UserRoleId = factory.Sequence(lambda n: n)
    Role = factory.SubFactory(RoleFactory)

    @factory.post_generation
    def default_package(self, create, _, **__):
        UserFactory(UserRole=self)

class TagFactory(BaseFactory):
    class Meta:
        model = Tag
        sqlalchemy_session = session_inst
        # sqlalchemy_session_persistence = "commit"

    TagId = factory.Sequence(lambda n: n)
    UserId = factory.Iterator(
        list(
            map(
                lambda x: getattr(x, "UserId"),
                Meta.sqlalchemy_session.execute(select(User.UserId)).all(),
            )
        )
    )
    Name = factory.Faker("word")
    Description = factory.Faker("text", max_nb_chars=200)

class TagCaseFactory(BaseFactory):
    class Meta:
        model = TagCase
        sqlalchemy_session = session_inst
        # sqlalchemy_session_persistence = "commit"

    TagCaseId = factory.Sequence(lambda n: n)
    TagId = factory.Iterator(
        list(
            map(
                lambda x: getattr(x, "TagId"),
                Meta.sqlalchemy_session.execute(select(Tag.TagId)).all(),
            )
        )
    )
    CaseId = factory.Faker("pyint", max_value=200)
The issue
After the first factory writes its data, the subsequent factories fail to write to the DB, often raising StopIteration. The error comes from a field declared with factory.Iterator whose options are pulled by running a SQLAlchemy query.
list(
    map(
        generate_rows_to_db,
        [UserFactory, TagFactory, TagCaseFactory],
        repeat(session_inst),
    )
)
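The error itself can be reproduced without a database. As a sketch, assuming that the default cycling of factory.Iterator behaves like `itertools.cycle`, an empty source sequence has nothing to cycle over, so the first request for a value raises StopIteration. The `first_value` helper is hypothetical:

```python
from itertools import cycle


def first_value(values):
    """Return the first item from an endlessly cycling iterator over values."""
    return next(cycle(values))


# A non-empty list cycles normally.
print(first_value([1, 2, 3]))

try:
    # An empty list, such as the result of a query against an empty table.
    first_value([])
    raised = False
except StopIteration:
    raised = True

print(raised)
```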
Notes
I know the iteration recipe shown in the documentation is heavily tilted towards the Django ORM (which I love dearly) but it'd be nice if we could see some more robust examples/recipes involving SQLA.
Addendum: I have attempted to run this with factory.create_batch(n) with Meta declaring the sqlalchemy_session and the sqlalchemy_session_persistence = 'commit' options, but the same errors persist.
I think a stack trace would help illustrate the issue, so that others can see what the call is doing.
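One way to capture a trace for the report, as a sketch: `traceback.format_exc()` from the standard library returns the full trace of the exception being handled as text. `failing_step` is a hypothetical stand-in for the factory call that fails:

```python
import traceback


def failing_step():
    # Hypothetical stand-in for the factory call that raises StopIteration.
    return next(iter([]))


try:
    failing_step()
except StopIteration:
    # Full trace as a string, suitable for pasting into the issue.
    trace_text = traceback.format_exc()
    print(trace_text)
```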
Iterator uses cycle=True by default, so it should repeat indefinitely rather than be exhausted. This looks like a bootstrapping issue: the list passed to factory.Iterator is built by a query in the class body, which runs once when the factory class is defined rather than when map() calls generate_rows_to_db. The Factory therefore loads its values from a still-empty table, has nothing to iterate over, and StopIteration is raised.
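The empty-table effect can be sketched with only the standard library's sqlite3 module. The `users` table, the `EagerIds` class, and the `fetch_ids` and `lazy_ids` functions are hypothetical stand-ins, not factory_boy API. A value computed in a class body is evaluated once, when the class is defined, so it keeps the empty result even after rows are inserted. A generator function runs its query only when the first value is requested:

```python
import sqlite3
from itertools import cycle

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY)")


def fetch_ids():
    # Read the ids that exist at the moment of the call.
    query = "SELECT user_id FROM users ORDER BY user_id"
    return [row[0] for row in conn.execute(query)]


class EagerIds:
    # Evaluated once, at class definition time: the table is still empty here.
    ids = fetch_ids()


def lazy_ids():
    # The generator body runs on the first next(), so the query sees current rows.
    yield from cycle(fetch_ids())


lazy = lazy_ids()  # no query has run yet

conn.executemany("INSERT INTO users (user_id) VALUES (?)", [(1,), (2,)])

print(EagerIds.ids)
print(next(lazy), next(lazy), next(lazy))
```

In factory terms, this suggests building the Iterator's source lazily, for example from a generator, rather than from a list built in the class body. Whether that fits depends on how the factories are imported and used in the project.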
Closing for inactivity.