The email index is not used by `get_by_email()` method
Describe the bug
An index is created on the email field:
https://github.com/fastapi-users/fastapi-users-db-sqlalchemy/blob/main/fastapi_users_db_sqlalchemy/init.py#L61
but the index can't be used by the get_by_email() method because the filtering uses a function on the SQL field.
To Reproduce
explain analyze
SELECT auth_user.hashed_password, auth_user.is_active, auth_user.is_superuser, auth_user.is_verified, auth_user.id, auth_user.email, auth_user.first_name, auth_user.last_name
FROM auth_user
WHERE lower(auth_user.email) = lower('[email protected]');
Seq Scan on auth_user (cost=0.00..10.90 rows=1 width=1245) (actual time=0.030..0.030 rows=0 loops=1)
Filter: (lower((email)::text) = '[email protected]'::text)
Sequential scan is used instead of index scan.
Expected behavior
explain analyze
SELECT auth_user.hashed_password, auth_user.is_active, auth_user.is_superuser, auth_user.is_verified, auth_user.id, auth_user.email, auth_user.first_name, auth_user.last_name
FROM auth_user
WHERE auth_user.email = lower('[email protected]');
Index Scan using ix_auth_user_email on auth_user (cost=0.14..2.36 rows=1 width=1245) (actual time=0.044..0.045 rows=0 loops=1)
Index Cond: ((email)::text = '[email protected]'::text)
There is several ways to fix the issue:
- use CIText extension but it's not standard in SQLAlchemy (https://pypi.org/project/sqlalchemy-citext/).
- create a functional index:
CREATE INDEX ix_auth_user ON auth_user (lower(email) text_pattern_ops); - store email in lower case (my favorite even though on rare occasions, an outdated server or program might not interpret the capitalization correctly).
I can provide a PR when we'll agree on a solution.
Configuration
- FastAPI Users version : 0.7
Thank you for raising this, @stephane! That's indeed a welcome optimization.
My preference would go for 2, i.e. create a functional index.
I prefer not to rely on custom extension, especially since CIText looks only designed for PostgreSQL.
As for storing the e-mail in lower-case, this is a debate we had some time ago and we purposefully chose to store the e-mails case-sensitive but query them case-insensitive.
If you agree with this, I would be happy to review a PR that creates a lowercased index for e-mails 🙂
Django offers the possibility to provide migrations in third-apps code but I didn't find a similar mechanism with Alembic.
With the new __table_args__, Alembic detects the missing index in my app but displays a warning:
UserWarning: autogenerate skipping functional index ix_user_email_lower; not supported by SQLAlchemy reflection
So I had to manually create the Alembic migration in my project and it's convenient for users of fastapi-users.
op.create_index("ix_user_email_lower", "user", [sa.text("lower(email)")])