Partial pipeline step stores plain passwords in the database
Making a new issue for @GergelyKalmar's #494, which was closed as stale despite describing a major security issue in the partial pipeline implementation and documentation. From that issue, which offers more details:
Expected behaviour: When using a partial pipeline step, sensitive data does not get stored in the database.
Actual behaviour: Raw plain passwords are stored in the database while a partial pipeline is waiting to be resumed.
What are the steps to reproduce this issue?
- Enable the social_core.pipeline.mail.mail_validation partial pipeline or use any other partial pipeline
- Use the email or username auth backend
- Submit a form with the user email (or username) and password to begin the social auth process but do not validate the user email yet
- Inspect the data field in the social_auth_partial table

Although the data is deleted once the validation is complete, there are circumstances where plain passwords may be accessible for a long period of time (e.g. if a backup is created in this time frame or if the user does not complete the validation for a while).
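To make the last reproduction step easier to repeat, here is a rough sketch of a checker that walks a JSON-decoded value and reports any key that looks like a password. The key names in `SENSITIVE_KEYS`, the helper name `find_sensitive_paths`, and the sample payload are all made up for illustration; the real layout of the data field may differ, which is why the function searches at any depth instead of assuming a structure.

```python
import json

# Hypothetical list of key names to flag; adjust to the form fields you use.
SENSITIVE_KEYS = {"password", "passwd", "secret"}


def find_sensitive_paths(value, path=""):
    """Return the paths of all keys in a nested structure whose name looks sensitive."""
    found = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else str(key)
            if str(key).lower() in SENSITIVE_KEYS:
                found.append(child_path)
            # Keep descending: sensitive keys may be nested further down.
            found.extend(find_sensitive_paths(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            found.extend(find_sensitive_paths(child, f"{path}[{index}]"))
    return found


# Made-up payload standing in for the contents of the data column.
raw = json.dumps(
    {"args": [], "kwargs": {"request": {"email": "user@example.com", "password": "hunter2"}}}
)
paths = find_sensitive_paths(json.loads(raw))
print(paths)
```

Running something like this over the rows of the table (or over a backup) would show whether any raw password is currently at rest.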
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
not stale
I think this at least needs to be documented.
Can this be reopened and can the stale bot be turned off? This is an ongoing security issue that is no longer visible in the backlog because of the bot.
Am I correct in reading that this is mainly an issue with the documentation and not with any specific backend implementation? The documentation suggests passing the password through the partial pipeline, which is a bad idea. Or am I missing a larger implementation issue?
What are some mitigation ideas? Possibly always hashing the data field in the database?
There is some more information in the original issue: https://github.com/python-social-auth/social-core/issues/494
It shows a workaround specifically for Django, but the general idea is to add a pipeline step that hashes the password before it reaches the partial pipeline. It is more than a documentation issue, though: ideally the default implementation should be changed so that it does not store raw passwords.
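For anyone who wants the general shape of that idea without Django, here is a minimal, framework-agnostic sketch using only the standard library. It is not the workaround from #494 and it does not use any social-core API: `hash_password`, `verify_password`, `scrub_form_data`, the `password_hash` key, the encoded string format, and the iteration count are all assumptions made for illustration. How you hook it in before the partial step depends on your own pipeline.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Derive a salted PBKDF2-HMAC-SHA256 hash and encode it as one string."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Ad hoc format for this sketch only: scheme$iterations$salt$digest
    return f"pbkdf2-sketch${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password, encoded):
    """Recompute the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, digest_hex = encoded.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)


def scrub_form_data(data):
    """Return a copy of the submitted data with the raw password replaced by a hash."""
    cleaned = dict(data)
    raw = cleaned.pop("password", None)
    if raw is not None:
        cleaned["password_hash"] = hash_password(raw)
    return cleaned


submitted = {"email": "user@example.com", "password": "hunter2"}
cleaned = scrub_form_data(submitted)
print(sorted(cleaned))
```

In a Django project you would probably use `django.contrib.auth.hashers.make_password` instead of a hand-rolled format, so that the stored value can be assigned to the user directly once validation completes.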