clean-and-green-philly

Task: Add quality check for initial vacant properties dataset

Open nlebovits opened this issue 1 year ago • 1 comments

Describe the task

Implement functionality to check that the initial dataset of vacant properties is no more than 5% smaller than the previous run or, if there was no previous run, no smaller than 30,000 records. If this condition is not met, send an alert via Slack and email. This will involve modifying the script to query the previous record count from the PostgreSQL database and compare it with the current record count. Changes will need to be made to ./data/src/script.py and ./data/src/classes/diff_report.py.
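A minimal sketch of the "query the previous record count" step, assuming the previous run's data sits in a table reachable through a standard DB-API connection (for PostgreSQL, something like a psycopg2 connection). The table name `vacant_properties_previous` and the function name are hypothetical, not taken from the repo, and treating an empty table as "no previous run" is also an assumption.

```python
from typing import Optional

# Hypothetical table name; replace with wherever the previous run is stored.
PREVIOUS_RUN_TABLE = "vacant_properties_previous"


def get_previous_record_count(connection) -> Optional[int]:
    """Return the row count from the previous run, or None if there is none.

    `connection` is any DB-API 2.0 connection. An empty table is treated as
    "no previous run" (an assumption). A missing table is not handled here
    and would surface as a driver-specific exception.
    """
    cursor = connection.cursor()
    try:
        # The table name is a module constant, not user input, so building
        # the statement with an f-string is safe in this sketch.
        cursor.execute(f"SELECT COUNT(*) FROM {PREVIOUS_RUN_TABLE}")
        row = cursor.fetchone()
    finally:
        cursor.close()

    count = row[0] if row else 0
    return count if count > 0 else None
```

In the real change this would probably live as a method on DiffReport, using whatever connection or engine that class already holds.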

Acceptance Criteria

  • [ ] Add a method to DiffReport class in data/src/classes/diff_report.py to get the previous record count from the PostgreSQL database.
  • [ ] Modify the main script in data/src/script.py to use the DiffReport method to get the previous record count.
  • [ ] Implement a comparison of the current and previous record counts in the main script.
  • [ ] If the current count is more than 5% smaller than the previous count, or if the initial count is smaller than 30,000 records, break and:
    • [ ] Send a Slack alert.
    • [ ] Optionally, send an email alert.
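
The comparison in the criteria above could be sketched roughly as follows. The names `check_record_count` and `RecordCountError` are hypothetical. Raising an exception is one way to "break" so that a caller can catch it and send the alerts. Integer arithmetic is used so the 5% boundary is not affected by floating-point rounding.

```python
from typing import Optional

MIN_INITIAL_RECORDS = 30_000  # floor when there is no previous run
MAX_SHRINK_PERCENT = 5        # allowed drop relative to the previous run


class RecordCountError(Exception):
    """Raised when the vacant properties dataset is unexpectedly small."""


def check_record_count(current_count: int, previous_count: Optional[int]) -> None:
    """Raise RecordCountError if the current dataset fails the size check."""
    if previous_count is None:
        # No previous run: fall back to the absolute minimum.
        if current_count < MIN_INITIAL_RECORDS:
            raise RecordCountError(
                f"Initial dataset has {current_count} records; "
                f"expected at least {MIN_INITIAL_RECORDS}."
            )
        return

    # Fail when current < previous * 0.95, written without floats.
    if current_count * 100 < previous_count * (100 - MAX_SHRINK_PERCENT):
        raise RecordCountError(
            f"Dataset shrank from {previous_count} to {current_count} records, "
            f"more than {MAX_SHRINK_PERCENT}% smaller than the previous run."
        )
```

A count exactly 5% smaller than the previous run passes, since the requirement is "no more than 5% smaller".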

Additional context

  • The Slack alert should use the Slack API and be configured to post to a specific channel.
  • The email alert should use an SMTP library (for example, Python's smtplib) and be sent to a configured email address.
  • Ensure proper error handling and logging for both Slack and email alerts.
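
A rough sketch of both alerts using only the standard library. It assumes a Slack bot token with permission to post to the target channel and an SMTP server that supports STARTTLS. The function names, parameters, and timeouts are hypothetical. The project may prefer an official Slack client library instead of calling the `chat.postMessage` endpoint directly.

```python
import json
import logging
import smtplib
import urllib.request
from email.message import EmailMessage
from typing import Optional

logger = logging.getLogger(__name__)

SLACK_POST_MESSAGE_URL = "https://slack.com/api/chat.postMessage"


def build_slack_request(token: str, channel: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a chat.postMessage request."""
    payload = json.dumps({"channel": channel, "text": text}).encode("utf-8")
    return urllib.request.Request(
        SLACK_POST_MESSAGE_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def send_slack_alert(token: str, channel: str, text: str) -> bool:
    """Post the alert to Slack; log and return False on any failure."""
    request = build_slack_request(token, channel, text)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError) as exc:  # network errors, malformed JSON
        logger.error("Slack alert failed: %s", exc)
        return False
    if not body.get("ok"):
        logger.error("Slack API rejected the alert: %s", body.get("error"))
        return False
    return True


def build_email(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Build a plain-text alert email."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_email_alert(
    message: EmailMessage,
    host: str,
    port: int = 587,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> bool:
    """Send the email over SMTP with STARTTLS; log and return False on failure."""
    try:
        with smtplib.SMTP(host, port, timeout=10) as server:
            server.starttls()
            if username and password:
                server.login(username, password)
            server.send_message(message)
    except (OSError, smtplib.SMTPException) as exc:
        logger.error("Email alert failed: %s", exc)
        return False
    return True
```

Both send functions return a boolean instead of raising, so a failed alert is logged without masking the original quality-check failure.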

nlebovits avatar Aug 02 '24 14:08 nlebovits

related to #848. Any quality failure should raise an exception and get caught and reported to Slack.

zigouras avatar Aug 04 '24 19:08 zigouras

This issue has been marked as stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Oct 01 '24 16:10 github-actions[bot]

effectively a duplicate of #848

nlebovits avatar Oct 03 '24 15:10 nlebovits