perf(export): export generates unnecessary files content
SUMMARY
Currently, model export is slower than it needs to be.
Model export is slow because each model also exports its related models, and a related model cannot check whether it has already been added to the final archive, so it is exported again. This generates unnecessary export files. For example, exporting a dashboard with 2 charts produces 5 files (a dashboard, two charts, a dataset, and a database), but "behind the scenes" 7 files are actually generated, because each chart re-exports the dataset and database; at the end, the highest-level export (the dashboard export) removes the duplicates.
To solve this problem, this PR proposes generating export file content lazily: not when the export is requested, but only when the content is written to a file. This is done by converting file_content into a function that generates the content when it is called.
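The idea can be illustrated with a minimal, self-contained sketch. It is not the actual Superset code: the function names (`export_chart`, `export_dashboard`, `build_archive`), the file paths, and the `calls` list used to count generated contents are all hypothetical and exist only to show the effect of deferring content generation.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical shape of an export item: a file name plus a function
# that produces the file content only when it is called.
ExportItem = Tuple[str, Callable[[], str]]

# Records which contents were actually generated (for illustration only).
calls: List[str] = []


def make_content(name: str) -> Callable[[], str]:
    """Return a callable that generates the content lazily."""
    def generate() -> str:
        calls.append(name)  # stands in for the expensive serialization step
        return f"# contents of {name}\n"
    return generate


def export_chart(chart_id: int) -> Iterator[ExportItem]:
    # A chart exports itself and its related dataset and database.
    yield f"charts/chart_{chart_id}.yaml", make_content(f"chart_{chart_id}")
    yield "datasets/dataset.yaml", make_content("dataset")
    yield "databases/database.yaml", make_content("database")


def export_dashboard(chart_ids: List[int]) -> Iterator[ExportItem]:
    yield "dashboards/dashboard.yaml", make_content("dashboard")
    for chart_id in chart_ids:
        yield from export_chart(chart_id)


def build_archive(items: Iterator[ExportItem]) -> Dict[str, str]:
    """Collect files by name, generating content only for new file names."""
    archive: Dict[str, str] = {}
    for file_name, file_content in items:
        if file_name in archive:
            continue  # duplicate: the content function is never called
        archive[file_name] = file_content()
    return archive


archive = build_archive(export_dashboard([1, 2]))
print(len(archive), "files written;", len(calls), "contents generated")
```

In this sketch the dashboard export still yields 7 items, but only the 5 unique files have their content generated, because the duplicate dataset and database entries are skipped before their content function is ever called.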
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Note the speedup when generating a dashboard export archive.
Before
https://github.com/apache/superset/assets/66589759/76551088-5d87-4c4d-b89d-fcc4c66f1b27
After
https://github.com/apache/superset/assets/66589759/4489b671-6c17-4f9a-a955-e5b285f5e5c4
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION
- [ ] Has associated issue:
- [ ] Required feature flags:
- [ ] Changes UI
- [ ] Includes DB Migration (follow approval process in SIP-59)
- [ ] Migration is atomic, supports rollback & is backwards-compatible
- [ ] Confirm DB migration upgrade and downgrade tested
- [ ] Runtime estimates and downtime expectations provided
- [ ] Introduces new feature or API
- [ ] Removes existing feature or API
Hello @Always-prog thanks for the PR! Would you mind taking a look at the failing CI steps?
Codecov Report
Attention: 14 lines in your changes are missing coverage. Please review.
Comparison is base (4796484) 67.18% compared to head (9d2e16e) 69.51%.
Additional details and impacted files
@@ Coverage Diff @@
## master #26765 +/- ##
==========================================
+ Coverage 67.18% 69.51% +2.33%
==========================================
Files 1900 1900
Lines 74443 74491 +48
Branches 8293 8293
==========================================
+ Hits 50012 51785 +1773
+ Misses 22376 20651 -1725
Partials 2055 2055
| Flag | Coverage Δ | |
|---|---|---|
| hive | 53.82% <46.08%> (?) | |
| mysql | 78.04% <78.26%> (+0.01%) | :arrow_up: |
| postgres | 78.14% <78.26%> (+0.01%) | :arrow_up: |
| presto | 53.77% <46.08%> (?) | |
| python | 83.09% <87.82%> (+4.83%) | :arrow_up: |
| sqlite | 77.66% <78.26%> (+0.01%) | :arrow_up: |
| unit | 56.49% <60.00%> (?) | |
@betodealmeida Hi! Thanks for the review! I have committed fixes addressing your comments.
The only remaining problem is that pre-commit fails because of the file .github/workflows/update-monorepo-lockfiles.yml, which I have not fixed.
Hi @Always-prog it seems you have a conflicting file. Can you fix it, please? I am adding myself as a reviewer and checking back on this asap. Thank you!
👀
@geido Hi! Rebased!
/testenv up
@geido Ephemeral environment spinning up at http://54.191.84.78:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.
@michael-s-molina Thank you for testing my PR. Can this be merged?
Ephemeral environment shutdown and build artifacts deleted.