Make the HTML cleaning for microdata faster
Hey @kmike! Here is a small tweak to https://github.com/scrapinghub/extruct/pull/119.
However, according to my recent performance tests, the code from https://github.com/scrapinghub/extruct/pull/119 doesn't affect performance, and the code from this PR doesn't improve anything either, so from my point of view we can just close it.
Re technicals - it turned out that we can clean HTML just a single time, but without cleaning and tags.
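For illustration, here is a minimal sketch of the "clean once" idea versus cleaning every node separately. It uses the standard library's `xml.etree.ElementTree` rather than lxml so that it runs standalone. The `DROP_TAGS` set and all function names are assumptions made up for this example; they are not extruct's actual cleaner configuration or API.

```python
import copy
import xml.etree.ElementTree as ET

# Assumption for illustration only: which elements a "cleaner" drops.
DROP_TAGS = {"script", "style"}


def clean_in_place(root):
    """Remove DROP_TAGS elements from the tree, keeping their tail text."""
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if child.tag in DROP_TAGS:
                tail = child.tail or ""
                # Re-attach the tail so text after the removed element survives.
                if previous is None:
                    parent.text = (parent.text or "") + tail
                else:
                    previous.tail = (previous.tail or "") + tail
                parent.remove(child)
            else:
                previous = child
    return root


def node_text(node):
    """Collapse all text inside the node into a single-spaced string."""
    return " ".join("".join(node.itertext()).split())


def extract_clean_each(root):
    """Clean a copy of every itemprop node separately (one clean per node)."""
    return {
        node.get("itemprop"): node_text(clean_in_place(copy.deepcopy(node)))
        for node in root.iter()
        if node.get("itemprop")
    }


def extract_clean_once(root):
    """Clean a copy of the whole tree a single time, then read every node."""
    cleaned = clean_in_place(copy.deepcopy(root))
    return {
        node.get("itemprop"): node_text(node)
        for node in cleaned.iter()
        if node.get("itemprop")
    }


SAMPLE = (
    "<div itemscope='itemscope'>"
    "<p itemprop='name'>Foo<script>var x = 1;</script> Bar</p>"
    "<p itemprop='description'>Some <style>p {color: red}</style>text</p>"
    "</div>"
)
root = ET.fromstring(SAMPLE)
```

Both functions return the same properties for this sample. The difference is only in how many times the cleaning pass runs, which is what the performance tests mentioned above were measuring.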
Codecov Report
Merging #123 into master will decrease coverage by 0.05%. The diff coverage is 100%.
@@ Coverage Diff @@
## master #123 +/- ##
==========================================
- Coverage 87.78% 87.73% -0.06%
==========================================
Files 11 11
Lines 475 473 -2
Branches 103 103
==========================================
- Hits 417 415 -2
Misses 52 52
Partials 6 6
| Impacted Files | Coverage Δ | |
|---|---|---|
| extruct/w3cmicrodata.py | 99.13% <100%> (-0.02%) | :arrow_down: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8683981...d8c03b7.