gtfs-validator

Performance issues with trip_distance_exceeds_shape_distance

Open emmambd opened this issue 2 years ago • 1 comments

Describe the bug

While generating analytics for the 4.2 release, we found that several datasets failed to run. We suspect the cause is the new trip_distance_exceeds_shape_distance error.

These datasets failed with the message "The configured memory limit was reached":

  1. us-oregon-trimet-portland-streetcar-gtfs-247
  2. us-washington-sound-transit-metro-transit-city-of-seattle-king-county-metro-gtfs-267
  3. us-illinois-chicago-transit-authority-cta-gtfs-389
  4. us-new-jersey-new-jersey-transit-nj-transit-gtfs-508
  5. at-wien-wiener-lokalbahnen-wlb-gtfs-648
  6. be-vlaams-gewest-de-lijn-gtfs-684
  7. ca-alberta-edmonton-transit-system-gtfs-714
  8. ca-ontario-toronto-transit-commission-gtfs-732
  9. ca-quebec-reseau-de-transport-de-la-capitale-gtfs-757
  10. de-berlin-verkehrsverbund-berlin-brandenburg-gtfs-782
  11. ie-dublin-dublin-bus-gtfs-947
  12. fr-auvergne-rhone-alpes-cars-region-auvergne-rhone-alpes-transisere-gtfs-985
  13. es-madrid-cercanias-madrid-gtfs-993
  14. nl-unknown-allgo-keolis-gtfs-1077
  15. ee-unknown-abuss-ou-gtfs-1095
  16. gb-unknown-transport-for-greater-manchester-arriva-in-the-north-west-gtfs-1103
  17. fi-unknown-porvoon-museorautatie-gtfs-1102
  18. be-unknown-societe-regionale-wallonne-du-transport-gtfs-1212
  19. it-lombardia-agenzia-mobilita-ambiente-territorio-gtfs-1231
  20. gr-attiki-athens-urban-transport-organisation-organismos-astikon-sugkoinonion-oasa-gtfs-1228
  21. ru-sankt-peterburg-peterburgskii-metropoliten-petersburg-metro-gtfs-1186
  22. tw-unknown-taichung-gtfs-1277
  23. dk-unknown-rejseplanen-gtfs-1292
  24. gb-unknown-chiltern-railways-gtfs-1311
  25. pt-lisboa-carris-metropolitana-gtfs-1873

These datasets failed with the message "Timeout of 1800 seconds exceeded":

  1. us-unknown-amtrak-gtfs-11
  2. tn-unknown-uabs-banlieue-sahel-gtfs-1016

This dataset failed with the message "connection broken":

  1. us-minnesota-metro-transit-metro-transit-met-council-maple-grove-plymouth-southwest-transit-airport-mac-university-of-minnesota-catch-the-link-gtfs-205

We should investigate further and identify possible performance improvements for this notice. Relates to #1589

Steps/Code to Reproduce

  • Run one of the failed datasets listed above through the validator
  • Feed times out with no error message

Expected Results

  • Feeds run successfully through the validator.

Actual Results

Feed fails to parse.

Screenshots

No response

Files used

No response

Validator version

4.2

Operating system

macOS

Java version

No response

Additional notes

No response

emmambd Nov 27 '23 15:11

The memory reduction design & analysis document notes that TripAndShapeDistanceValidator accounts for a significant amount of memory usage. This should be investigated further.
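
As a starting point for that investigation, here is a minimal sketch of one memory-lean way to run this kind of check. It is not the validator's actual implementation. It assumes the notice fires when the largest shape_dist_traveled among a trip's stop times exceeds the largest shape_dist_traveled of the trip's shape, and every class and method name below (TripShapeDistanceSketch, ShapePoint, Trip, StopTime, findTripsExceedingShape) is hypothetical. The idea is to stream each table once and retain a single number per shape_id and per trip_id rather than holding full point lists.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these types come from gtfs-validator. */
final class TripShapeDistanceSketch {

  /** Hypothetical, simplified row of shapes.txt. */
  static final class ShapePoint {
    final String shapeId;
    final double shapeDistTraveled;

    ShapePoint(String shapeId, double shapeDistTraveled) {
      this.shapeId = shapeId;
      this.shapeDistTraveled = shapeDistTraveled;
    }
  }

  /** Hypothetical, simplified row of trips.txt. */
  static final class Trip {
    final String tripId;
    final String shapeId;

    Trip(String tripId, String shapeId) {
      this.tripId = tripId;
      this.shapeId = shapeId;
    }
  }

  /** Hypothetical, simplified row of stop_times.txt. */
  static final class StopTime {
    final String tripId;
    final double shapeDistTraveled;

    StopTime(String tripId, double shapeDistTraveled) {
      this.tripId = tripId;
      this.shapeDistTraveled = shapeDistTraveled;
    }
  }

  /** Keeps only the largest distance seen so far for each key. */
  private static void keepMax(Map<String, Double> maxByKey, String key, double value) {
    maxByKey.merge(key, value, Math::max);
  }

  /**
   * Returns the ids of trips whose maximum stop-time distance exceeds the
   * maximum distance of their shape (assumed rule, see the lead-in).
   * Each input is iterated exactly once.
   */
  static List<String> findTripsExceedingShape(
      Iterable<ShapePoint> shapePoints, Iterable<Trip> trips, Iterable<StopTime> stopTimes) {
    // One double per shape_id instead of the full list of shape points.
    Map<String, Double> maxShapeDist = new HashMap<>();
    for (ShapePoint point : shapePoints) {
      keepMax(maxShapeDist, point.shapeId, point.shapeDistTraveled);
    }

    // One double per trip_id instead of the full list of stop times.
    Map<String, Double> maxTripDist = new HashMap<>();
    for (StopTime stopTime : stopTimes) {
      keepMax(maxTripDist, stopTime.tripId, stopTime.shapeDistTraveled);
    }

    List<String> offendingTripIds = new ArrayList<>();
    for (Trip trip : trips) {
      Double tripDist = maxTripDist.get(trip.tripId);
      Double shapeDist = maxShapeDist.get(trip.shapeId);
      // Trips without a shape or without distances are skipped, not flagged.
      if (tripDist != null && shapeDist != null && tripDist > shapeDist) {
        offendingTripIds.add(trip.tripId);
      }
    }
    return offendingTripIds;
  }
}
```

Under these assumptions, the retained state grows with the number of distinct shapes and trips rather than with the number of rows in shapes.txt and stop_times.txt. Whether that helps here depends on where the real validator spends its memory, which still needs to be profiled.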

emmambd Feb 14 '24 21:02