pathpyG
IndexError in model selection for tube data
The model selection fails with an IndexError for the London Tube data set.
Minimal code to reproduce the error:
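import pathpyG as pp
# Load the weighted London Tube paths from the n-gram file
paths_tube = pp.PathData.from_ngram('../data/tube_paths_train.ngram', sep=',', weight=True)
# Build a multi-order model up to order 2
m = pp.MultiOrderModel.from_PathData(paths_tube, max_order=2)
# Model selection: this call raises the IndexError
m.estimate_order(paths_tube, max_order=2, significance_threshold=0.01)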
I suspect that this is caused by the use of append_walks in the from_ngram function, which already concatenates the Data objects in PathData. The model selection code, in contrast, seems to assume that each path is stored in its own Data object.
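
To illustrate the kind of mismatch I have in mind, here is a toy sketch in plain Python. It does not use pathpyG at all: the nested lists only stand in for the internal storage, and count_paths_by_index is a hypothetical consumer, not the actual model selection code. It only shows how code that indexes one container per path can run into an IndexError once the paths have been concatenated.

```python
# Toy illustration only: plain lists stand in for pathpyG's internal storage.
paths = [["a", "b", "c"], ["b", "c"], ["c", "d", "a"]]

# Layout A: one container per path (what the model selection seems to assume).
per_path = [list(p) for p in paths]

# Layout B: all paths concatenated into a single container
# (assumed here to resemble what append_walks produces).
concatenated = [[node for p in paths for node in p]]


def count_paths_by_index(storage, num_paths):
    """Hypothetical consumer that assumes storage[i] holds the i-th path."""
    return [len(storage[i]) for i in range(num_paths)]


# Works for the per-path layout.
lengths_ok = count_paths_by_index(per_path, len(paths))

# Fails for the concatenated layout: there is only one container,
# so storage[1] is out of range.
try:
    count_paths_by_index(concatenated, len(paths))
    error = None
except IndexError as exc:
    error = exc
```

If this is indeed what happens, either the model selection would need to work on the concatenated layout, or from_ngram would need to keep the paths separate.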