mne-python icon indicating copy to clipboard operation
mne-python copied to clipboard

Opportunity for code improvement in get_annotations_per_epoch

Open davidcian opened this issue 2 years ago • 3 comments

Finding the overlap of two collections of intervals is a very common issue for people working with time series data. In MNE, this is encountered in the get_annotations_per_epoch method. I couldn't help but notice that in its implementation, there are three cases which are identified and treated separately, each with two comparisons: start within bounds, stop within bounds, or both within bounds.

In truth, only two comparisons need to hold at the same time: start is before stop of the other interval, and stop is after start of the other interval.

I apologize in advance for making this into an issue, but I didn't really know where else to post it. The performance improvement is likely not huge for modest amounts of data, but it is an improvement nonetheless.

davidcian avatar Dec 13 '23 22:12 davidcian

Hello! 👋 Thanks for opening your first issue here! ❤️ We will try to get back to you soon. 🚴

welcome[bot] avatar Dec 13 '23 22:12 welcome[bot]

And it might de-duplicate code and improve readability, feel free to open a PR :)

mscheltienne avatar Dec 14 '23 14:12 mscheltienne

I don't want to sound discouraging here, but IMO the code is already quite readable, well-commented, and likely very efficient (each of the 3 conditions is computed with a single line of numpy code: 2 array comparisions wrapped in a logical_and). Still, if you're gung-ho to improve it @davidcian I would say go for it, with the understanding that the speedup may be imperceptible to a user, and we'll want the code/comments to remain at least as readable as they are now.

https://github.com/mne-tools/mne-python/blob/1034bffde6fe4a360da3fe155b3c16bdc6380b8d/mne/annotations.py#L866-L895

drammock avatar Dec 14 '23 20:12 drammock