Discussion: project stucture
(involving @sladkovm) A lot of different algorithms, tools and models will be added so I think is good to already think about where everything is going so we don't end up with a labyrinth. I like this guide on structuring Python projects.
I think the structure below could work but maybe I'm missing things or you know a better structure.
.
+-- sweat
| +-- __init__.py
| +-- algorithms
| +-- __init__.py
| +-- pdm (power duration models)
| +-- __init__.py
| +-- critical_power.py
| +-- w_prime_balance.py
| +-- metrics
| +-- __init__.py
| +-- power.py
| +-- heartrate.py
| +-- speed.py
| +-- location.py
| +-- io
| +-- __init__.py
| +-- strava.py
| +-- goldencheetah.py
| +-- fitfile.py
| +-- models
| +-- __init__.py
| +-- dataframes.py
| +-- base.py
| +-- mixins.py
| +-- utils.py
| +-- utils.py
+-- tests
In general looks good.
Splitting metrics into modules according to the observed variable poses a danger of duplicating similar calculations twice. One of the examples - algorithms.streams.compute_zone() can be applied for both hr and power.
Would be nice to add a preprocess package to keep there frequently used data clean up methods. For example algorithms.streams.median_filter()
That's a good example indeed: In my opinion there should not be 1 method that calculates time in zone for both power and heartrate. This should be handled by separate methods (in my example e.g. heartrate.time_in_zone and power.time_in_zone). Duplicate code for these methods should live in algorithms.utils or algorithms.metrics.utils.
It's probably my fault but I still do not understand your idea for the streams... All algorithms work on array-like structures right? Why is there a separate module? Can you explain this to me?
I don't keep streams - they are a legacy of vmpy. The content of both streams.py and metrics.py will move into whatever we decide is good for a project structure.
To make sure I understand your proposal correctly:
algorithms.metrics.utils holds the juice of all calculations that can be reused. I would call it metrics.core - after all, it will represent the added value the sweat as a package
Context-specific modules e.g. metrics.power, metrics.heartrate offer convenient wrappers functions with simple unambiguous interface.
Sounds like a plan. And I like calling it metrics.core. Good work!
This might be a little late to the party but I was looking at the current projects structure yesterday and I realized that since the algorithms are the main part of this library it's strange that they're not accessible from the root of the project. How would you feel about "unnesting" the algorithms so that from sweat.algorithms.metrics import power will become from sweat.metrics import power?
Great you brought it up - I thought about exactly the same ;)
I guess I have to give you credit for changing my mind from wdf-centric to algorithm-centric 😛 I will pick this up tonight.