wfdb-python
New interface for specifying different data sources for read/write
Looking at the current rdrecord, for example, there are two parameters used to specify the location of the record:
- record_name: str
- pn_dir: str
The current package supports reading files locally and from the global database index URL, which defaults to PhysioNet, as specified in download.py.
There are several things that we should aim to support:
- Reading/writing from more types of data sources, such as S3 and GCS.
- Having more than one remote source configured at a time.
One proposal is to add a new DataSource class and a global config dictionary that maps ds_name (str) to ds (DataSource), i.e.:
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class DataSourceType(Enum):
    LOCAL = 1  # Not sure if this is necessary?
    HTTP = 2
    GCS = 3
    S3 = 4

@dataclass
class DataSource:
    ds_type: DataSourceType
    # Other type-specific params here, e.g.:
    base_url: Optional[str] = None

_physionet_ds = DataSource(ds_type=DataSourceType.HTTP, base_url="https://physionet.org/content/")
data_sources = {"physionet": _physionet_ds}
And the read/write functions could use these params:
- record_name: str
- data_source: str | DataSource - The key of the data source in the global data sources map, or a DataSource object.
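To make the `str | DataSource` parameter concrete, here is a rough sketch of how a read function might resolve it. Everything beyond the names already used above is an assumption for illustration: `resolve_data_source`, `record_location`, the `base_url` field, the `my_bucket` entry, and the way the location string is joined are all hypothetical, not existing wfdb API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Union


class DataSourceType(Enum):
    LOCAL = 1
    HTTP = 2
    GCS = 3
    S3 = 4


@dataclass(frozen=True)
class DataSource:
    ds_type: DataSourceType
    base_url: Optional[str] = None  # assumed field; other type-specific params omitted


# Global registry keyed by data source name, as in the proposal.
data_sources: Dict[str, DataSource] = {
    "physionet": DataSource(DataSourceType.HTTP, "https://physionet.org/content/"),
    # Hypothetical second remote source, to show several configured at once.
    "my_bucket": DataSource(DataSourceType.S3, "s3://example-bucket/wfdb/"),
}


def resolve_data_source(data_source: Union[str, DataSource]) -> DataSource:
    """Return a DataSource given either a registry key or a DataSource object."""
    if isinstance(data_source, DataSource):
        return data_source
    try:
        return data_sources[data_source]
    except KeyError:
        raise ValueError(f"Unknown data source: {data_source!r}") from None


def record_location(record_name: str, data_source: Union[str, DataSource]) -> str:
    """Hypothetical helper: build the location string a reader would open."""
    ds = resolve_data_source(data_source)
    if ds.ds_type is DataSourceType.LOCAL:
        # Local sources are assumed to take the record name as a plain path.
        return record_name
    return ds.base_url.rstrip("/") + "/" + record_name
```

With something like this, a caller could pass either `"physionet"` or an ad hoc `DataSource(...)` without touching global state, and an unknown key would fail early with a clear error.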
This would be much more explicit. Thoughts?