HeaderSyntaxError encountered when processing the PTB-XL+ database

Open · wenh06 opened this issue 1 year ago · 1 comment

WFDB version: 4.1.2

Header file: https://physionet.org/content/ptb-xl-plus/1.0.1/median_beats/12sl/00000/00001_medians.hea

Header file content:

ge_median_beats_wfdb/00001_medians 12 500 600
ge_median_beats_wfdb/00001_medians.dat 32 6344117.125(-1842966025)/mV 32 0 -1995224836 10742 0 I
ge_median_beats_wfdb/00001_medians.dat 32 9481164.0(-2062153171)/mV 32 0 -1882011055 51500 0 II
ge_median_beats_wfdb/00001_medians.dat 32 12413200.271062272(1241320027)/mV 32 0 1775087638 10624 0 III
ge_median_beats_wfdb/00001_medians.dat 32 7809031.442164179(2038157206)/mV 32 0 2038157206 33835 0 aVR
ge_median_beats_wfdb/00001_medians.dat 32 9004124.293103449(-1625244438)/mV 32 0 -1931384663 20616 0 aVL
ge_median_beats_wfdb/00001_medians.dat 32 26843545.555555556(-1905891737)/mV 32 0 -993211188 39013 0 aVF
ge_median_beats_wfdb/00001_medians.dat 32 5577879.601744186(1690097519)/mV 32 0 1667786000 62231 0 V1
ge_median_beats_wfdb/00001_medians.dat 32 2425165.043971631(1271999065)/mV 32 0 1167716968 11832 0 V2
ge_median_beats_wfdb/00001_medians.dat 32 4021504.95543672(108580633)/mV 32 0 124666652 16354 0 V3
ge_median_beats_wfdb/00001_medians.dat 32 5302428.75409836(-853691031)/mV 32 0 -779457028 41525 0 V4
ge_median_beats_wfdb/00001_medians.dat 32 7110873.0(-2047931425)/mV 32 0 -1877270473 53744 0 V5
ge_median_beats_wfdb/00001_medians.dat 32 8323580.0(-2072571427)/mV 32 0 -1956041307 23344 0 V6

Error message:

/opt/hostedtoolcache/Python/3.9.21/x64/lib/python3.9/site-packages/wfdb/io/record.py:1853: in rdheader
    record_fields = _header._parse_record_line(header_lines[0])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

record_line = 'ge_median_beats_wfdb/00001_medians 12 500 600'

    def _parse_record_line(record_line: str) -> dict:
        """
        Extract fields from a record line string into a dictionary.
    
        Parameters
        ----------
        record_line : str
            The record line contained in the header file
    
        Returns
        -------
        record_fields : dict
            The fields for the given record line.
    
        """
        # Dictionary for record fields
        record_fields: Dict[str, Any] = {}
    
        # Read string fields from record line
        match = rx_record.match(record_line)
        if match is None:
>           raise HeaderSyntaxError("invalid syntax in record line")
E           wfdb.io.header.HeaderSyntaxError: invalid syntax in record line

/opt/hostedtoolcache/Python/3.9.21/x64/lib/python3.9/site-packages/wfdb/io/_header.py:1021: HeaderSyntaxError

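I think this is caused by wfdb.io.header.rx_record failing to match the first line of the header file, ge_median_beats_wfdb/00001_medians 12 500 600. The pattern only accepts a slash in the record field when it is followed by digits (the segment count), and here the slash is followed by 00001_medians. rx_record is as follows: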

rx_record = re.compile(
    r"""
    [ \t]* (?P<record_name>[-\w]+)
           /?(?P<n_seg>\d*)
    [ \t]+ (?P<n_sig>\d+)
    [ \t]* (?P<fs>\d*\.?\d*)
           /*(?P<counter_freq>-?\d*\.?\d*)
           \(?(?P<base_counter>-?\d*\.?\d*)\)?
    [ \t]* (?P<sig_len>\d*)
    [ \t]* (?P<base_time>\d{,2}:?\d{,2}:?\d{,2}\.?\d{,6})
    [ \t]* (?P<base_date>\d{,2}/?\d{,2}/?\d{,4})
    """,
    re.VERBOSE,
)

One can resolve this by changing the first line of rx_record from [ \t]* (?P<record_name>[-\w]+) to [ \t]* (?P<record_name>[-\w\/]+), so that the record name may contain a slash.
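A minimal sketch to reproduce the mismatch outside of wfdb, using only the standard `re` module. The pattern text is copied from the rx_record definition quoted above; the names `RX_RECORD_SOURCE`, `rx_original`, and `rx_patched` are illustrative and not part of wfdb.

```python
import re

# First line of 00001_medians.hea, as quoted in this issue.
RECORD_LINE = "ge_median_beats_wfdb/00001_medians 12 500 600"

# Pattern text copied from the rx_record definition quoted above.
RX_RECORD_SOURCE = r"""
    [ \t]* (?P<record_name>[-\w]+)
           /?(?P<n_seg>\d*)
    [ \t]+ (?P<n_sig>\d+)
    [ \t]* (?P<fs>\d*\.?\d*)
           /*(?P<counter_freq>-?\d*\.?\d*)
           \(?(?P<base_counter>-?\d*\.?\d*)\)?
    [ \t]* (?P<sig_len>\d*)
    [ \t]* (?P<base_time>\d{,2}:?\d{,2}:?\d{,2}\.?\d{,6})
    [ \t]* (?P<base_date>\d{,2}/?\d{,2}/?\d{,4})
"""

# Original pattern: a slash is only accepted before an all-digit segment count.
rx_original = re.compile(RX_RECORD_SOURCE, re.VERBOSE)

# Proposed change: also allow "/" inside the record name itself.
rx_patched = re.compile(
    RX_RECORD_SOURCE.replace(r"[-\w]+", r"[-\w\/]+"), re.VERBOSE
)

print(rx_original.match(RECORD_LINE))  # None, hence HeaderSyntaxError

match = rx_patched.match(RECORD_LINE)
print(match.group("record_name"))  # ge_median_beats_wfdb/00001_medians
print(match.group("n_sig"), match.group("fs"), match.group("sig_len"))
```

With the patched pattern the record line parses into 12 signals, a sampling frequency of 500, and a signal length of 600, with the directory prefix kept as part of the record name.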

@briangow @bemoody

wenh06 · Jan 17 '25 07:01

On second thought, it seems that one should not modify rx_record and rx_signal. Even if the header then parsed, calling rdrecord would raise FileNotFoundError, since the signal lines point to files in a ge_median_beats_wfdb folder that does not exist. The PTB-XL+ database should be updated instead.
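Until the database is updated, one possible local workaround is to rewrite a downloaded copy of the header so that the record name and the signal file names no longer carry the directory prefix. The sketch below is plain string processing and does not call wfdb; `strip_directory_prefixes` is a hypothetical helper, it assumes a single-segment record (a multi-segment record line of the form name/n_seg would be mangled), and whether rdrecord then reads the data correctly has not been verified here.

```python
import posixpath


def strip_directory_prefixes(header_text: str) -> str:
    """Drop any directory prefix from the first field of each header line.

    Hypothetical helper, not part of wfdb. Assumes a single-segment record,
    so the first field of every non-comment line is a record or file name.
    """
    fixed_lines = []
    for line in header_text.splitlines():
        stripped = line.strip()
        # Leave blank lines and comment lines untouched.
        if not stripped or stripped.startswith("#"):
            fixed_lines.append(line)
            continue
        fields = stripped.split(maxsplit=1)
        # "ge_median_beats_wfdb/00001_medians.dat" -> "00001_medians.dat"
        fields[0] = posixpath.basename(fields[0])
        fixed_lines.append(" ".join(fields))
    return "\n".join(fixed_lines)


# Record line and first signal line, as quoted in this issue.
sample_header = (
    "ge_median_beats_wfdb/00001_medians 12 500 600\n"
    "ge_median_beats_wfdb/00001_medians.dat 32 "
    "6344117.125(-1842966025)/mV 32 0 -1995224836 10742 0 I"
)

fixed_header = strip_directory_prefixes(sample_header)
print(fixed_header.splitlines()[0])  # 00001_medians 12 500 600
```

The rewritten header would need to be saved next to the corresponding .dat file under the name 00001_medians.hea for the file names to resolve.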

wenh06 · Jan 17 '25 08:01