Add more robust handling of RESDBOUT format to mseg.py

Open jtlangevin opened this issue 10 years ago • 2 comments

The EIA 2015 RESDBOUT.txt file has a quirky format that mseg.py cannot parse properly unless the raw data file is first pre-processed in Excel. The primary issue is that the file includes an extra column ('BULBTYPE') that is populated only for certain rows. This appears to cause problems in identifying line breaks when parsing the text file.

Going forward, mseg.py must be amended to handle such quirks in the RESDBOUT file format without pre-processing via Excel.
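As a rough illustration of the kind of handling meant here, the sketch below pads rows that omit the optional column instead of relying on a fixed field count. It is not the code in mseg.py: the function name `read_ragged_rows`, the tab delimiter, and every column name other than 'BULBTYPE' are assumptions made for the example, as is the idea that a short row is missing exactly that one field.

```python
import csv
import io


def read_ragged_rows(handle, optional="BULBTYPE", delimiter="\t"):
    """Yield one dict per data row, filling a missing optional column with ''.

    Hypothetical helper: assumes the first line is a header and that a row
    with exactly one field fewer than the header omitted the optional column.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = [name.strip() for name in next(reader)]
    opt_idx = header.index(optional)
    for line_no, row in enumerate(reader, start=2):
        row = [cell.strip() for cell in row]
        if not any(row):
            continue  # skip blank lines
        if len(row) == len(header) - 1:
            row.insert(opt_idx, "")  # optional column absent on this row
        elif len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        yield dict(zip(header, row))


# Made-up sample data; the real RESDBOUT.txt layout may differ.
sample = (
    "ENDUSE\tYEAR\tBULBTYPE\tCONSUMPTION\n"
    "LT\t2015\tLED\t1.5\n"
    "HT\t2015\t2.0\n"
)
rows = list(read_ragged_rows(io.StringIO(sample)))
```

Raising on any other field count, rather than guessing, keeps unexpected format changes visible instead of silently shifting values into the wrong columns.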

jtlangevin, Nov 20 '15 15:11

When this issue is addressed, it might also make sense to overhaul the way the data are stored once imported in mseg.py, which would make it easier to streamline some of the data handling functions. Once completed, com_mseg.py should provide an example of how to import the data more flexibly and efficiently.

trynthink, Nov 29 '15 18:11

Data import improvements in mseg.py were included in commit 01e52690aab303ee6f3cc8aa9c0b5b36b80cb98b. Further improvements to make data import more resilient and reduce or eliminate the need for preprocessing of RESDBOUT.txt might still be helpful.

trynthink, Jun 17 '17 18:06