
Migrating mpl reader from ACT to xradar

Open zssherman opened this issue 2 years ago • 3 comments

During the ACT dev call, we discussed how moving the MPL reader to xradar would be a better fit: https://github.com/ARM-DOE/ACT/issues/806

I can give this a shot. I will just need to learn the backends of Xarray first.

zssherman avatar Mar 08 '24 16:03 zssherman

@zssherman Great initiative! It looks like MPL is also some binary format with neat header structures.

The sigmet/iris reader makes heavy use of this kind of structured decoding. This is also what #158 is trying to achieve for nexrad level2.

Maybe we can discuss at the next open radar meeting which steps are necessary to get a prototype reader ready.

kmuehlbauer avatar Mar 08 '24 17:03 kmuehlbauer

@kmuehlbauer That sounds good to me!

zssherman avatar Mar 08 '24 19:03 zssherman

@zssherman Since there wasn't much time yesterday I'll follow up with some ideas/pointers here.

  • File Format Description: https://www.dropletmeasurement.com/manual/software-manual-sigmampl, pp. 46-49
  • mpl2nc tool by @peterkuma https://github.com/peterkuma/mpl2nc for inspiration

I'm not really sure how to handle the sidecar files, but we might just search for/recognize them and directly read/decode them as binary blobs (in case they come without a header).

For the main file the idea would be to use np.memmap for easy reading of large data. See

https://github.com/openradar/xradar/blob/56a9ca1f42ca23074dc0b1d2a86d51e1bd4eafa2/xradar/io/backends/nexrad_level2.py#L137-L151
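To illustrate the np.memmap idea, here is a minimal sketch. The file layout (a 16-byte header followed by uint32 values) is made up for the example and is not the MPL format; only np.memmap itself is real API.

```python
import os
import tempfile

import numpy as np

# Write a small fake binary file: 16 header bytes followed by 1000 uint32 values.
# This layout is an assumption for illustration only, NOT the MPL format.
path = os.path.join(tempfile.mkdtemp(), "fake.bin")
payload = np.arange(1000, dtype="<u4")
with open(path, "wb") as f:
    f.write(bytes(16))
    f.write(payload.tobytes())

# Map the whole file as raw bytes (np.memmap defaults to dtype uint8).
# Nothing is loaded into memory until the array is indexed.
fh = np.memmap(path, mode="r")

# Slice only the bytes of values 10..14 and reinterpret them as uint32.
start = 16 + 10 * 4
raw = fh[start : start + 5 * 4]
values = raw.view("<u4")
print(values.tolist())
```

Keeping the map as plain bytes means header and data sections with different dtypes can be sliced from the same object and reinterpreted as needed.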

Then the header could be directly extracted using the machinery from the iris/sigmet reader:

https://github.com/openradar/xradar/blob/56a9ca1f42ca23074dc0b1d2a86d51e1bd4eafa2/xradar/io/backends/nexrad_level2.py#L187-L190

For this, the header structure needs a special layout in which the decoding information is attached to the entries of the OrderedDict:

https://github.com/openradar/xradar/blob/56a9ca1f42ca23074dc0b1d2a86d51e1bd4eafa2/xradar/io/backends/nexrad_level2.py#L725-L733
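A rough sketch of what such a layout could look like, using only the standard library. The field names, the format codes and the helper names (get_fmt_string, unpack_header) are hypothetical and chosen for illustration; they are neither the real MPL header nor xradar's actual helpers.

```python
import struct
from collections import OrderedDict

# Hypothetical header layout: each field carries its struct format code and,
# optionally, a function applied after unpacking. NOT the real MPL header.
HEADER = OrderedDict(
    [
        ("unit_number", {"fmt": "H"}),
        ("version", {"fmt": "H"}),
        ("year", {"fmt": "H"}),
        ("shots_sum", {"fmt": "I"}),
        ("temperature", {"fmt": "h", "func": lambda x: x / 100.0}),
    ]
)


def get_fmt_string(layout, byte_order="<"):
    """Join the per-field format codes into one struct format string."""
    return byte_order + "".join(field["fmt"] for field in layout.values())


def unpack_header(buffer, layout, byte_order="<"):
    """Unpack `buffer` according to `layout` and apply optional converters."""
    values = struct.unpack_from(get_fmt_string(layout, byte_order), buffer)
    out = OrderedDict()
    for (name, field), value in zip(layout.items(), values):
        func = field.get("func")
        out[name] = func(value) if func else value
    return out


# Round trip with made-up values.
raw = struct.pack("<HHHIh", 7, 3, 2024, 50000, 2150)
header = unpack_header(raw, HEADER)
print(dict(header))
```

Because the format string is derived from the dictionary, the header size comes for free via struct.calcsize, and adding a field only means adding one entry.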

The actual data might be read with dedicated functions (e.g. named get_data or similar), which use header information about file offset, size and dtype. See the following for a (not so nice) example:

https://github.com/openradar/xradar/blob/56a9ca1f42ca23074dc0b1d2a86d51e1bd4eafa2/xradar/io/backends/nexrad_level2.py#L576-L608
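A stripped-down version of that idea might look like the sketch below. The header keys (offset, nrays, nbins, dtype) are assumptions for the example, not fields taken from the MPL format or from the nexrad reader.

```python
import numpy as np


def get_data(buffer, header):
    """Decode one 2D data block described by `header` (hypothetical keys)."""
    dtype = np.dtype(header["dtype"])
    count = header["nrays"] * header["nbins"]
    # `offset` is given in bytes, `count` in items of `dtype`.
    data = np.frombuffer(buffer, dtype=dtype, count=count, offset=header["offset"])
    return data.reshape(header["nrays"], header["nbins"])


# Fake buffer: 8 bytes of padding followed by six float32 values.
buffer = bytes(8) + np.arange(6, dtype="<f4").tobytes()
header = {"offset": 8, "nrays": 2, "nbins": 3, "dtype": "<f4"}
data = get_data(buffer, header)
print(data)
```

The same function should also work when `buffer` is a uint8 np.memmap instead of a bytes object, since np.frombuffer accepts any object exposing the buffer protocol.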

This get_data function is used in the ArrayWrapper to retrieve the data in a lazy manner, whereas header data is used to provide the information to create the DataArrays/Dataset.

https://github.com/openradar/xradar/blob/56a9ca1f42ca23074dc0b1d2a86d51e1bd4eafa2/xradar/io/backends/nexrad_level2.py#L1227
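The lazy part can be pictured with the simplified stand-in below. It is not the xarray machinery (in xarray backends this role is usually played by a BackendArray subclass); the class name LazyArray and the load_count counter are made up to show that shape and dtype are known from the header while decoding is deferred until indexing.

```python
import numpy as np


class LazyArray:
    """Minimal stand-in for a lazy array wrapper (hypothetical, numpy only)."""

    def __init__(self, loader, shape, dtype):
        self._loader = loader  # callable that decodes and returns the data
        self.shape = shape  # known from the header, no decoding needed
        self.dtype = np.dtype(dtype)
        self.load_count = 0  # only here to make the laziness visible

    def __getitem__(self, key):
        # Data is decoded only when someone actually indexes the array.
        self.load_count += 1
        return self._loader()[key]


def loader():
    # Stands in for a get_data call that reads from the memory-mapped file.
    return np.arange(12, dtype="<i2").reshape(3, 4)


arr = LazyArray(loader, shape=(3, 4), dtype="<i2")
print(arr.shape, arr.dtype, arr.load_count)  # metadata only, nothing decoded yet
row = arr[1]
print(row.tolist(), arr.load_count)
```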

This is then used in the XarrayStore to provide Variables/Coordinates:

https://github.com/openradar/xradar/blob/56a9ca1f42ca23074dc0b1d2a86d51e1bd4eafa2/xradar/io/backends/nexrad_level2.py#L1310

https://github.com/openradar/xradar/blob/56a9ca1f42ca23074dc0b1d2a86d51e1bd4eafa2/xradar/io/backends/nexrad_level2.py#L1328

I hope this does at least make some sense and you could give it a try.

kmuehlbauer avatar Mar 14 '24 07:03 kmuehlbauer