MUR SST Zarr `fill_value` should be `nan`?

Open abarciauskas-bgse opened this issue 4 years ago • 1 comment

Chelle mentioned that

[In] printing out points over land, they should be nan. All files have nan except the rechunked zarr file created after our version 1 zarr. I tried reading the zarr two different ways and writing it out to see if that was the issue, but wasn't able to recreate it. Somehow, in the 2nd zarr file, all the land values got set to a flag value.

The flag value is -32768 - listed in the zarr metadata

The data is probably coming down as the official cloud version goes up, but we are just trying to figure out how this happened: whether we have any problem with xarray/zarr, or whether there is something easy to do that we need to be careful about when converting files.
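To make the symptom concrete, here is a minimal sketch of what masking the flag value to nan looks like. It assumes numpy and a tiny made-up int16 array; `mask_flag` and `sample` are hypothetical names for illustration only, and this is not necessarily how xarray or zarr handle the fill value internally.

```python
import numpy as np

FLAG = -32768  # flag value listed in the zarr metadata


def mask_flag(raw, flag=FLAG):
    """Return a float copy of `raw` with every `flag` entry replaced by NaN."""
    values = np.asarray(raw).astype("float64")  # astype returns a new array
    values[values == flag] = np.nan
    return values


# Made-up sample: two "land" cells carrying the flag, two ordinary cells.
sample = np.array([[FLAG, 100], [200, FLAG]], dtype="int16")
masked = mask_flag(sample)
print(masked)
```

If land points in a dataset print as -32768 rather than nan, a masking step like this has presumably not been applied when the data was read or rewritten.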

Chelle also mentioned that this issue is the same as the one posted by @rsignell-usgs: https://github.com/pangeo-data/rechunker/issues/59

Chelle tried both cloud S3 and local storage: it would create the final file but not the intermediate file.

This is code to try on a subset of MUR data to check whether it is the rechunker step that is filling in the fill_value: https://github.com/cgentemann/cloud_science/blob/master/make_zarr/test_mur_rechunker.ipynb

I will try replicating the error and post any updates here.

cc @cgentemann

abarciauskas-bgse, Aug 26 '21 14:08

I started looking at the metadata in the AWS mur-sst bucket and found that the fill_value is -32768 for zarr-v1 but null for zarr:

Link: https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1/analysed_sst/.zarray

"fill_value": -32768,

Link: https://mur-sst.s3.us-west-2.amazonaws.com/zarr/analysed_sst/.zarray

"fill_value": null,
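For anyone who wants to repeat this comparison, a minimal sketch using only the Python standard library is below. The helper names `fill_value_from_zarray` and `fetch_fill_value` are hypothetical, and the fetch step assumes the two `.zarray` objects linked above are still publicly readable, so it is defined but not called here.

```python
import json
from urllib.request import urlopen

# The two metadata objects linked above.
ZARRAY_URLS = {
    "zarr-v1": "https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1/analysed_sst/.zarray",
    "zarr": "https://mur-sst.s3.us-west-2.amazonaws.com/zarr/analysed_sst/.zarray",
}


def fill_value_from_zarray(text):
    """Parse the JSON text of a .zarray file and return its fill_value."""
    return json.loads(text)["fill_value"]


def fetch_fill_value(url):
    """Download a .zarray file and return its fill_value (needs network access)."""
    with urlopen(url) as resp:
        return fill_value_from_zarray(resp.read().decode("utf-8"))


# Offline check using the two values quoted above.
print(fill_value_from_zarray('{"fill_value": -32768}'))
print(fill_value_from_zarray('{"fill_value": null}'))
```

JSON `null` comes back as Python `None`, so the two stores can be told apart with a simple `is None` check on the result of `fetch_fill_value(ZARRAY_URLS[name])`.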

abarciauskas-bgse, Aug 27 '21 20:08