
Challenge #12 - Size, precision, speed - pick two: implementation

Open EsperanzaCuartero opened this issue 5 years ago • 1 comments

Stream 1 - Software development for weather, climate and atmosphere

Goal

This project is a follow-up to the ESoWC 2020 data encoding optimisation challenge. Based on the results and findings of that completed project, we will implement an improved data packing configuration in our production streams. We would also like to analyse some new atmospheric composition and meteorological datasets.

Mentors and skills

  • Mentors: @miha-at-ecmwf @juanjodd
  • Skills required:
    • Some knowledge of meteorological data formats (GRIB, NetCDF) and of libraries to decode and manipulate them (ecCodes, NetCDF, CDO, NCO, ...)
    • Some knowledge about data encoding (data packing, accuracy, compression methods)
    • Knowledge of a software library to compute and present the results
    • Some familiarity with Chemical Transport Modelling (CTM) or Numerical Weather Prediction (NWP) to be able to better appreciate this challenge would be beneficial

Note: Challenge is funded by Copernicus. Only nationals from the European Union and ECMWF Member States are eligible to apply (see Terms and Conditions).


Challenge description

Data and software: We plan to use the CAMS global real-time forecast dataset, together with the ecCodes and NetCDF libraries, to test different encoding configurations and estimate the resulting data encoding errors. Results will be computed and presented with a software library in Python, R or Julia.
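To make "estimate data encoding errors" concrete, here is a minimal sketch in Python using only the standard library. It assumes a simplified linear quantization in the spirit of GRIB simple packing (a reference value, a power-of-two scale and a fixed number of bits per value). It is not the ecCodes implementation, and the function names and the synthetic temperature-like values are hypothetical.

```python
import math


def pack_linear(values, nbits):
    """Quantize floats to unsigned integers of `nbits` bits.

    Simplified model: value ~= ref + packed * 2**exp (no decimal scaling).
    """
    ref = min(values)
    span = max(values) - ref
    levels = (1 << nbits) - 1  # largest integer that fits in nbits
    if span == 0:
        return ref, 0, [0] * len(values)
    # Smallest power-of-two step such that the whole range fits in `levels`.
    exp = math.ceil(math.log2(span / levels))
    scale = 2.0 ** exp
    packed = [round((v - ref) / scale) for v in values]
    return ref, exp, packed


def unpack_linear(ref, exp, packed):
    """Reconstruct approximate floats from the packed integers."""
    scale = 2.0 ** exp
    return [ref + p * scale for p in packed]


def max_abs_error(values, nbits):
    """Largest absolute difference between original and decoded values."""
    ref, exp, packed = pack_linear(values, nbits)
    decoded = unpack_linear(ref, exp, packed)
    return max(abs(a - b) for a, b in zip(values, decoded))


# Hypothetical test data: 1000 temperature-like values in kelvin.
sample = [273.15 + 0.1 * i for i in range(1000)]
for nbits in (8, 12, 16, 24):
    print(nbits, "bits -> max abs error", max_abs_error(sample, nbits))
```

In this model the decoding error is bounded by half the quantization step, so each additional bit per value halves the worst-case error at best. Comparing that bound with the actual accuracy of the data is one way to spot bits that cost storage without adding information.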

What is the current problem? Because the data encoding configuration is not optimal, our data carries a lot of artificial precision, that is, stored bits that hold no real information. As a result, datasets are expensive to archive and move, and difficult to use.

What could be the solution? We would like to remove artificial precision from the encoded fields without any loss of information. At the same time, we need to be conscious of operational constraints, so that the data encoding and decoding steps do not become prohibitively expensive. The desired solution would be a combination of data encoding settings and steps that achieves this goal.

Ideas for the implementation: Things to address include choosing more appropriate packing methods, encoding float arrays, and exploring the use of suitable data compression algorithms.
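One possible way to combine these ideas, shown as a hedged sketch and not as the method chosen for this challenge: round away the low mantissa bits of 32-bit floats, then pass the array through a general-purpose lossless compressor. The example uses only the Python standard library (`struct`, `zlib`). The helper names, the synthetic field and the choice of 10 retained mantissa bits are assumptions made for illustration. The rounding handles finite values only, and whether it loses real information depends entirely on picking the number of retained bits to match the true accuracy of the data.

```python
import math
import struct
import zlib


def to_float32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def round_mantissa(value, keep_bits):
    """Keep only `keep_bits` of the 23 explicit float32 mantissa bits.

    Works on the raw bit pattern: add half of the dropped range, then
    zero the dropped bits. Intended for finite values only.
    """
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    drop = 23 - keep_bits
    if drop == 0:
        return to_float32(value)
    half = 1 << (drop - 1)
    mask = 0xFFFFFFFF ^ ((1 << drop) - 1)
    rounded = (bits + half) & mask
    return struct.unpack("<f", struct.pack("<I", rounded))[0]


def compressed_size(values):
    """Size in bytes of the float32 array after zlib compression."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return len(zlib.compress(raw, 6))


# Hypothetical field: a smooth signal plus small noise standing in for
# artificial precision.
field = [
    280.0 + 10.0 * math.sin(0.01 * i) + 0.001 * math.sin(12.9898 * i)
    for i in range(20000)
]
rounded_field = [round_mantissa(v, 10) for v in field]

print("compressed bytes, original:", compressed_size(field))
print("compressed bytes, rounded: ", compressed_size(rounded_field))
```

With `keep_bits` retained mantissa bits the relative rounding error is at most 2^-(keep_bits + 1). The zeroed trailing bits are what lets the lossless compressor shrink the array, so the size, precision and speed trade-off is controlled by `keep_bits` and by the compressor and level chosen.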


ESoWC

EsperanzaCuartero avatar Jan 28 '21 11:01 EsperanzaCuartero

Hi, join us for the ECMWF Summer of Weather Code Ask Me Anything session and learn all things ESoWC.

When: Wednesday, 24 March 2021 at 4 pm GMT

What: learn everything about ESoWC - how it works, the challenges this year, some tips for your proposal and listen to ESoWC experiences from previous participants

How: register here.

jwagemann avatar Mar 22 '21 09:03 jwagemann