
Daily and hourly data is returned in different formats

Open avee87 opened this issue 1 year ago • 12 comments

I'm working on updating the Home Assistant integration to use the latest version of datapoint, and it looks like daily and hourly forecasts are formatted very differently. For example, the hourly forecast has camelCase attribute names, as shown in the documentation, but the daily forecast uses TitleCase attributes. There are also some more significant differences between them.

Also, the weather code is mapped to a string in the hourly version but is returned as a numerical code in the daily one. To be fair, I personally would prefer the daily version here.

Here is daily format:

{
  "time": datetime.datetime(2024, 11, 23, 12, 0, tzinfo=datetime.timezone.utc),
  "10MWindSpeed": {
    "value": 8.13,
    "description": "10m Wind Speed at Local Midday",
    "unit_name": "metres per second",
    "unit_symbol": "m/s"
  },
  "10MWindDirection": {
    "value": 186,
    "description": "10m Wind Direction at Local Midday",
    "unit_name": "degrees",
    "unit_symbol": "deg"
  },
  "10MWindGust": {
    "value": 16.16,
    "description": "10m Wind Gust Speed at Local Midday",
    "unit_name": "metres per second",
    "unit_symbol": "m/s"
  },
  "Visibility": {
    "value": 12419,
    "description": "Visibility at Local Midday",
    "unit_name": "metres",
    "unit_symbol": "m"
  },
  "RelativeHumidity": {
    "value": 88.31,
    "description": "Relative Humidity at Local Midday",
    "unit_name": "percentage",
    "unit_symbol": "%"
  },
  "Mslp": {
    "value": 100032,
    "description": "Mean Sea Level Pressure at Local Midday",
    "unit_name": "pascals",
    "unit_symbol": "Pa"
  },
  "maxUvIndex": {
    "value": 1,
    "description": "Day Maximum UV Index",
    "unit_name": "dimensionless",
    "unit_symbol": "1"
  },
  "SignificantWeatherCode": {
    "value": 12,
    "description": "Day Significant Weather Code",
    "unit_name": "dimensionless",
    "unit_symbol": "1"
  },
  "MaxScreenTemperature": {
    "value": 11.88,
    "description": "Day Maximum Screen Air Temperature",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "UpperBoundMaxTemp": {
    "value": 14.1,
    "description": "Upper Bound on Day Maximum Screen Air Temperature",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "LowerBoundMaxTemp": {
    "value": 10.98,
    "description": "Lower Bound on Day Maximum Screen Air Temperature",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "MaxFeelsLikeTemp": {
    "value": 7.57,
    "description": "Day Maximum Feels Like Air Temperature",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "UpperBoundMaxFeelsLikeTemp": {
    "value": 10.88,
    "description": "Upper Bound on Day Maximum Feels Like Air Temperature",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "LowerBoundMaxFeelsLikeTemp": {
    "value": 7.57,
    "description": "Lower Bound on Day Maximum Feels Like Air Temperature",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "ProbabilityOfPrecipitation": {
    "value": 93,
    "description": "Probability of Precipitation During The Day",
    "unit_name": "percentage",
    "unit_symbol": "%"
  },
  "ProbabilityOfSnow": {
    "value": 0,
    "description": "Probability of Snow During The Day",
    "unit_name": "percentage",
    "unit_symbol": "%"
  },
  "ProbabilityOfHeavySnow": {
    "value": 0,
    "description": "Probability of Heavy Snow During The Day",
    "unit_name": "percentage",
    "unit_symbol": "%"
  },
  "ProbabilityOfRain": {
    "value": 93,
    "description": "Probability of Rain During The Day",
    "unit_name": "percentage",
    "unit_symbol": "%"
  },
  "ProbabilityOfHeavyRain": {
    "value": 92,
    "description": "Probability of Heavy Rain During The Day",
    "unit_name": "percentage",
    "unit_symbol": "%"
  },
  "ProbabilityOfHail": {
    "value": 19,
    "description": "Probability of Hail During The Day",
    "unit_name": "percentage",
    "unit_symbol": "%"
  },
  "ProbabilityOfSferics": {
    "value": 9,
    "description": "Probability of Sferics During The Day",
    "unit_name": "percentage",
    "unit_symbol": "%"
  }
}

Here's hourly:

{
  "time": datetime.datetime(2024, 11, 23, 11, 0, tzinfo=datetime.timezone.utc),
  "screenTemperature": {
    "value": 9.0,
    "description": "Screen Air Temperature",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "maxScreenAirTemp": {
    "value": 9.28,
    "description": "Maximum Screen Air Temperature Over Previous Hour",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "minScreenAirTemp": {
    "value": 8.99,
    "description": "Minimum Screen Air Temperature Over Previous Hour",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "screenDewPointTemperature": {
    "value": 6.99,
    "description": "Screen Dew Point Temperature",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "feelsLikeTemperature": {
    "value": 5.68,
    "description": "Feels Like Temperature",
    "unit_name": "degrees Celsius",
    "unit_symbol": "Cel"
  },
  "windSpeed10m": {
    "value": 6.9,
    "description": "10m Wind Speed",
    "unit_name": "metres per second",
    "unit_symbol": "m/s"
  },
  "windDirectionFrom10m": {
    "value": 185,
    "description": "10m Wind From Direction",
    "unit_name": "degrees",
    "unit_symbol": "deg"
  },
  "windGustSpeed10m": {
    "value": 14.09,
    "description": "10m Wind Gust Speed",
    "unit_name": "metres per second",
    "unit_symbol": "m/s"
  },
  "max10mWindGust": {
    "value": 16.41,
    "description": "Maximum 10m Wind Gust Speed Over Previous Hour",
    "unit_name": "metres per second",
    "unit_symbol": "m/s"
  },
  "visibility": {
    "value": 10931,
    "description": "Visibility",
    "unit_name": "metres",
    "unit_symbol": "m"
  },
  "screenRelativeHumidity": {
    "value": 87.37,
    "description": "Screen Relative Humidity",
    "unit_name": "percentage",
    "unit_symbol": "%"
  },
  "mslp": {
    "value": 100240,
    "description": "Mean Sea Level Pressure",
    "unit_name": "pascals",
    "unit_symbol": "Pa"
  },
  "uvIndex": {
    "value": 1,
    "description": "UV Index",
    "unit_name": "dimensionless",
    "unit_symbol": "1"
  },
  "significantWeatherCode": {
    "value": "Drizzle",
    "description": "Significant Weather Code",
    "unit_name": "dimensionless",
    "unit_symbol": "1"
  },
  "precipitationRate": {
    "value": 0.1,
    "description": "Precipitation Rate",
    "unit_name": "millimetres per hour",
    "unit_symbol": "mm/h"
  },
  "totalPrecipAmount": {
    "value": 0.08,
    "description": "Total Precipitation Amount Over Previous Hour",
    "unit_name": "millimetres",
    "unit_symbol": "mm"
  },
  "totalSnowAmount": {
    "value": 0,
    "description": "Total Snow Amount Over Previous Hour",
    "unit_name": "millimetres",
    "unit_symbol": "mm"
  },
  "probOfPrecipitation": {
    "value": 38,
    "description": "Probability of Precipitation",
    "unit_name": "percentage",
    "unit_symbol": "%"
  }
}

avee87 avatar Nov 23 '24 11:11 avee87

Hi,

The camelCase vs. TitleCase issue is a bug in my code which I can fix. In the 'daily' data from DataHub the names are prefixed with one of 'day', 'midday', 'night', 'midnight' which I stripped out to at least make the names consistent inside the daily forecast, but didn't correct the casing afterwards.
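
The prefix stripping and casing fix described above can be sketched as follows. This is a minimal illustration, not the library's actual code: the helper name `normalise_name` and the `PREFIXES` tuple are hypothetical, and the rule of only stripping a prefix that is followed by an uppercase letter or a digit is an assumption made to avoid mangling unrelated names.

```python
# Hypothetical helper, not part of datapoint-python.
PREFIXES = ("midnight", "midday", "night", "day")


def normalise_name(name: str) -> str:
    """Strip a leading time-of-day prefix, then lower-case the first character."""
    for prefix in PREFIXES:
        rest = name[len(prefix):]
        # Assumption: a real prefix is followed by an uppercase letter or a digit.
        if name.startswith(prefix) and (rest[:1].isupper() or rest[:1].isdigit()):
            name = rest
            break
    # Names that start with a digit (e.g. "10MWindSpeed") are left unchanged here.
    return name[:1].lower() + name[1:]


print(normalise_name("dayMaxScreenTemperature"))  # maxScreenTemperature
print(normalise_name("middayVisibility"))         # visibility
print(normalise_name("midday10MWindSpeed"))       # 10MWindSpeed
```

Skipping the final lower-casing step is what would leave names such as `Visibility` and `Mslp` in TitleCase.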

For the weather codes I had assumed it would be more useful to provide the string rather than the numeric, and the inconsistency is a bug. I can add an option to convert / not convert the numeric value - if you are getting the numeric value I assume you are storing the mapping somewhere else?
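
An option to convert or not convert the code could look roughly like this. It is a sketch under assumptions: `weather_value`, its `convert` parameter, and `WEATHER_CODES` are hypothetical names, and the table holds only a small subset of the Met Office significant weather codes, included for illustration.

```python
# Partial, illustrative table; the full list of significant weather codes is longer.
WEATHER_CODES = {
    7: "Cloudy",
    11: "Drizzle",
    12: "Light rain",
}


def weather_value(code: int, convert: bool = True):
    """Return the text description when convert is True, else the raw code."""
    if not convert:
        return code
    # Assumption: codes missing from the table are passed through unchanged.
    return WEATHER_CODES.get(code, code)


print(weather_value(12))                 # Light rain
print(weather_value(12, convert=False))  # 12
```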

The differences between the daily and hourly names are because the names are different in the data returned from DataHub and there isn't always a mapping between them - for instance in hourly there isn't UpperBoundMaxFeelsLikeTemp. I get that wind speed being windSpeed10m in hourly and 10MWindSpeed in daily is a pain though! I could apply a mapping for that sort of thing, but otherwise the differences are due to what is provided from DataHub.
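
For illustration, a mapping for "that sort of thing" might be a plain lookup table applied to each time step. The pairs below are taken from the two samples earlier in this issue; the names `DAILY_TO_HOURLY` and `rename_keys` are hypothetical, and keys without a counterpart (such as `UpperBoundMaxFeelsLikeTemp`) are simply left as they are.

```python
# Daily name (as in the sample above) -> hourly name. Illustrative subset only.
DAILY_TO_HOURLY = {
    "10MWindSpeed": "windSpeed10m",
    "10MWindDirection": "windDirectionFrom10m",
    "10MWindGust": "windGustSpeed10m",
    "Visibility": "visibility",
    "Mslp": "mslp",
}


def rename_keys(step: dict, mapping: dict) -> dict:
    """Return a copy of step with mapped keys renamed and all others untouched."""
    return {mapping.get(key, key): value for key, value in step.items()}


daily_step = {"10MWindSpeed": {"value": 8.13}, "UpperBoundMaxFeelsLikeTemp": {"value": 10.88}}
print(rename_keys(daily_step, DAILY_TO_HOURLY))
```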

Perseudonymous avatar Nov 23 '24 16:11 Perseudonymous

I've released version 0.11.0 which fixes the camelCase inconsistency and adds an argument to control whether the significant weather code is mapped from a number to a string description.

Let me know what you think about mapping between the daily and hourly names.

Perseudonymous avatar Nov 26 '24 19:11 Perseudonymous

Thank you! This is much better now that I don't need to handle the condition differently depending on forecast type. I also discovered that day and night forecasts have slightly different formats, e.g. "upperBoundMaxTemp" vs "upperBoundMinTemp"...

Here's my Home Assistant PR by the way: https://github.com/home-assistant/core/pull/131425

avee87 avatar Nov 26 '24 21:11 avee87

The format difference for day and night is a symptom of the data returned from DataHub combined with my decision to split each day into 12 hour time steps. I'm afraid I can't now remember why I decided that, but I suspect it was to try and match previous behaviour. Arguably that was a mistake, as the DataHub API returns only one time step for each day, and in that it provides a range for the daytime max temperature and the nighttime min temperature. I've included an example API response for a single day below to illustrate this. I stripped out the prefixes to make things (I thought) a bit simpler, but I think now I may have just made things more complex.

If it would make your life easier I can change the library to provide one time step per day, and leave in the 'midday', 'day' etc prefixes. I'm open to any other suggestions as well!

{
  "time": "2024-02-16T00:00Z",
  "midday10MWindSpeed": 5.04,
  "midnight10MWindSpeed": 1.39,
  "midday10MWindDirection": 273,
  "midnight10MWindDirection": 243,
  "midday10MWindGust": 8.75,
  "midnight10MWindGust": 7.2,
  "middayVisibility": 28772,
  "midnightVisibility": 27712,
  "middayRelativeHumidity": 75.21,
  "midnightRelativeHumidity": 80.91,
  "middayMslp": 101680,
  "midnightMslp": 102640,
  "nightSignificantWeatherCode": 7,
  "dayMaxScreenTemperature": 12.82,
  "nightMinScreenTemperature": 5.32,
  "dayUpperBoundMaxTemp": 14.1,
  "nightUpperBoundMinTemp": 9.17,
  "dayLowerBoundMaxTemp": 11.97,
  "nightLowerBoundMinTemp": 3.56,
  "nightMinFeelsLikeTemp": 6.27,
  "dayUpperBoundMaxFeelsLikeTemp": 12.47,
  "nightUpperBoundMinFeelsLikeTemp": 8.74,
  "dayLowerBoundMaxFeelsLikeTemp": 10.01,
  "nightLowerBoundMinFeelsLikeTemp": 2.75,
  "nightProbabilityOfPrecipitation": 11,
  "nightProbabilityOfSnow": 0,
  "nightProbabilityOfHeavySnow": 0,
  "nightProbabilityOfRain": 10,
  "nightProbabilityOfHeavyRain": 0,
  "nightProbabilityOfHail": 0,
  "nightProbabilityOfSferics": 0
}

Perseudonymous avatar Nov 26 '24 21:11 Perseudonymous

Home Assistant supports hourly, daily and twice-daily forecasts. Currently, since the library returns 2 forecasts per day, I implemented it as twice-daily.

Maybe we could have all 3 options here and let the user decide which one to use? I.e., twice-daily and daily would use the same data, just packaged differently to return either 2 or 1 forecasts per day.

avee87 avatar Nov 26 '24 22:11 avee87

I can change the 'daily' format to have one forecast per day, and there I'll leave the names the same as they are from the DataHub API.

For twice-daily there is still the issue that the day and night data from the API is different. My inclination here is to split the data as I currently do, but leave the names as they are from the API so it is more clear why they are different.
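
As a rough sketch of that split, one daily time step from the API can be divided into a day half and a night half while keeping the API names. The function `split_day_night` is hypothetical, both halves simply reuse the original `time` value, and sending unprefixed keys to the day half is an assumption rather than the library's documented behaviour.

```python
def split_day_night(step: dict) -> tuple[dict, dict]:
    """Split one daily time step into (day, night) dicts, keeping the API names."""
    day = {"time": step["time"]}
    night = {"time": step["time"]}
    for key, value in step.items():
        if key == "time":
            continue
        if key.startswith(("midnight", "night")):
            night[key] = value
        else:
            # "midday*" and "day*" keys; unprefixed keys also land here (assumption).
            day[key] = value
    return day, night


step = {
    "time": "2024-02-16T00:00Z",
    "middayVisibility": 28772,
    "midnightVisibility": 27712,
    "dayMaxScreenTemperature": 12.82,
    "nightMinScreenTemperature": 5.32,
}
day, night = split_day_night(step)
print(day)
print(night)
```

Because the names are untouched, a consumer can tell from the keys alone whether a time step is the day or the night half.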

Perseudonymous avatar Nov 27 '24 18:11 Perseudonymous

I think we can deal with different day/night formats in twice-daily forecasts. If anything, it helps differentiate between day and night timesteps.

avee87 avatar Nov 27 '24 18:11 avee87

I've just released version 0.12.0 with those changes in.

Perseudonymous avatar Nov 27 '24 21:11 Perseudonymous

"twice-daily" option is currently broken in 0.12.0 - it tries to call a non-existing "twice-daily" endpoint: https://github.com/EJEP/datapoint-python/blob/master/src/datapoint/Manager.py#L181 It should call "daily" endpoint instead and then just parse data differently.

avee87 avatar Nov 27 '24 23:11 avee87
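
The fix suggested in the comment above amounts to separating the user-facing frequency from the endpoint that is requested. The sketch below is only an illustration of that idea: `ENDPOINT_FOR_FREQUENCY` and `endpoint_for` are hypothetical names and do not reflect the contents of Manager.py.

```python
# User-facing frequency -> endpoint name to request (hypothetical lookup table).
ENDPOINT_FOR_FREQUENCY = {
    "hourly": "hourly",
    "daily": "daily",
    "twice-daily": "daily",  # same data as "daily", only parsed differently
}


def endpoint_for(frequency: str) -> str:
    """Return the endpoint name to call for a requested forecast frequency."""
    try:
        return ENDPOINT_FOR_FREQUENCY[frequency]
    except KeyError:
        raise ValueError(f"Unsupported frequency: {frequency!r}") from None


print(endpoint_for("twice-daily"))  # daily
```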

A rookie error on my part, I've fixed it in 0.12.1.

Perseudonymous avatar Nov 28 '24 12:11 Perseudonymous

Can you please release 0.12.1 on PyPI?

avee87 avatar Nov 28 '24 22:11 avee87

Argh, I forgot I need to press a button. It's released there now.

Perseudonymous avatar Nov 28 '24 22:11 Perseudonymous