The InBetween relation for xsd year, month, etc.

Open pietercolpaert opened this issue 1 year ago • 5 comments

This proposal covers something I’ve already seen used in a tree:Collection and that could make sense to specify properly: a new kind of relation, tree:InBetweenRelation, to compare more specific time-based XSD literals.

Mind that SPARQL explicitly does not allow comparing literals of different XSD datatypes with each other.

However, we could imagine describing that a member with xsd:dateTime literals falls within the year 2024 as follows:

ex:R1 a tree:InBetweenRelation;
             tree:node   <?year=2024>;
             tree:path   prov:generatedAtTime;
             tree:value  "2024+00:00"^^xsd:gYear .

Timezones are tricky to process, though: we would need to make the timezone mandatory. If it’s not set, we would need to interpret the literal as anywhere on Earth (this can be a documented fallback), which means that, for example, the years 2023 and 2024 would overlap by half a day.

Brought up again by @smessie

Related issue: #82 /cc @xdxxxdx

What we’d need: a full list of all possible datatypes that need to be supported and the mapping between how to compare them.

pietercolpaert avatar Sep 17 '24 12:09 pietercolpaert

What we’d need: a full list of all possible datatypes that need to be supported and the mapping between how to compare them.

  • xsd:gYear maps to
    • ≥ {year}-01-01T00:00:00
    • < {year+1}-01-01T00:00:00
  • xsd:gYearMonth maps to
    • ≥ {year}-{month}-01T00:00:00
    • < {year}-{month+1}-01T00:00:00 (rolling over to January of {year+1} after December)
  • xsd:date maps to
    • ≥ {year}-{month}-{day}T00:00:00
    • < {year}-{month}-{day+1}T00:00:00 (rolling over to the next month or year after the last day)
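
The mapping above can be sketched in Python as half-open intervals (inclusive lower bound, exclusive upper bound). This is only an illustration: the function names are hypothetical, timezones are ignored here, and the rollover of `{month+1}` and `{day+1}` is handled explicitly.

```python
from datetime import datetime, timedelta


def g_year_bounds(year: int) -> tuple[datetime, datetime]:
    """xsd:gYear -> [start of the year, start of the next year)."""
    return datetime(year, 1, 1), datetime(year + 1, 1, 1)


def g_year_month_bounds(year: int, month: int) -> tuple[datetime, datetime]:
    """xsd:gYearMonth -> [start of the month, start of the next month)."""
    start = datetime(year, month, 1)
    # December rolls over into January of the following year.
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    return start, end


def date_bounds(year: int, month: int, day: int) -> tuple[datetime, datetime]:
    """xsd:date -> [midnight of that day, midnight of the next day)."""
    start = datetime(year, month, day)
    # Adding one day lets the standard library handle month and year rollover.
    return start, start + timedelta(days=1)
```

A member then matches when `lower <= value < upper`, so adjacent years, months, or days never share an instant.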

An important note on timezones:

  • When casting an xsd:gYear to a period, we must assume the worst-case bounds when no timezone is set. As data on the web can come from various sources, we cannot assume a specific timezone.
  • The TREE hypermedia spec needs to make clear that, for the relations to be as useful as possible, servers should set timezones in their time-related literals.
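
As a rough sketch of what "worst-case bounds" could mean for a year without a timezone: widen the UTC interval by the largest possible offset on both sides. The ±14:00 default is an assumption taken from the range of timezone offsets XSD allows, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the largest timezone offset considered, in either direction.
MAX_OFFSET = timedelta(hours=14)


def worst_case_year_bounds(
    year: int, max_offset: timedelta = MAX_OFFSET
) -> tuple[datetime, datetime]:
    """Widest [start, end) in UTC that a timezone-less xsd:gYear could cover."""
    start_utc = datetime(year, 1, 1, tzinfo=timezone.utc)
    end_utc = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    # The year starts earliest in the zone furthest ahead of UTC,
    # and ends latest in the zone furthest behind UTC.
    return start_utc - max_offset, end_utc + max_offset
```

Under this assumption the windows for 2023 and 2024 overlap by 28 hours; a narrower offset range gives a smaller overlap.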

smessie avatar Sep 17 '24 13:09 smessie

Relevant SPARQL enhancement proposal: https://github.com/w3c/sparql-dev/blob/main/SEP/SEP-0002/sep-0002.md

As far as I can see, however, it does not propose how to compare a year to a date with and without timezone information, although there is an open discussion on this: https://github.com/w3c/sparql-query/issues/116

I believe the rules should be: if there is no timezone set, the literal becomes a period with the worst-case bounds. You can only say that something is before or after (< or >) something else if the two do not overlap. You also cannot assert equality between a dateTime and a period. In order to assert that something is contained within a period, another operator will be necessary.
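
To make these rules concrete, here is a minimal Python sketch. The `Period` type and the function names are hypothetical, and periods are assumed to be half-open intervals of timezone-aware instants. An ordering is returned only when there is no overlap, equality is never asserted, and containment is a separate operation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Period:
    """Half-open interval [start, end) of timezone-aware instants."""
    start: datetime
    end: datetime


def compare_periods(a: Period, b: Period) -> Optional[int]:
    """Return -1 if a lies entirely before b, 1 if entirely after, else None."""
    if a.end <= b.start:
        return -1
    if b.end <= a.start:
        return 1
    return None  # overlapping periods cannot be ordered


def compare_instant(t: datetime, p: Period) -> Optional[int]:
    """Return -1 if t is before p, 1 if t is after p, else None (never 0)."""
    if t < p.start:
        return -1
    if t >= p.end:
        return 1
    return None  # t falls inside p: not equal, use contains() instead


def contains(p: Period, t: datetime) -> bool:
    """The separate operator: is the instant t inside the period p?"""
    return p.start <= t < p.end
```

Returning `None` rather than `0` for the overlapping case keeps "cannot be ordered" distinct from "equal", which is the distinction the rules above ask for.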

pietercolpaert avatar Oct 18 '24 08:10 pietercolpaert

Briefly discussed this during the 23rd TREE CG meeting today:

@constraintAutomaton commented that this does not add any functionality and only adds syntactic sugar for something that is already possible, as the use cases can be achieved by describing two relations with the bounds.
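
For illustration, the two-relation alternative for the year 2024 could be generated as follows. This is a hedged sketch: the helper name is hypothetical, prefix declarations are omitted from the output, the explicit `Z` timezone is an assumption, and it relies on a client treating both relations to the same node as conditions that must hold together.

```python
def year_as_two_relations(year: int, node: str, path: str) -> str:
    """Render a calendar year as a >= relation and a < relation in Turtle."""
    # Assumption: bounds are expressed in UTC ("Z").
    lower = f"{year:04d}-01-01T00:00:00Z"
    upper = f"{year + 1:04d}-01-01T00:00:00Z"
    lines = [
        "<> tree:relation [",
        "    a tree:GreaterThanOrEqualToRelation ;",
        f"    tree:node <{node}> ;",
        f"    tree:path {path} ;",
        f'    tree:value "{lower}"^^xsd:dateTime',
        "  ], [",
        "    a tree:LessThanRelation ;",
        f"    tree:node <{node}> ;",
        f"    tree:path {path} ;",
        f'    tree:value "{upper}"^^xsd:dateTime',
        "  ] .",
    ]
    return "\n".join(lines)


print(year_as_two_relations(2024, "?year=2024", "prov:generatedAtTime"))
```

Both relations compare xsd:dateTime values only, so no cross-datatype comparison rules are needed.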

I then commented that I agreed, as specifying the exact implementation details would probably get tedious: they are not even standardized in SPARQL and/or XPath yet.

@constraintAutomaton: Also, these relations are for machines, not for humans, so just having another more readable way to say something might not be the best idea.

pietercolpaert avatar Nov 06 '24 13:11 pietercolpaert

Hi @pietercolpaert, @constraintAutomaton, I agree the tree:InBetweenRelation might be redundant, as it only applies a time granularity to what has already been defined by a specific time type. A good example would be paintings or antiques, where an exact time granularity is often absent. For instance, an archaeological finding dated between 210 B.C. and 52 B.C. cannot be accurately represented with existing time granularities, as it doesn't fit neatly into a decade, century, or other defined unit; it's simply a certain period of time. Using two precise timestamps would seem more logical in such cases.

xdxxxdx avatar Nov 06 '24 13:11 xdxxxdx

So the consensus here, I think, is to remove the tree:InBetweenRelation from the vocabulary to avoid confusion: it should not be used at all (it was already no longer mentioned in the spec, so there was no client behaviour attached to it).

pietercolpaert avatar Apr 12 '25 10:04 pietercolpaert