Weed-AI

Should we allow custom categories?

Open jnothman opened this issue 4 years ago • 4 comments

For example, we could allow for categories to be arbitrarily defined within a dataset, e.g.

{
    "name": "Grasses and Sedges", /* custom name, unique within a dataset */
    "role": "weed",
    "included_taxa": [
        {"name": "poaceae", "eppo": "1GRAF"},
        {"name": "cyperus", "eppo": "1CYPG"},
    ],
    "annotation_specification": "The annotations of these species should fully include their growing point, and not any leaves visibly protruding from that growing point.",
}

included_taxa would make it easier to share datasets with practical granularity of annotation, without trying to fit a square peg into a round hole. It makes it a little harder to use datasets, and would make it harder to combine multiple datasets. Instead of a 1-to-1 category mapping UI, we would need a more sophisticated UI for defining rich categories.

We could also consider excepted_taxa or excluded_taxa when we know some species are ignored or annotated under a different category.
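As a rough sketch of how included_taxa and excluded_taxa might be evaluated together: the class names, the `covers` method and the lineage lists below are assumptions for illustration, not part of any existing Weed-AI schema. A lineage is assumed to be the list of EPPO codes from a taxon up through its ancestors, and the `SPECIES_A` / `GENUS_A` codes are placeholders rather than real EPPO codes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Taxon:
    name: str
    eppo: str


@dataclass
class Category:
    """Hypothetical rich category, mirroring the JSON proposal above."""
    name: str
    role: str
    included_taxa: list[Taxon]
    excluded_taxa: list[Taxon] = field(default_factory=list)
    annotation_specification: str = ""

    def covers(self, lineage: list[str]) -> bool:
        """Return True if a taxon with this lineage falls in the category.

        `lineage` holds EPPO codes from the taxon itself up to its ancestors.
        Exclusions take precedence over inclusions.
        """
        codes = set(lineage)
        if any(t.eppo in codes for t in self.excluded_taxa):
            return False
        return any(t.eppo in codes for t in self.included_taxa)


grasses_and_sedges = Category(
    name="Grasses and Sedges",
    role="weed",
    included_taxa=[Taxon("poaceae", "1GRAF"), Taxon("cyperus", "1CYPG")],
    # Placeholder exclusion: a species annotated under a different category.
    excluded_taxa=[Taxon("placeholder species", "SPECIES_B")],
)

# Placeholder lineages: a grass species, and the excluded grass species.
in_category = grasses_and_sedges.covers(["SPECIES_A", "GENUS_A", "1GRAF"])
excluded = grasses_and_sedges.covers(["SPECIES_B", "GENUS_A", "1GRAF"])
```

Letting exclusions win over inclusions is one possible design choice; it keeps "all grasses except X" expressible without listing every included species.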

How do we represent "broadleaves"?

annotation_specification would help us describe the intentions during annotation, particularly when broadleaves and grasses get different treatment for segmentation.

jnothman avatar Dec 05 '21 22:12 jnothman

I both like and dislike this. I dislike it mostly because it breaks the standards of the platform and isn't very granular. But I think it better addresses the reality of what people are likely to annotate, and what we've seen annotated, and it adds useful flexibility to the site.

This approach would be acceptable only if it's possible to group by the hierarchy listed in included_taxa, as you've described. The custom name is fairly arbitrary, though we could have naming standards for combined datasets. Alternatively, could standardised group names be generated automatically when included_taxa are provided? I.e. we provide a dictionary mapping high-level family names to accepted groups.
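A minimal sketch of that dictionary idea, assuming a hand-maintained mapping from EPPO codes to accepted group names. The mapping contents, the `standard_group_name` helper and the `+` separator are all hypothetical; only the two EPPO codes come from the example in the issue.

```python
# Hypothetical mapping from high-level taxon EPPO codes to accepted group names.
GROUP_BY_EPPO = {
    "1GRAF": "grasses",
    "1CYPG": "sedges",
}


def standard_group_name(included_taxa):
    """Build a standardised class name from a category's included_taxa.

    Taxa without an accepted group fall back to their own name, so the
    result is deterministic regardless of the order taxa are listed in.
    """
    groups = {GROUP_BY_EPPO.get(t["eppo"], t["name"]) for t in included_taxa}
    return "+".join(sorted(groups))


name = standard_group_name(
    [
        {"name": "poaceae", "eppo": "1GRAF"},
        {"name": "cyperus", "eppo": "1CYPG"},
    ]
)
# name == "grasses+sedges"
```

With something like this, the custom name could stay free-form for display while the generated name is used when combining datasets.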

The broadleaves class is an interesting one. They are more technically known as forbs (defined as non-graminoid plants) or dicots, I guess. Wikipedia provides a list of orders/families, so it would be possible to group them automatically based on the taxonomic hierarchy.

annotation_specification is an important addition.

geezacoleman avatar Dec 06 '21 18:12 geezacoleman

Plant parts could also come into a richer category spec.

But I'm not sure how we'd display or search these categories on the explore page...

jnothman avatar Dec 06 '21 20:12 jnothman

I will admit that not knowing how to display/search this is not a good reason to not "address the reality of what people are likely to annotate".

jnothman avatar Dec 06 '21 21:12 jnothman

We can also note instance vs semantic segmentation.

jnothman avatar Dec 22 '21 23:12 jnothman