Data source parser/reader

Open yoavnash opened this issue 4 years ago • 9 comments

So far we have had wrappers for simulations and for data sinks (to store data). We still don't have a wrapper for extracting data from an existing data repository. Such a wrapper would be useful for demos and various applications.

Suggestions:

  • NOMAD Wrapper
  • OPTIMADE wrapper
  • Standard data formats, for example STL for geometrical representation.

Could be related to #267 and #290

yoavnash avatar Apr 28 '21 16:04 yoavnash

Do we just want to extract data? If so, I would not use a wrapper: both simulations and data storage can be used for communication in both directions, which I think is part of what makes wrappers wrappers.

pablo-de-andres avatar Apr 28 '21 16:04 pablo-de-andres

Good point. It can still be beneficial, though, to have a "read-only" wrapper for direct access to data in an OSP-core script.

yoavnash avatar Apr 28 '21 16:04 yoavnash

I think I would define another component at the same level as wrappers: a parser/reader, if you will. And we could divide these between data repos, Excel files and other formats...

pablo-de-andres avatar Apr 28 '21 16:04 pablo-de-andres

That sounds indeed better. Great suggestion.

yoavnash avatar Apr 28 '21 16:04 yoavnash

It might make for a nice symmetric design if we think of readers, writers and wrappers

pablo-de-andres avatar Apr 29 '21 11:04 pablo-de-andres

It might make for a nice symmetric design if we think of readers, writers and wrappers

Definitely agree.

In the case of the GMSHWrapper, I separated the engine (which takes the role of reader and writer) from the actual wrapper, using two different classes.

In my experience with this module from the FORCE project, if a syntactic third-party tool is already available, the reader and writer part might be quite easy. The semantic interpretation through a wrapper and the related ontology is much more time-consuming, as we all know...
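
To make the engine/wrapper split concrete, here is a minimal sketch that uses ASCII STL (one of the formats suggested at the top of the issue) rather than GMSH. It is not the GMSHWrapper code: `AsciiStlEngine`, `StlReader` and `Entity` are hypothetical names, and `Entity` is only a stand-in for an ontology individual, not an OSP-core class. The engine handles syntax only; the reader adds the (here trivial) semantic mapping.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Triangle = Tuple[Vertex, Vertex, Vertex]


class AsciiStlEngine:
    """Syntactic layer: converts between ASCII STL text and plain triangles."""

    def read(self, text: str) -> List[Triangle]:
        # Collect the coordinates of every "vertex x y z" line.
        vertices = [
            tuple(float(x) for x in line.split()[1:4])
            for line in text.splitlines()
            if line.strip().startswith("vertex")
        ]
        # Every three consecutive vertices form one facet.
        return [tuple(vertices[i:i + 3]) for i in range(0, len(vertices) - 2, 3)]

    def write(self, triangles: List[Triangle], name: str = "solid") -> str:
        lines = [f"solid {name}"]
        for tri in triangles:
            # A zero normal is written as a placeholder; it is not computed here.
            lines.append("facet normal 0 0 0")
            lines.append("outer loop")
            lines.extend(f"vertex {x} {y} {z}" for x, y, z in tri)
            lines.append("endloop")
            lines.append("endfacet")
        lines.append(f"endsolid {name}")
        return "\n".join(lines)


@dataclass
class Entity:
    """Stand-in for an ontology individual (hypothetical, not an OSP-core class)."""
    concept: str
    attributes: dict


class StlReader:
    """Semantic layer: maps parsed triangles onto (hypothetical) concepts."""

    def __init__(self, engine: AsciiStlEngine):
        self.engine = engine

    def load(self, text: str) -> List[Entity]:
        return [Entity("Triangle", {"vertices": tri}) for tri in self.engine.read(text)]
```

In this layout a different file format would only need a new engine, while the ontology mapping in the reader stays the part that takes real effort.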

MBueschelberger avatar May 26 '21 10:05 MBueschelberger

Could be related to mapping data from tables to RDF. Related technologies:

  • OntoRefine: https://openrefine.org/ , https://graphdb.ontotext.com/documentation/free/loading-data-using-ontorefine.html
  • YARRRML
  • SPARQL-Integrate: https://github.com/SmartDataAnalytics/RdfProcessingToolkit/blob/master/README-SI.md
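
As a rough illustration of what "mapping data from tables to RDF" means, the following sketch turns CSV rows into N-Triples lines using only the Python standard library. It does not use any of the tools listed above; the `http://example.org/` namespace, the function name `table_to_ntriples` and the one-predicate-per-column scheme are assumptions made for the example, and column names are assumed to be valid IRI fragments.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def table_to_ntriples(csv_text, id_column):
    """Emit one N-Triples line per non-empty cell, keyed by the id column."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}row/{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or value in (None, ""):
                continue
            # Escape backslashes and double quotes inside the literal.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} <{BASE}{column}> "{literal}" .')
    return triples


if __name__ == "__main__":
    table = "id,material,density\n1,steel,7850\n"
    for line in table_to_ntriples(table, "id"):
        print(line)
```

Every value becomes a plain string literal here; assigning datatypes and linking columns to real ontology terms is exactly the semantic work that the dedicated mapping tools are meant to help with.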

yoavnash avatar May 31 '21 17:05 yoavnash

I am also labelling this as a breaking change, as we might want to introduce a new API or make changes to the wrapper API for this.

kysrpex avatar Jun 09 '21 06:06 kysrpex

Related #651.

kysrpex avatar Jun 09 '21 09:06 kysrpex