Query SOAR metadata

Open ebuchlin opened this issue 3 years ago • 19 comments

Describe the feature

My understanding is that sunpy-soar currently only supports queries by instrument / time / level / product, as this is basically what is available in the SOAR web query form and in the v_sc_data_item and v_ll_data_item tables. However, the user should also be able to do queries with different metadata (other Fido attributes).

Proposed solution

The list of all tables and their columns is available from SOAR with TAP. I attach a human-readable version (tree by schema / table / column), generated by XSLT with this XSL stylesheet.

This shows that more complete metadata are available in SOAR, in instrument-specific tables, e.g. v_spi_sc_fits. Example query: http://soar.esac.esa.int/soar-sl-tap/tap//sync?REQUEST=doQuery&LANG=ADQL&FORMAT=json&QUERY=SELECT+TOP+10+%2A+FROM+v_spi_sc_fits
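The example query above can also be assembled programmatically. This is a minimal sketch using only the Python standard library. The helper name `build_tap_url` is hypothetical, and the endpoint is the one from the example URL, written here with a single slash before `sync` (an assumption about the canonical path).

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above; the single slash before
# "sync" is an assumption about the canonical form of the path.
TAP_SYNC_URL = "http://soar.esac.esa.int/soar-sl-tap/tap/sync"


def build_tap_url(adql, fmt="json"):
    """Return a synchronous TAP request URL for an ADQL query string.

    `build_tap_url` is a hypothetical helper, not part of sunpy-soar.
    """
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "FORMAT": fmt, "QUERY": adql}
    # urlencode percent-encodes the query (spaces become "+", "*" becomes "%2A").
    return f"{TAP_SYNC_URL}?{urlencode(params)}"


url = build_tap_url("SELECT TOP 10 * FROM v_spi_sc_fits")
print(url)
```

Fetching the resulting URL (for instance with `urllib.request.urlopen`) needs network access and is left out of the sketch.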

Fido attributes should be linked to columns in these different instrument-specific tables. For a query with multiple instruments, multiple tables should be queried... (or should this not be supported?). Metadata for low-latency (LL) files can also be queried, from yet other tables.

ebuchlin avatar Aug 24 '22 15:08 ebuchlin

There is a draft documentation for the tables, views, and columns of the SOAR TAP interface: https://www.cosmos.esa.int/web/soar/tables-views-and-columns

ebuchlin avatar Feb 05 '23 11:02 ebuchlin

There are now new columns soop_name and soop_type available in the SOAR TAP interface.

ebuchlin avatar May 10 '23 15:05 ebuchlin

Is this covered by #84?

wtbarnes avatar May 16 '23 17:05 wtbarnes

For queries by SOOP name it seems that #84 covers it, yes.

ebuchlin avatar May 17 '23 09:05 ebuchlin

@ebuchlin sorry for the lack of traffic on this issue. I definitely agree we should be supporting more complex queries against the SOAR through Fido, but I'm a bit confused as to the scope. Looking at the docs you linked above, it is not quite clear to me what attributes should be supported through the attrs interface. Could you provide an example of what a Fido query would look like with these additional metadata?

The example from @hayesla in #66 makes it a bit more clear to me, but again the issue is what subset of that metadata we should support. I don't think it is practical to try and translate each bit of SOAR metadata to a Fido attribute. However, maybe there could be some sort of interface to specifying these filters as strings, similar to what we allow with JSOC keywords in the sunpy.net.jsoc.attrs.

wtbarnes avatar May 18 '23 14:05 wtbarnes

This is a generic issue meant to point out that the SOAR TAP interface offers more possibilities than the ones initially used by sunpy-soar (the details of the TAP interface were undocumented at that time). Now that we have some documentation and that queries by SOOP name have been implemented, we can be more specific about other potentially useful attributes, starting from existing sunpy.net ones:

  • Detector: from the v_<instrument>_<ll/sc>_fits tables, column detector. Partially overlaps the use cases for a.soar.Product.
  • Wavelength: from the v_<instrument>_<ll/sc>_fits tables, column wavelength
    • SPICE window wavelengths (one range per window) are not all accessible through SOAR; I think that only the first window is, unless the full list is in the undocumented v_<instrument>_<sc/ll>_extension_fits tables for <instrument> = SPICE (just a guess).
    • STIX has energy bands instead
  • Resolution: for AIA and HMI, this is a factor from the highest resolution. There is some information in the cdelt[n], total_binning_factor and binning_factor columns (in one row per dimension), but I am not sure how to combine this into something meaningful and consistent with the existing meaning for AIA and HMI. Also, should this be limited to spatial resolution?
  • Phyobs: not in SOAR; could be deduced from a.soar.Product?
  • Extent (as in sunpy.net.vso.attrs) could be useful, but the meaning should be clarified (some overlap with FOV?) and it might be difficult to implement.

An issue is that v_<instrument>_<ll/sc>_fits is actually multiple tables, one per instrument and per data type (low-latency or science), and that there must then be join operations with the v_<ll/sc>_data_item tables.
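The join described above could look roughly like the following sketch, which only builds the ADQL string. The helper `build_join_query` is hypothetical, and the join key `data_item_oid` is an assumption that should be checked against the SOAR table documentation; the table name patterns and the `detector` column come from the list above.

```python
def build_join_query(instrument, detector, level="sc",
                     join_key="data_item_oid", top=10):
    """Build an ADQL query joining a data-item table with an instrument FITS table.

    Hypothetical helper. `join_key` is an assumed column name shared by
    both tables; verify it against the SOAR TAP documentation.
    """
    data_table = f"v_{level}_data_item"                  # e.g. v_sc_data_item
    fits_table = f"v_{instrument.lower()}_{level}_fits"  # e.g. v_eui_sc_fits
    # Values are interpolated without escaping; acceptable for a sketch only.
    return (
        f"SELECT TOP {top} * "
        f"FROM {data_table} AS d "
        f"JOIN {fits_table} AS f ON d.{join_key} = f.{join_key} "
        f"WHERE f.detector = '{detector}'"
    )


query = build_join_query("EUI", "HRI_EUV")
print(query)
```

Because each FITS table is specific to one instrument and one data type, a multi-instrument search would need one such query per instrument, with the results merged afterwards.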

In case one would like to have access to previous versions of the files (instead of only the latest version), the v_<ll/sc>_repository_file tables would also have to be considered.

For a start, we can of course ignore previous versions of files, ignore low-latency observations, and prioritize attributes in the above list. The efforts should also be balanced with those put on access to Solar Orbiter data through VSO as data provider.

ebuchlin avatar May 22 '23 20:05 ebuchlin

For complex SOAR TAP queries, here is a tutorial on TAP queries that we did at IAS; it could provide ideas for how to do some of the queries we would like to be doable using Fido.

ebuchlin avatar Jan 17 '24 11:01 ebuchlin

@ebuchlin we are going to add this as a GSoC project and I have a really rough draft here: https://github.com/OpenAstronomy/openastronomy.github.io/pull/350/files#diff-03a99800468bb348b3741103deee0d442348ced2997c4a20c1aa6479cd7729e9

If you have time, could you review it? And would you be willing to help with the project in an advisory capacity?

nabobalis avatar Jan 18 '24 19:01 nabobalis

I have added a small comment to the GSoC project, and yes I am willing to help.

ebuchlin avatar Jan 19 '24 09:01 ebuchlin

Thanks!

nabobalis avatar Jan 19 '24 18:01 nabobalis

Hey! This issue is part of the GSoC projects. I would like to work on it, and with the organisation in general. I am new to working with open-source projects, but I will try my best to help. Where should I start?

MetaphorC avatar Mar 14 '24 17:03 MetaphorC

Welcome and glad to hear you are interested in contributing.

We recommend that everyone starts by reading https://docs.sunpy.org/en/latest/dev_guide/contents/newcomers.html. This will walk you through getting a development environment set up. When that is complete, the next step is to start tackling some good first issues, which are linked in that guide.

Our GSoC advice is on our Wiki: https://github.com/sunpy/sunpy/wiki/Google-Summer-of-Code

If you have any questions or problems, please do let us know, but we encourage all communication to take place in our public chat room: https://matrix.to/#/#sunpy:openastronomy.org

nabobalis avatar Mar 14 '24 17:03 nabobalis

Understood, I'll go through this tonight and join the chatroom. Looking forward to working with everyone! Thank you!

MetaphorC avatar Mar 14 '24 17:03 MetaphorC

Hi, I saw this enhancement while going through the GSoC projects. Overall, this initiative to find astronomical data is interesting. I went through your metadata and felt that we can query SOAR by many other attributes as well. As I have just completed my data analysis course, I would like to work on this project, and I will come up with an initial draft of the feature design in 1 day. Thanks, Dhruvkumar Patel

Dhruvkumar0463 avatar Mar 29 '24 07:03 Dhruvkumar0463

Hello @Dhruvkumar0463, as I said to MetaphorC above, it would be better to read that guide and follow its links to get set up and familiar with how we do GSoC. I will say, there are 3 days left and that is a tight turnaround.

nabobalis avatar Mar 29 '24 15:03 nabobalis

Following discussion: a good place to start will be to look at adding the Detector attribute for EUI. This will be a good test case to figure out how we plan to join tables, etc.

@ebuchlin and I will think about which attributes users of Solar Orbiter would want to query before the next meeting.

hayesla avatar May 22 '24 15:05 hayesla

Thanks Laura. I will look into this and get back to you.

Dhruvkumar0463 avatar May 22 '24 18:05 Dhruvkumar0463

Hello @ebuchlin and @hayesla - I would just like to comment that if you find the current structure of Tables in SOAR TAP difficult to work with, then please do make suggestions about how we can improve that. :-)

They are currently structured with the internal relational database in mind. But of course, we are always open to the possibility of making more user friendly views to combine data etc. This would help avoid making complex queries with joins which are often slow due to lack of indexes etc on certain columns.

It would be great to capture this kind of feedback which would surely benefit the whole community of SOAR TAP users.

Many thanks, Jonathan Cook (I am using a shared ESDC github account we have)

esdcheliodevops avatar Jun 20 '24 07:06 esdcheliodevops

Hello, here is a new analysis (notebook PDF, notebook source) of what could be done with the following keywords, with the current way they are filled (even when most of them are optional keywords) by the instrument teams and/or the SOAR:

  • sensor could be used for:
    • EUI (values: FSI174, FSI304, HRI, HRI1216, HRI174; detector would be better)
    • Metis (values: UV, VL)
    • PHI (values: FDT, HRT)
    • Not filled for SoloHI and STIX, wrong values for SPICE
  • detector could be used for:
    • EUI (values: FSI, HRI_EUV, HRI_LYA)
    • Metis (values: UV, VL; same as sensor)
    • PHI (values: FDT, HRT, and a few probably erroneous values)
    • SoloHI (values: 1, 2, 3, 4); I guess that these correspond to different parts of the FOV, but I am not sure.
    • Not filled for STIX; for SPICE, it is the detector of the first spectral window only (so not relevant for the file as a whole).
  • telescope is always in the form SOLO / instrument / detector, so redundant with instrument and detector.
  • btype could be useful in principle when doing multi-instrument searches, if the values were standardized, but not much when doing searches on a specific instrument (except for PHI?):
    • EUI (value: Flux)
    • Metis (values: Stokes I, UV Lyman-alpha intensity, VL fixed-polarization intensity, VL polarization angle, VL polarized brightness, VL total brightness; are these values standardized?)
    • PHI (values: Intensity, Magnetic Field Strength, and a few files with other values)
    • Not filled for SoloHI and STIX; corresponds to the first window only for SPICE (values: Radiance, Spectral Radiance)
  • filter is not filled, or is redundant with sensor for Metis, or is filled and meaningful for EUI but probably not very useful for end users.
  • observation_mode could be used for:
    • EUI (16 values over the considered period)
    • Metis (9 values)
    • SoloHI (8 values)
    • SPICE (101 values). However these values are potentially too many (and growing) for the users to use in a meaningful way. The list of values will have to be updated in some way (attributes in sunpy-soar updated at a new sunpy-soar release; attributes that can be updated by the user; or list maintained online).
    • 1 value for PHI which is not useful, 2 values (probably not relevant for user) for STIX.

ebuchlin avatar Jul 31 '24 14:07 ebuchlin