
Enumerate tables from a TAP service by schema?

Open seweissman opened this issue 6 years ago • 4 comments

Is it possible to get at schema information for a table when enumerating tables via TAPService? I have only been able to get schema information via tap_service.tables._vosi_tables.tableset.schemas (see full example below). I worry I am missing something obvious.

In the endpoint I'm accessing, the table names always begin with the schema name, so I can recover the schema via string manipulation, but I'm not sure whether this is always the case.

import pyvo as vo

def print_column_information(column):
    print("Name:", column.name)
    print("Description:", column.description)
    print("Unit:", column.unit)
    print("Datatype:", column.datatype.content)

def print_table_information(table, print_columns=False):
    # This doesn't work:
    # print("Schema:", table.schema)
    print("Table:", table.name, table.type, table.utype)
    print("Table description:", table.description)
    if print_columns:
        for column in table.columns:
            print_column_information(column)

tap_service = vo.dal.TAPService("http://vao.stsci.edu/zmast/tapservice.aspx")

# Can get to schemas via internal _vosi_tables member, but this seems like a hack
for schema in tap_service.tables._vosi_tables.tableset.schemas:
    if schema.name != "dbo":
        continue
    print("Schema:", schema.name, type(schema))
    for table in schema:
        print_table_information(table)

# Can get to tables like this, but tables seem schema-unaware
for table in tap_service.tables:
    print_table_information(table)

seweissman avatar Dec 18 '19 23:12 seweissman

This is an interesting one. After a quick look, I think the reason you're digging so deep into the private object tree is that that is where the schemas are in the XML document. Some of these classes in VO mirror their part of the document tree: there are schemas, and inside each of those is a list of tables (and then columns, etc.). I guess tables are what's exposed because most people think about the tables (and the name of the endpoint is, of course, tables).

You could also use the TAP_SCHEMA database available on most TAP services. It holds the service's metadata and is queried through TAP itself, so you learn about the metadata by querying in SQL. You can do this with TAP queries like:

SELECT * FROM TAP_SCHEMA.tables
SELECT * FROM TAP_SCHEMA.schemas
SELECT * FROM TAP_SCHEMA.columns
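As a rough sketch of how one of these queries might be narrowed to a single schema from pyvo: the helper names `tables_in_schema_query` and `list_tables` are made up for illustration, and the column names (`schema_name`, `table_name`, `description`) are the ones the TAP standard defines for TAP_SCHEMA.tables. It has not been run against the service from the question.

```python
def tables_in_schema_query(schema_name):
    """Build an ADQL query listing the tables of one schema via TAP_SCHEMA.

    Hypothetical helper, not part of pyvo.
    """
    # Double any single quote so the name stays a valid ADQL string literal.
    escaped = schema_name.replace("'", "''")
    return (
        "SELECT table_name, description "
        "FROM TAP_SCHEMA.tables "
        f"WHERE schema_name = '{escaped}'"
    )


def list_tables(service_url, schema_name):
    """Run the query on a TAP service; return (table_name, description) pairs.

    Hypothetical helper; needs pyvo and network access.
    """
    import pyvo as vo  # imported lazily so the query builder works without pyvo

    service = vo.dal.TAPService(service_url)
    results = service.search(tables_in_schema_query(schema_name))
    return [(row["table_name"], row["description"]) for row in results]


# Example call (not executed here), using the URL from the question:
# list_tables("http://vao.stsci.edu/zmast/tapservice.aspx", "dbo")
```

Keeping the query builder separate from the network call makes it easy to check the generated ADQL without contacting a service.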

It sounds like having the schema as a property on a table (i.e. which schema that table belongs to) would help: as you iterate through the tables, you could at least filter by schema at that level. Does that seem like something that would work for you?

cbanek avatar Dec 19 '19 23:12 cbanek

On Thu, Dec 19, 2019 at 03:17:12PM -0800, Christine Banek wrote:

> This is an interesting one. After a quick look, I think the reason you're digging so deep into the private object tree is that that is where the schemas are in the XML document. Some of these

Well, I'd say the rough use cases here are "group tables in a sensible way" and "access schema description to learn about what the table group actually is about".

Coverage for both of these I'd consider desirable; in the current structure, I'd say the right way to support them would be to have a schema attribute on table pointing to a representation of a VODataService schema element, http://docs.g-vo.org/schemadoc/schemas/VODataService-v1_1_xsd/elements/schema.html

I suspect the reason Stefan didn't write it like that from the start is that VODataService 1.0 (which never made it to REC) didn't have schema, and several relevant services (Simbad and VizieR come to mind) don't actually have meaningful (at least to people outside of the projects) schema content.

Anyway, a quick look at the relevant classes didn't give me an immediate hook to make that happen in just a few lines (but then I'm largely unfamiliar with that code). Perhaps Stefan could help out?

Meanwhile:

> You could also use the TAP_SCHEMA database available on most TAP services, which is a database to query against for the metadata using TAP itself, so you query in SQL to learn about the metadata. You can do this with a TAP query like:

Actually, TAP_SCHEMA is mandatory for TAP services, while VOSI tables is not. Using TAP_SCHEMA from TAP is probably the simplest and most straightforward way to do what you seem to be after.
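To illustrate, here is a sketch that covers both use cases above (grouping tables, and getting at the schema description) using only TAP_SCHEMA. The function names `group_tables_by_schema` and `fetch_grouped` are invented for this example; the column names are the standard TAP_SCHEMA ones. Treat it as untested against any particular service.

```python
from collections import defaultdict


def group_tables_by_schema(schema_rows, table_rows):
    """Group table names by schema and attach each schema's description.

    Hypothetical helper: rows only need to support row["column"] access,
    so pyvo result records and plain dicts both work.
    """
    descriptions = {row["schema_name"]: row["description"] for row in schema_rows}
    tables = defaultdict(list)
    for row in table_rows:
        tables[row["schema_name"]].append(row["table_name"])
    return {
        name: {"description": descriptions.get(name), "tables": sorted(names)}
        for name, names in tables.items()
    }


def fetch_grouped(service_url):
    """Query TAP_SCHEMA on a service and group its tables (needs network)."""
    import pyvo as vo  # imported lazily so the grouping logic works without pyvo

    service = vo.dal.TAPService(service_url)
    schema_rows = service.search(
        "SELECT schema_name, description FROM TAP_SCHEMA.schemas"
    )
    table_rows = service.search(
        "SELECT schema_name, table_name FROM TAP_SCHEMA.tables"
    )
    return group_tables_by_schema(schema_rows, table_rows)
```

A schema that appears in TAP_SCHEMA.tables but not in TAP_SCHEMA.schemas simply gets a description of `None` here.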

msdemlei avatar Dec 20 '19 08:12 msdemlei

> Perhaps Stefan could help out?

Maybe between Christmas and New Year

funbaker avatar Dec 20 '19 10:12 funbaker

Hmm. Due to dynamic loading of tables, it is not always trivial to get a dedicated schema. According to the XML schema, it is fine to split by the dot, though. So this is a valid, but not very clean, solution.
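A minimal sketch of the split-by-the-dot approach, assuming names of the form `schema.table` and splitting at the first dot. `split_schema` and `group_by_prefix` are made-up helper names, and names without a dot, or with dots inside quoted identifiers, are not handled specially.

```python
def split_schema(qualified_name):
    """Split "schema.table" at the first dot.

    Returns (schema, table); schema is None when the name has no dot.
    Hypothetical helper, not part of pyvo.
    """
    prefix, dot, rest = qualified_name.partition(".")
    if not dot:
        return None, qualified_name
    return prefix, rest


def group_by_prefix(table_names):
    """Group full table names by their schema prefix."""
    groups = {}
    for name in table_names:
        schema, _ = split_schema(name)
        groups.setdefault(schema, []).append(name)
    return groups
```

With the loop from the original question, this could be fed with something like `group_by_prefix(table.name for table in tap_service.tables)`.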

funbaker avatar Jan 28 '20 13:01 funbaker