dataCommons.org: Data Commons Knowledge Graph (DCKG)
Hey all,
I was challenged last week to provide some information (in rough numbers) about the Data Commons Knowledge Graph (DCKG), which was constructed by synthesizing many different data sources into a single knowledge graph [1]. In particular, I would like to know:
- How many entities (nodes) does the DCKG have? As I understand it, a dcid (Data Commons identifier) is a unique identifier assigned to each entity in the knowledge graph, and entities are represented by nodes [2].
- How many data sources does the DCKG have? It currently contains data from Wikipedia, the US Census, NOAA, the FBI, etc. [3].
- How many nodes and relations does the DCKG have, and how many statements?
- For example, the statement "Santa Clara County is contained in the State of California" is represented in the graph as two nodes: "Santa Clara County" and "California" with an edge labeled "containedInPlace" pointing from Santa Clara to California.
- What is the current size of the vocabulary used in the DCKG? Note that dataCommons.org builds on the vocabularies defined by Schema.org [4].
- These are potential FAQs for future researchers (of course there are more).
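To make clear what I mean by nodes, relations and statements, here is a minimal sketch in plain Python. It does not use the Data Commons API; the `Triple` type and the `graph_stats` helper are names I made up for illustration, and the only data in it is the example statement above.

```python
from collections import namedtuple

# A statement is one (subject, predicate, object) triple,
# i.e. one labeled edge between two nodes.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Toy graph holding only the example statement from this post.
triples = [
    Triple("Santa Clara County", "containedInPlace", "California"),
]

def graph_stats(triples):
    """Count distinct nodes, distinct relation labels, and statements."""
    nodes = set()
    relations = set()
    for t in triples:
        nodes.add(t.subject)
        nodes.add(t.obj)
        relations.add(t.predicate)
    return {
        "nodes": len(nodes),
        "relations": len(relations),
        "statements": len(triples),
    }

stats = graph_stats(triples)
print(stats)
```

The rough numbers I am asking for are these three counts, computed over the whole DCKG rather than over a single statement.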
Could you help me?
cheers, Elwin Huaman
[1] https://browser.datacommons.org/
[2] https://colab.research.google.com/drive/1vffnWktZyffk7pNfpuXrTsCpp-od5W47
[3] https://datacommons.org/
[4] https://datacommons.org/faq
Excellent questions. Have you received this information via another channel?
peter
No, sorry.
Elwin