
dataCommons.org: Data Commons Knowledge Graph (DCKG)

Open ElwinHuaman opened this issue 7 years ago • 2 comments

Hey all,

I was challenged last week to provide rough numbers about the Data Commons Knowledge Graph (DCKG), which was constructed by synthesizing many different data sources into a single knowledge graph [1]. In particular, I would like to know:

- How many entities (nodes) does the DCKG have? As I understand it, a dcid (Data Commons identifier) is a unique identifier assigned to each entity in the knowledge graph, and entities are represented by nodes [2].
- How many data sources does the DCKG have? It currently contains data from Wikipedia, the US Census, NOAA, the FBI, etc. [3].
- How many nodes and relations does the DCKG have, and how many statements?

  • For example, the statement "Santa Clara County is contained in the State of California" is represented in the graph as two nodes: "Santa Clara County" and "California" with an edge labeled "containedInPlace" pointing from Santa Clara to California.

- What is the current size of the vocabulary used in the DCKG, taking into account that dataCommons.org builds upon the vocabularies defined by Schema.org [4]?

These are potential FAQs for future researchers (and of course there are more).
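To make the counts I am asking about concrete, here is a minimal sketch of how nodes, relations, and statements could be tallied over a toy set of triples. It does not call any Data Commons API. The `Statement` type and the `graph_stats` helper are hypothetical names introduced only for illustration, and apart from the Santa Clara County example above, the triples are made up.

```python
from typing import Iterable, NamedTuple


class Statement(NamedTuple):
    """One edge of the graph: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


# Toy data. Only the first triple comes from the example in this issue;
# the other two are illustrative assumptions, not real DCKG content.
statements = [
    Statement("Santa Clara County", "containedInPlace", "California"),
    Statement("California", "containedInPlace", "United States"),
    Statement("Santa Clara County", "typeOf", "County"),
]


def graph_stats(triples: Iterable[Statement]) -> dict:
    """Count distinct nodes, distinct relation labels, and distinct statements."""
    unique = set(triples)  # drop exact duplicate statements
    nodes = {t.subject for t in unique} | {t.obj for t in unique}
    relations = {t.predicate for t in unique}
    return {
        "nodes": len(nodes),
        "relations": len(relations),
        "statements": len(unique),
    }


stats = graph_stats(statements)
print(stats)
```

On this toy data the sketch counts 4 nodes, 2 relation labels, and 3 statements. The numbers I am looking for are the same three quantities computed over the whole DCKG.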

Could you help me?

cheers, Elwin Huaman

[1] https://browser.datacommons.org/
[2] https://colab.research.google.com/drive/1vffnWktZyffk7pNfpuXrTsCpp-od5W47
[3] https://datacommons.org/
[4] https://datacommons.org/faq

ElwinHuaman avatar Nov 19 '18 19:11 ElwinHuaman

Excellent questions. Have you received this information via another channel?

peter

pfps avatar Mar 12 '19 19:03 pfps

> Excellent questions. Have you received this information via another channel?
>
> peter

No, sorry.

Elwin

ElwinHuaman avatar Mar 19 '19 10:03 ElwinHuaman