Confusion about RAMS datasets

Open bellytina opened this issue 3 years ago • 2 comments

[Attached image: WeChat image 20220731140404]

Hello, thanks for your code! From Table 1, I see that #Doc is much smaller than #Event, which suggests that a single document can contain multiple events. Is there a clear boundary between these events? That is, can different events in the same document share arguments? In addition, I found that the doc_key of each instance in the jsonlines files is unique. How did you count the number of documents (3194, 399, and 400)? Any help would be great.

bellytina, Jul 31 '22 06:07

Hi @bellytina, #Doc should be based on the "source_url" field, while #Event is based on the "doc_key" field. Several instances with different doc_key values can share the same source_url.
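A minimal sketch of how the two counts could be reproduced, assuming each line of a split file is a JSON object with "source_url" and "doc_key" fields. The function name `count_docs_and_events` and the sample records are hypothetical and only for illustration; this is not taken from the repository's code.

```python
import json
from typing import Iterable, Tuple


def count_docs_and_events(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (#Doc, #Event) for an iterable of JSON lines.

    Assumption: #Doc is the number of distinct "source_url" values and
    #Event is the number of distinct "doc_key" values.
    """
    source_urls = set()
    doc_keys = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        source_urls.add(record["source_url"])
        doc_keys.add(record["doc_key"])
    return len(source_urls), len(doc_keys)


# Hypothetical sample records: two instances share one source_url.
SAMPLE_LINES = [
    '{"doc_key": "inst_1", "source_url": "https://example.com/a"}',
    '{"doc_key": "inst_2", "source_url": "https://example.com/a"}',
    '{"doc_key": "inst_3", "source_url": "https://example.com/b"}',
]

num_docs, num_events = count_docs_and_events(SAMPLE_LINES)
print(f"#Doc = {num_docs}, #Event = {num_events}")
```

To run this on a real split, pass an open file handle for the jsonlines file instead of `SAMPLE_LINES`; a file object iterates line by line, so no other change is needed.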

jefflink, Aug 03 '22 03:08

Thanks, @jefflink. That is exactly the answer, @bellytina.

RunxinXu, Aug 16 '22 11:08