Ask-Anything
Question regarding the Object Interaction task of the STAR & Charades datasets
{
"video": "K9UXS.mp4",
"question": "Which object was tidied up by the person?",
"candidates": [
"The table.",
"The clothes.",
"The towel.",
"The blanket."
],
"answer": "The blanket.",
"start": 2.65134176537509,
"end": 30.34865823462491,
"accurate_start": 22,
"accurate_end": 31
},
I checked that the "accurate_start" and "accurate_end" fields come from the "start" and "end" fields of the original STAR annotation. Where, then, do the "start" and "end" fields here come from? I can't find any corresponding information in the original annotation.
- The same applies to the action localization task of the Charades dataset:
{
"video": "ONMCW.mp4",
"question": "In the given video, when does the action 'person takes a glass from the desk' take place?",
"candidates": [
"At the end of the video.",
"At the beginning of the video.",
"In the middle of the video.",
"Throughout the entire video."
],
"answer": "In the middle of the video.",
"start": 15.299999999999997,
"end": 33.6,
"accurate_start": 21.4,
"accurate_end": 27.5
},
Good question! The "start" and "end" values are randomly generated from the accurate start and end, to make the questions more difficult.
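For readers who want to reproduce something similar, here is a minimal sketch of one way a randomized window could be derived from an accurate span. It is not the maintainers' actual procedure, which is not described in this thread. The function name `sample_window`, the `max_pad` parameter, the uniform padding, and the video `duration` of 40.0 seconds are all assumptions made for illustration, and the sample values above need not follow this scheme.

```python
import random


def sample_window(accurate_start, accurate_end, duration, max_pad=20.0, rng=None):
    """Hypothetical helper: widen the accurate span by a random amount per side.

    Assumes uniform padding in [0, max_pad] seconds, clamped to [0, duration].
    """
    rng = rng or random.Random()
    start = max(0.0, accurate_start - rng.uniform(0.0, max_pad))
    end = min(duration, accurate_end + rng.uniform(0.0, max_pad))
    return start, end


# Accurate span taken from the ONMCW.mp4 example; duration is an assumed value.
rng = random.Random(0)
start, end = sample_window(21.4, 27.5, duration=40.0, rng=rng)
print(f"start={start:.2f}, end={end:.2f}")
```

Under these assumptions the sampled window always contains the accurate span, so a model has to locate the action inside a longer, noisier clip.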