something_else Unable to run Something-Else models on Original Something Something V2

I am trying to reproduce the numbers from Table 2 in the paper by running STIN on the original SSv2 dataset. However, I get the KeyError shown below. I also get a KeyError when I try to run I3D on the original SSv2 dataset. My SSv2 frames and the train, validation, and labels JSON files were taken directly from 20BN's website, and I got the bounding-box JSON file directly from the Google Drive. I would appreciate any advice on how to overcome this issue.

```
Namespace(batch_size=72, ckpt='./ckpt', clip_gradient=5, coord_feature_dim=512, dataset='smth_smth', epochs=50, evaluate=False, fine_tune=None, img_feature_dim=256, json_data_train='/path/smthv2/something-something-v2-train.json', json_data_val='/path/smthv2/something-something-v2-validation.json', json_file_labels='/path/smthv2/something-something-v2-labels.json', log_freq=10, logdir='./logs', logname='coord-org', lr=0.01, lr_steps=[24, 35, 45], model='coord', momentum=0.9, num_boxes=4, num_classes=174, num_frames=8, print_freq=20, restore_custom=None, restore_i3d=None, resume='', root_frames='/path/smth-smth-v2/frames/', shot=5, size=224, start_epoch=None, tracked_boxes='/path/smthv2/annots/bounding_box.json', weight_decay=0.0001, workers=28)

Loading label strings
Loading label strings ...
Loading box annotations might take a minute ...
Loading label strings ...
Loading box annotations might take a minute ...
######################## logging outputs to ./logs/coord-org ########################

Traceback (most recent call last):
  File "train.py", line 353, in <module>
    main()
  File "train.py", line 170, in main
    train(train_loader, model, optimizer, epoch, criterion, tb_logger)
  File "train.py", line 205, in train
    for i, (global_img_tensors, box_tensors, box_categories, video_label) in enumerate(train_loader):
  File "/xxx/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/xxx/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/xxx/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/xxx/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/xxx/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/xxx/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/xxx/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/xxx/something_else/code/data_utils/data_loader_frames.py", line 293, in __getitem__
    frames, box_tensors, box_categories = self.sample_single(index)
  File "/xxx/something_else/code/data_utils/data_loader_frames.py", line 195, in sample_single
    video_data = self.box_annotations[folder_id]
KeyError: '124909'
```

Jan 22 '21 21:01 RishiDesai

The bounding-box annotations provided in the Google Drive do not cover all videos in the original sth-sth v2 dataset. They only cover the videos included in their proposed data split (i.e., a subset of the original sth-sth dataset).

So, judging from the last line of your error message, I guess the dataloader simply can't find annotations for video 124909 in the annotation JSON.
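One way to confirm this before training is to compare the video ids in a split file against the keys of the box-annotation JSON. The following is only a rough sketch, not code from the repository: the helper name `split_by_annotation_coverage` is made up, and it assumes each split entry is a dict with an `"id"` field and that the annotation file is a dict keyed by the video id as a string (which is what the `self.box_annotations[folder_id]` lookup in the traceback suggests).

```python
import json


def split_by_annotation_coverage(video_entries, box_annotations):
    """Separate split entries into those with and without box annotations.

    Assumption: each entry is a dict with an "id" field, and
    box_annotations is a dict keyed by the video id as a string.
    """
    covered, missing = [], []
    for entry in video_entries:
        # The KeyError above shows a string key ('124909'), so compare as str.
        if str(entry["id"]) in box_annotations:
            covered.append(entry)
        else:
            missing.append(entry)
    return covered, missing


def load_json(path):
    """Load a JSON file such as the train split or the box annotations."""
    with open(path) as f:
        return json.load(f)


if __name__ == "__main__":
    # Tiny in-memory stand-ins; '151201' is a made-up id for illustration.
    entries = [{"id": "124909"}, {"id": "151201"}]
    boxes = {"151201": []}
    covered, missing = split_by_annotation_coverage(entries, boxes)
    print(f"{len(covered)} covered, {len(missing)} missing")
    print("missing ids:", [e["id"] for e in missing])
```

With the real files, the same function could be called on `load_json(...)` of the train JSON and the bounding-box JSON. The `covered` list could then be written out as a filtered split, although numbers obtained on such a subset would not be directly comparable to results on the full original SSv2 split.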

Mar 26 '21 13:03 meifish

+10086 @xiaolonw @joaanna

May 24 '21 09:05 pokameng

Hello, can you tell me why train.json and validation.json together contain only 112795 items (54919 + 57876), fewer than the 180049 items in the bounding_box_smthsmth_part1-4.json files? And why is the ratio of train.json to validation.json items approximately 1:1? That does not seem to match common practice.

Dec 31 '21 03:12 moonlight52137