Question About the Sampler Indices
Hi, thank you for your code. I am reading your semi_sampler.py and find it a bit confusing.
https://github.com/microsoft/SoftTeacher/blob/bef9a256e5c920723280146fc66b82629b3ee9d4/ssod/datasets/samplers/semi_sampler.py#L170
Why do you need to concatenate all the indices? I think the last set of indices already contains all the images in the dataset, and after concatenation, the indices become a list of #max_iteration arrays. I would really appreciate your reply.
The last set of indices only contains one group of images in the dataset. However, each dataset usually contains two groups (w < h and w > h).
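To make the grouping concrete, here is a minimal sketch (with made-up image sizes, not the actual mmdet implementation) of how images are split into two aspect-ratio groups and why the per-group index arrays must be concatenated to cover the whole dataset:

```python
import numpy as np

# Hypothetical image sizes; in mmdet each image is assigned a group flag:
# 0 if w < h (portrait), 1 otherwise (landscape).
sizes = [(640, 480), (480, 640), (800, 600), (600, 800)]
flags = np.array([0 if w < h else 1 for w, h in sizes])

# Indices are built per aspect-ratio group; each group alone only covers
# part of the dataset, so the groups are concatenated afterwards.
indices = [np.where(flags == group)[0] for group in np.unique(flags)]

all_indices = np.concatenate(indices)
print(sorted(all_indices.tolist()))  # every image appears exactly once
```

So the last group's indices only cover, say, the landscape images; without the concatenation, the portrait images would never be sampled.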
Thank you for your help!
I am not very familiar with mmdet, and I understand your code as follows: if the ratio is 1 and the batch size is 2, then after creating the data loader that reads separately from the labeled and unlabeled data, one batch sampled from the dataset consists of (<augmented labeled img, label>, <weakly augmented unlabeled img, meaningless label>, <strongly augmented unlabeled img, meaningless label>). Although I am not sure what happens in https://github.com/microsoft/SoftTeacher/blob/bef9a256e5c920723280146fc66b82629b3ee9d4/ssod/apis/train.py#L206 , I think the batched data is sent to the model and trained as in SoftTeacher.py.
Could you please point out my mistakes and give me some advice?
Yes, it works just like your description.
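As a minimal sketch of the batch composition described above (hypothetical placeholder values, not the actual mmdet results dicts), with sample ratio 1 and batch size 2:

```python
# One labeled sample plus one unlabeled sample; the unlabeled sample
# contributes a weakly and a strongly augmented view to the batch.
labeled_sample = {"img": "aug_labeled_img", "gt_labels": "real_label"}
weak_view = {"img": "weak_aug_unlabeled_img", "gt_labels": "placeholder"}
strong_view = {"img": "strong_aug_unlabeled_img", "gt_labels": "placeholder"}

# labeled + weak unlabeled + strong unlabeled, as described above
batch = [labeled_sample, weak_view, strong_view]
assert len(batch) == 3
```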
Thanks!
Hi, sorry to bother you again, but I have a few more questions; your help is appreciated.
After using ordinary augmentation for the labeled data, and using MultiBranch augmentation to generate the weakly and strongly augmented unlabeled images in
https://github.com/microsoft/SoftTeacher/blob/bef9a256e5c920723280146fc66b82629b3ee9d4/ssod/datasets/pipelines/rand_aug.py#L953-L965
, I think the labeled data format is a dict of {"img_metas": XXX, "img": XXX, "gt_bboxes": XXX, "gt_labels": XXX}, but the format of the unlabeled data is a list of two augmented versions, like [ {"img_metas": XXX, "img": XXX, "gt_bboxes": XXX, "gt_labels": XXX}, {"img_metas": XXX, "img": XXX, "gt_bboxes": XXX, "gt_labels": XXX} ].
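If I understand correctly, the two formats can be sketched like this (keys taken from my reading of the code; the actual mmdet results dicts carry more fields, so this is just an illustration):

```python
# A labeled sample: a single results dict.
labeled = {"img_metas": ..., "img": ..., "gt_bboxes": ..., "gt_labels": ...}

# An unlabeled sample: MultiBranch runs the same image through two
# sub-pipelines, so one sample becomes a list of two results dicts.
unlabeled = [
    {"img_metas": ..., "img": ..., "gt_bboxes": ..., "gt_labels": ...},  # weak
    {"img_metas": ..., "img": ..., "gt_bboxes": ..., "gt_labels": ...},  # strong
]

assert isinstance(labeled, dict)
assert isinstance(unlabeled, list) and len(unlabeled) == 2
```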
My questions are:
- Am I right about the data format? Did I omit some important functions? I found that MultiBranch is also used in this function https://github.com/microsoft/SoftTeacher/blob/bef9a256e5c920723280146fc66b82629b3ee9d4/tools/misc/browse_dataset.py#L50-L73 , but I am not clear what that function is for.
- If I am right about the data format, how do you collate these two formats into a batch? Could you please point it out in the code?
I would really appreciate your reply because I am quite lost...
See here. https://github.com/microsoft/SoftTeacher/blob/bef9a256e5c920723280146fc66b82629b3ee9d4/ssod/datasets/builder.py#L161
It flattens the nested list into a one-dimensional list, like [a, [b, c]] -> [a, b, c].
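A minimal sketch of that one-level flattening (a hypothetical helper, not the actual collate code, which also handles batching of the dicts):

```python
def flatten_one_level(batch):
    """Flatten one level of nesting: [a, [b, c]] -> [a, b, c].

    Labeled samples (plain dicts) pass through unchanged, while
    unlabeled samples (lists of two augmented views) are expanded
    in place, so the result is a flat list of samples.
    """
    flat = []
    for item in batch:
        if isinstance(item, (list, tuple)):
            flat.extend(item)
        else:
            flat.append(item)
    return flat

print(flatten_one_level(["a", ["b", "c"]]))  # -> ['a', 'b', 'c']
```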
Thank you so much!