supervision `tracker_id=None` when merging a sequence of detections `det_1`, `det_2`, ..., starting with an initial empty `Detection`

          I had a similar issue when merging a sequence of detections   `det_1`, `det_2`, ..., starting with an initial empty `Detection`:

dets_all = Detections.empty()
for det in ...:
    dets_all = Detections.merge([dets_all, det])

The empty detection doesn't have a tracker_id:

https://github.com/roboflow/supervision/blob/e16cca43497a6274ba3e15c3ad4913a88978f753/supervision/detection/core.py#L345-L364

For all subsequent detections, when merged (concatenated is a better name), tracker_id seems to be getting set to None here:

https://github.com/roboflow/supervision/blob/e16cca43497a6274ba3e15c3ad4913a88978f753/supervision/detection/core.py#L366-L413

Originally posted by @zburq in https://github.com/roboflow/supervision/issues/88#issuecomment-1539659760

May 09 '23 09:05 SkalskiP

@zburq I assume that other Detections have tracker_id that is not None. And that information is lost after merge?

May 09 '23 09:05 SkalskiP

@SkalskiP , yes have a tracker_id is not None condition in the loop, and can print each tracker_id.

May 09 '23 09:05 zburq

@SkalskiP , I tried adding an empty np.ndarray to my det_initial = Detections.empty():

det_initial = Detections.empty()
det_initial.tracker_id = np.empty(shape, dtype=np.int64)

but this failed a validation.

For now I have a hacky workaround:

dets_all = None
for det in ...:
    if dets_all is None:
        dets_all = det
    else:
        dets_all = Detections.merge([dets_all, det])

May 09 '23 10:05 zburq

What do you mean by but this failed a validation? Could you share the error?

May 09 '23 11:05 SkalskiP

Yes certainly. I instantiate an empty Detections object with the intention of appending other detection objects to it using Detections.merge.

dets_all = Detections.empty()
dets_all.tracker_id = np.ndarray([])

I also tried dets_all.tracker_id = np.empty_like([0]). Next, I generate a sequence of detections from a YOLO V8 output object:

results = volo_v8_model.track(source=my_vid.mp4, 
                          tracker='bytetrack.yaml', 
                          stream=True, 
                          classes=
for result in results:
    dets = Detections.from_yolov8(result)
    if result.boxes.id is not None: 
        dets.tracker_id = result.boxes.id.cpu().numpy().astype(int)
        dets_all = Detections.merge([dets_all, dets])

At the first call of merge, i.e., Detections.merge([empty_det, non_empty_det]), I get this error from merge:

---> 33             dets_all = Detections.merge([dets_all, det])
     34             print(dets_all.tracker_id)
     35 

/tmp/ipykernel_32267/3054824891.py in concat_detections(detections_list)
     36         tracker_id = np.hstack(tracker_id) if all_not_none(tracker_id) else None
     37 
---> 38         return Detections(
     39             xyxy=xyxy,
     40             mask=mask,

~/anaconda3/envs/cvc_env/lib/python3.9/site-packages/supervision/detection/core.py in __init__(self, xyxy, mask, confidence, class_id, tracker_id)

~/anaconda3/envs/cvc_env/lib/python3.9/site-packages/supervision/detection/core.py in __post_init__(self)
     72         _validate_class_id(class_id=self.class_id, n=n)
     73         _validate_confidence(confidence=self.confidence, n=n)
---> 74         _validate_tracker_id(tracker_id=self.tracker_id, n=n)
     75 
     76     def __len__(self):

~/anaconda3/envs/cvc_env/lib/python3.9/site-packages/supervision/detection/core.py in _validate_tracker_id(tracker_id, n)
     45     )
     46     if not is_valid:
---> 47         raise ValueError("tracker_id must be None or 1d np.ndarray with (n,) shape")
     48 
     49 

ValueError: tracker_id must be None or 1d np.ndarray with (n,) shape

May 09 '23 14:05 zburq

@zburq Would it be possible that you to create a reproducible example resulting in that ValueError: tracker_id must be None or 1d np.ndarray with (n,) shape error? I would like to understand what is happening here. I think it is a bug.

May 09 '23 16:05 SkalskiP

Hi @SkalskiP - apologies, I got preoccupied with another aspect of the project. I made a fix to my local copy of supervision and got it working. The bug is in Detections.empty():

Here's a working example:

det = Detections(xyxy=np.array([[146.8, 69.878, 198.65, 107.57],
                                [652.06, 81.007, 710.15, 122.01]], dtype=np.float32), 
                 mask=None, 
                 confidence=np.array([0.83976, 0.82391], dtype=np.float32), 
                 class_id=np.array([2, 2]), 
                 tracker_id=np.array([1, 2]))
           
e = Detections.empty()

e prints out like this:

Detections(xyxy=array([], shape=(0, 4), dtype=float32), 
           mask=None, confidence=array([], dtype=float32), 
           class_id=array([], dtype=int64), 
           tracker_id=None)

Detections.merge([e, det]) returns

Detections(xyxy=array([[146.8, 69.878, 198.65, 107.57],
           [652.06, 81.007, 710.15, 122.01]], dtype=float32), 
           mask=None, 
           confidence=array([0.83976, 0.82391], dtype=float32), 
           class_id=array([2, 2]), 
           tracker_id=None)

tracker_id has gone missing.

But if I modify e as:

e = Detections.empty()
e.tracker_id = np.array([], dtype=np.int64)

then Detections.merge([e, det]) keeps the tracker_id:

May 15 '23 02:05 zburq

I made a temporary fix for immediate needs:

May 15 '23 02:05 zburq

Hi, @zburq 👋🏻! That's what I suspected. The problem was the interaction between Detections.empty and Detections.merge. My proposed solution would be to introduce additional arguments Detections.empty. This way, you could create an empty class according to your needs. Does it look reasonable?

    @classmethod
    def empty(
        cls,
        with_confidence: bool = False,
        with_class_id: bool = False,
        with_tracker_id: bool = False,
        with_mask_wh: Optional[Tuple[int, int]] = None
    ) -> Detections:
        xyxy = np.empty((0, 4), dtype=np.float32)
        confidence = np.array([], dtype=np.float32) if with_confidence else None
        class_id = np.array([], dtype=int) if with_class_id else None
        tracker_id = np.array([], dtype=int) if with_tracker_id else None

        mask = None
        if with_mask_wh:
            w, h = with_mask_wh
            mask = np.empty((0, w, h), dtype=bool)

        return cls(
            xyxy=xyxy,
            confidence=confidence,
            class_id=class_id,
            tracker_id=tracker_id,
            mask=mask
        )

May 15 '23 11:05 SkalskiP

@SkalskiP :pray:

Yes - this works too. But feels slightly counterintuitive to me. Ideally, I'd expect empty to conform to any non-empty thing that merges with it, no?

The reason this would be desirable for me is that sometimes I want to extend Detections and add some more attributes on the fly. For example, if I call two models simultaneously (champion vs challenger), and want to store two class_id and confidence values for each box, (eg. YOLO_class_id & CLIP_class_id etc).

For now, I'm creating two seperate Detection objects, YOLO_det = sv.Detections() and CLIP_det = sv.Detections(). I'm running YOLO V8 on a video, storing results in YOLO_det, then calling CLIP on the cropped bounding boxes, and storing CLIP's output in aCLIP_det.

But would still like to add frame_id. This is so that I can save the bounding boxes and classes in text files like this: YOLO export , and then open it in CVAT and have someone manually check/correct the model's work (I'm starting with an unlabelled video).

Again, this is a patch based on my immediate needs. For optimal general design, I trust your judgement.

:vulcan_salute:

May 15 '23 12:05 zburq

... which reminds me, it'd be nice to have an export method for Detections. Particularly for merged Detections for a video, I'd like to export the detection data to a text file in the usual YOLO format. I have an "unauthorised" one for my own needs, but I'm sure it'd be useful for things like model-assisted labelling etc.

May 15 '23 12:05 zburq

... which reminds me, it'd be nice to have an export method for Detections. Particularly for merged Detections for a video, I'd like to export the detection data to a text file in the usual YOLO format. I have an "unauthorised" one for my own needs, but I'm sure it'd be useful for things like model-assisted labelling etc.

That one is definitely something I'd like to add. Could you create a separate issue (feature request) for that? 🙏🏻

May 15 '23 12:05 SkalskiP

Yes - this works too. But feels slightly counterintuitive to me. Ideally, I'd expect empty to conform to any non-empty thing that merges with it, no?

First of all. In my mind, there is a difference between empty detection - where we expect, for example, masks, but we didn't find any and cases where we don't expect masks at all. But I know that complicates things... The alternative is that Detections.merge and other methods that take Detections as an argument handle that internally. That will be harder to implement but easier for users.

As for extending Detections with your own fields, you can do it like with any other Python class, but I personally hate inheritance. And I think creating separate Detections objects is a cleaner solution.

May 15 '23 12:05 SkalskiP

Makes sense. Thank you.

May 15 '23 12:05 zburq