Understanding ByteTrack in yolov11
Hello! I see that ByteTrack is implemented in the Ultralytics library, and I would like to understand it better.
In the Ultralytics library, there is a bytetrack.yaml file which contains:
# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license
# Default Ultralytics settings for ByteTrack tracker when using mode="track"
# For documentation and examples see https://docs.ultralytics.com/modes/track/
# For ByteTrack source code see https://github.com/ifzhang/ByteTrack
tracker_type: bytetrack # tracker type, ['botsort', 'bytetrack']
track_high_thresh: 0.25 # threshold for the first association
track_low_thresh: 0.1 # threshold for the second association
new_track_thresh: 0.25 # threshold for init new track if the detection does not match any tracks
track_buffer: 30 # buffer to calculate the time when to remove tracks
match_thresh: 0.8 # threshold for matching tracks
fuse_score: True # Whether to fuse confidence scores with the iou distances before matching
# min_box_area: 10 # threshold for min box areas(for tracker evaluation, not used for now)
May I know what the values track_high_thresh, track_low_thresh, new_track_thresh and match_thresh mean,
and how these values are computed?
I've been digging into the C++ implementation, which uses similar thresholds.
From what I understand, track_low_thresh and track_high_thresh
refer to detection scores, that is, how confident the system is
about a detection. These thresholds are used to split detections
into two groups: one with low confidence and one with high
confidence. The high-confidence detections are matched first
with existing tracks.
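To make the split concrete, here is a minimal Python sketch. The function name split_by_score and the (box, score) tuple layout are made up for this example, and it assumes detections at or below track_low_thresh are simply dropped, so treat it as an illustration rather than the library's code.

```python
def split_by_score(detections, track_high_thresh=0.25, track_low_thresh=0.1):
    """Split (box, score) pairs into high- and low-confidence groups.

    Assumption: anything at or below track_low_thresh is discarded.
    """
    high = [d for d in detections if d[1] >= track_high_thresh]
    low = [d for d in detections if track_low_thresh < d[1] < track_high_thresh]
    return high, low


# Boxes are replaced by labels here; only the scores matter for the split.
detections = [("a", 0.90), ("b", 0.20), ("c", 0.05)]
high, low = split_by_score(detections)
print(high)  # [('a', 0.9)]
print(low)   # [('b', 0.2)]
```

Detection "c" ends up in neither group, so it never takes part in matching.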
After this initial high-score matching phase, any remaining unmatched tracks are then matched against the lower-confidence detections. This gives those detections a second chance to be associated with existing tracks.
At the end of the matching process, the system checks whether any high-confidence detections are still unmatched. If there are, it assumes these might be new objects and creates new tracks for them. This is where new_track_thresh comes in: an unmatched detection only starts a new track if its score is at least that value.
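The two matching passes and the new-track step can be sketched roughly as follows. This is a simplified illustration, not the library's code: greedy_match is a hypothetical greedy stand-in for the real linear assignment solver, the function names are invented, and the second pass reuses match_thresh for brevity even though an actual implementation may use a different limit there.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(track_ids, tracks, det_ids, dets, max_cost):
    """Greedy stand-in for linear assignment: take the cheapest pairs first."""
    pairs = sorted(
        (1.0 - iou(tracks[t], dets[d][0]), t, d)
        for t in track_ids
        for d in det_ids
    )
    matches, used_t, used_d = [], set(), set()
    for cost, t, d in pairs:
        if cost > max_cost or t in used_t or d in used_d:
            continue
        matches.append((t, d))
        used_t.add(t)
        used_d.add(d)
    return matches, used_t, used_d


def associate(tracks, dets, high=0.25, low=0.1, new=0.25, match_thresh=0.8):
    """Return (first-pass matches, second-pass matches, new-track detections)."""
    high_ids = [i for i, (_, s) in enumerate(dets) if s >= high]
    low_ids = [i for i, (_, s) in enumerate(dets) if low < s < high]
    all_tracks = list(range(len(tracks)))

    # Pass 1: every track against the high-confidence detections.
    first, used_t, used_d = greedy_match(all_tracks, tracks, high_ids, dets, match_thresh)

    # Pass 2: tracks still unmatched get a chance with low-confidence detections.
    remaining = [t for t in all_tracks if t not in used_t]
    second, _, _ = greedy_match(remaining, tracks, low_ids, dets, match_thresh)

    # Unmatched high-confidence detections become new tracks if confident enough.
    new_ids = [d for d in high_ids if d not in used_d and dets[d][1] >= new]
    return first, second, new_ids


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [((1, 1, 11, 11), 0.9), ((21, 21, 31, 31), 0.15), ((50, 50, 60, 60), 0.6)]
first, second, new_ids = associate(tracks, dets)
print(first)    # [(0, 0)]
print(second)   # [(1, 1)]
print(new_ids)  # [2]
```

Track 1 is only recovered in the second pass, through a detection scoring 0.15, while detection 2 overlaps nothing and starts a new track.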
Now, the match_thresh value is a bit different: it is not a
threshold on detection confidence. Instead, it's a cost threshold
used during the linear assignment step, which is solved using the
Jonker-Volgenant algorithm. This part relies on IoU (Intersection
over Union) scores to determine how well a detection overlaps
with a track. (As far as I can tell, when fuse_score is True the
detection score is also folded into this cost, so the cost is then
no longer pure IoU.)
Here’s the catch: normally, an IoU of 1.0 means perfect overlap,
and 0.0 means no overlap. But the assignment algorithm looks for
the lowest cost matches, where 0.0 means no cost and higher
values mean worse matches. So to use IoU directly as a cost,
they invert the values using 1.0 - iou_score.
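As a tiny illustration of that inversion, assuming the IoU values are already computed (the numbers below are made up, with rows as tracks and columns as detections):

```python
# Rows are tracks, columns are detections; values are illustrative IoU scores.
iou_matrix = [
    [0.75, 0.0],
    [0.25, 0.5],
]

# Turn similarity into cost: perfect overlap costs 0.0, no overlap costs 1.0.
cost_matrix = [[1.0 - v for v in row] for row in iou_matrix]
print(cost_matrix)  # [[0.25, 1.0], [0.75, 0.5]]
```

The assignment solver then minimises the total cost, which is the same as maximising the total overlap.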
So, for example, if match_thresh is set to 0.8, that means any
pair with less than 20% overlap will not be
matched. But even a small overlap, like 21%, would still be
considered a valid match.
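The same arithmetic as a quick check. The helper name is_allowed is hypothetical, and how a cost exactly equal to the threshold is treated depends on the solver, so this sketch only looks at values on either side of it:

```python
def is_allowed(iou_score, match_thresh=0.8):
    """A pair can be matched only if its cost (1 - IoU) is within the threshold."""
    return (1.0 - iou_score) <= match_thresh


print(is_allowed(0.21))  # True  (cost 0.79)
print(is_allowed(0.19))  # False (cost 0.81)
```

Lowering match_thresh makes the tracker stricter: with match_thresh=0.3, a pair would need more than 70% overlap to be matched.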