Understanding ByteTrack in yolov11

Open ykn96 opened this issue 10 months ago • 1 comments

Hello! I see that ByteTrack is implemented in the ultralytics library, and I would like to understand it better. The library includes a bytetrack.yaml file, which contains:

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# Default Ultralytics settings for ByteTrack tracker when using mode="track"
# For documentation and examples see https://docs.ultralytics.com/modes/track/
# For ByteTrack source code see https://github.com/ifzhang/ByteTrack

tracker_type: bytetrack # tracker type, ['botsort', 'bytetrack']
track_high_thresh: 0.25 # threshold for the first association
track_low_thresh: 0.1 # threshold for the second association
new_track_thresh: 0.25 # threshold for init new track if the detection does not match any tracks
track_buffer: 30 # buffer to calculate the time when to remove tracks
match_thresh: 0.8 # threshold for matching tracks
fuse_score: True # Whether to fuse confidence scores with the iou distances before matching
# min_box_area: 10  # threshold for min box areas(for tracker evaluation, not used for now)

May I ask what the values track_high_thresh, track_low_thresh, new_track_thresh and match_thresh mean, and how these values are computed?

Mar 28 '25 03:03 ykn96

I've been digging into the C++ implementation, which uses similar thresholds.

From what I understand, track_low_thresh and track_high_thresh refer to detection scores, that is, how confident the system is about a detection. These thresholds are used to split detections into two groups: one with low confidence and one with high confidence. The high-confidence detections are matched first with existing tracks.
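A minimal sketch of that split, using the two threshold values from the bytetrack.yaml in the question. The detection list and the split_by_confidence helper are hypothetical and only illustrate the idea; the exact boundary handling (>= versus >) is an assumption here and may differ between implementations.

```python
# Hypothetical detections as (label, confidence score) pairs, for illustration only.
detections = [("a", 0.90), ("b", 0.40), ("c", 0.15), ("d", 0.05)]

track_high_thresh = 0.25  # value from the bytetrack.yaml in the question
track_low_thresh = 0.1    # value from the bytetrack.yaml in the question


def split_by_confidence(dets, high, low):
    """Split detections into a high-confidence and a low-confidence group.

    Detections scoring at or below `low` are discarded entirely.
    The boundary handling is an assumption of this sketch.
    """
    high_group = [d for d in dets if d[1] >= high]
    low_group = [d for d in dets if low < d[1] < high]
    return high_group, low_group


high_dets, low_dets = split_by_confidence(
    detections, track_high_thresh, track_low_thresh
)
print("first association candidates:", high_dets)
print("second association candidates:", low_dets)
```

With these values, "a" and "b" go into the first association, "c" is kept for the second one, and "d" is dropped.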

After this initial high-score matching phase, any remaining unmatched tracks are then matched against the lower-confidence detections. This gives those detections a second chance to be associated with existing tracks.

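At the end of the matching process, the system checks if there are still any high-confidence detections left that didn’t match with any track. If there are, it assumes these might be new objects and creates new tracks for them. Going by the comment in the yaml file, new_track_thresh is the score such an unmatched detection needs in order to start a new track.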
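As a rough sketch of that last step: the start_new_tracks helper, the track id scheme and the example scores below are all hypothetical, and the comparison against new_track_thresh follows the comment in the yaml file rather than any particular implementation.

```python
def start_new_tracks(unmatched_high_dets, new_track_thresh, next_id=1):
    """Create new tracks from unmatched high-confidence detections.

    `unmatched_high_dets` is a list of (label, score) pairs that no
    existing track claimed. A detection only starts a new track if its
    score reaches `new_track_thresh` (an assumption of this sketch).
    """
    new_tracks = {}
    for label, score in unmatched_high_dets:
        if score < new_track_thresh:
            continue  # not confident enough to start a track
        new_tracks[next_id] = (label, score)
        next_id += 1
    return new_tracks


# Hypothetical leftovers after both association stages.
leftovers = [("x", 0.60), ("y", 0.30)]

# With the default new_track_thresh of 0.25, both start a track.
print(start_new_tracks(leftovers, 0.25))
# With a stricter, made-up value of 0.5, only "x" does.
print(start_new_tracks(leftovers, 0.5))
```

Since the default new_track_thresh equals track_high_thresh (both 0.25), every unmatched high-confidence detection qualifies by default. Raising new_track_thresh makes the tracker more reluctant to open new tracks.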

Now, the match_thresh value is a bit different: it's not related to detection confidence. Instead, it's a cost threshold used during the linear assignment step, which is solved using the Jonker-Volgenant algorithm. This part relies on IoU (Intersection over Union) scores to determine how well a detection overlaps with a track.

Here’s the catch: normally, an IoU of 1.0 means perfect overlap, and 0.0 means no overlap. But the assignment algorithm looks for the lowest cost matches, where 0.0 means no cost and higher values mean worse matches. So to use IoU directly as a cost, they invert the values using 1.0 - iou_score.
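To make the inversion concrete, here is a small self-contained sketch. The iou and iou_cost functions are written for this example (they are not the library's functions), and boxes are assumed to be in (x1, y1, x2, y2) format.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_cost(box_a, box_b):
    """Turn IoU (higher is better) into a cost (lower is better)."""
    return 1.0 - iou(box_a, box_b)


track_box = (0, 0, 10, 10)
print(iou_cost(track_box, (0, 0, 10, 10)))    # identical boxes
print(iou_cost(track_box, (5, 0, 15, 10)))    # shifted by half a box width
print(iou_cost(track_box, (20, 20, 30, 30)))  # no overlap at all
```

Identical boxes give a cost of 0.0 and disjoint boxes give 1.0. The half-shifted box has an IoU of 50 / 150, so its cost is about 0.67.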

So, for example, if match_thresh is set to 0.8, any pair whose cost (1.0 - iou_score) is above 0.8, that is, any pair with an IoU below 0.2, will not be matched. But even a small overlap, such as an IoU of 0.21, would still be considered a valid match.
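The gating can be sketched like this. The is_valid_match helper is hypothetical, and it assumes the cost is the plain 1.0 - IoU distance. It ignores the fuse_score option from the yaml file, which mixes detection confidence into the cost before matching, so treat the numbers as illustrative only.

```python
match_thresh = 0.8  # value from the bytetrack.yaml in the question


def is_valid_match(iou_score, thresh=match_thresh):
    """Return True if a track/detection pair passes the cost gate.

    Assumes cost = 1.0 - IoU, with no score fusion applied.
    """
    cost = 1.0 - iou_score
    return cost <= thresh


for iou_score in (0.90, 0.21, 0.19, 0.0):
    verdict = "match allowed" if is_valid_match(iou_score) else "rejected"
    print(f"IoU {iou_score:.2f} -> cost {1.0 - iou_score:.2f} -> {verdict}")
```

A lower match_thresh makes the gate stricter: with a made-up value of 0.5, a pair would need an IoU of at least 0.5 to be matched.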

Apr 11 '25 08:04 roxlu