
Clarification on correct setup for training HerdNet from DLA backbone (down_ratio, FIDT, Stitcher)

jaimevera1107 opened this issue 3 months ago · 1 comment

Hi Alexandre,

First of all, thank you for releasing animaloc and the HerdNet implementation.

I’m currently working on replicating the results from the paper and wanted to clarify the correct setup for training HerdNet starting from a DLA backbone (the third use case described in the repo).

When initializing the model with pretrained=True, the down_ratio is typically set to 2, but DLA backbones (e.g. dla34) internally operate at a 1/16 spatial reduction.

When I train HerdNet using DLA, everything runs fine, but during evaluation (with HerdNetStitcher), I get a tensor size mismatch between the heatmap and clsmap:

RuntimeError: Sizes of tensors must match except in dimension 1.
Expected size 32 but got size 256 for tensor number 1 in the list.

This suggests that the density map (FIDT) and classification map (PointsToMask) are being generated at different scales.
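The arithmetic behind the mismatch can be sketched in plain Python (this is illustrative only, not animaloc code; the 512 px patch size is an assumption chosen because it reproduces the sizes in the error message): a density map generated at down_ratio=2 is 512/2 = 256 px wide, while a classification map produced at the backbone's 1/16 reduction is 512/16 = 32 px wide.

```python
# Illustrative arithmetic only (not animaloc code); assumes a 512-px input patch.
patch_size = 512

fidt_down_ratio = 2       # scale at which the FIDT density map is generated
backbone_reduction = 16   # DLA's internal 1/16 spatial reduction

heatmap_size = patch_size // fidt_down_ratio   # 256
clsmap_size = patch_size // backbone_reduction  # 32

# The two maps cannot be stacked along the channel dimension unless one is
# rescaled, hence "Expected size 32 but got size 256".
print(heatmap_size, clsmap_size)  # 256 32
```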

  • What is the correct configuration (parameters, down_ratio, etc.) for training HerdNet from a DLA backbone, as done in the original paper?
  • What down_ratio should be used for FIDT and PointsToMask inside the dataset’s end_transforms?
  • During evaluation, should HerdNetStitcher be initialized with up=True and reduction='mean' when using DLA?

Is there an example or reference configuration that exactly reproduces the paper setup (training from DLA, not from scratch)?

Thanks a lot for your time and for maintaining this repository; it's been very helpful for understanding density-based animal detection!

jaimevera1107 · Oct 28 '25 16:10

Hi @jaimevera1107,

Thank you for opening this issue! The error you report is probably due to a small mistake I made in the code at the time. I have since corrected the scaling error in the classification branch that occurs when one uses a value other than down_ratio=2. This fix is in the "features" code branch, see here. I will put it online soon (early 2026) after integrating other code corrections.

Here are my answers to your questions:

What is the correct configuration (parameters, down_ratio, etc.) for training HerdNet from a DLA backbone, as done in the original paper? Is there an example or reference configuration that exactly reproduces the paper setup (training from DLA, not from scratch)?

You can find these values in the HerdNet configuration file, accessible here.

What down_ratio should be used for FIDT and PointsToMask inside the dataset’s end_transforms?

down_ratio=2 for FIDT and down_ratio=32 for PointsToMask.
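As a rough sketch of how these two values could be wired into the dataset's end_transforms (the import path, transform names, and parameters such as radius and num_classes follow my reading of the animaloc codebase and may differ slightly in your version, so treat this as an assumption and check against the repository's training examples):

```python
# Sketch only: verify names/signatures against your animaloc version.
from animaloc.data.transforms import MultiTransformsWrapper, FIDT, PointsToMask

num_classes = 7  # hypothetical number of classes (background included)

end_transforms = [
    MultiTransformsWrapper([
        FIDT(num_classes=num_classes, down_ratio=2),          # density map target
        PointsToMask(radius=2, num_classes=num_classes,
                     squeeze=True, down_ratio=32),            # classification target
    ])
]
```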

During evaluation, should HerdNetStitcher be initialized with up=True and reduction='mean' when using DLA?

Yes, you can use up=True. If you set it to False, remember that your coordinates will be divided by the down_ratio value specified for FIDT. And yes for the reduction!
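To make the up=False remark concrete, here is a minimal sketch (plain Python, not animaloc code; the detection coordinates are invented for illustration) of mapping detections from the downsampled map back to full-resolution pixel coordinates by multiplying by FIDT's down_ratio:

```python
# Hypothetical detections expressed in the downsampled (down_ratio=2)
# coordinate space, as obtained when the stitcher does not upsample.
down_ratio = 2
detections = [(120, 340), (57, 11)]  # (x, y) in heatmap coordinates

# Rescale back to the original image resolution.
image_coords = [(x * down_ratio, y * down_ratio) for x, y in detections]
print(image_coords)  # [(240, 680), (114, 22)]
```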

Don't hesitate to ask if you need any further assistance!

Best,

Alexandre

Alexandre-Delplanque · Nov 03 '25 11:11

Hi @jaimevera1107 ,

Is this issue still relevant?

Best,

Alexandre

Alexandre-Delplanque · Dec 09 '25 08:12