HerdNet: running train.py fails because species is not part of the demo dataset

Hi, first of all, congratulations on the two papers "Will artificial intelligence revolutionize aerial surveys? A first large-scale semi-automated survey of African wildlife using oblique imagery and deep learning" and "From crowd to herd counting: How to precisely detect and count African mammals using aerial imagery and deep learning?". I am currently involved in a project that tries to count marine iguanas on the Galapagos Islands, and I am working through the current research. I wanted to get your code to work, ran into a couple of issues, and fixed some of them. First, the URL of the pretrained DLA model is no longer accessible. I found a replacement that works fine: https://github.com/cwinkelmann/HerdNet/blob/initial_run/animaloc/models/dla.py#L35

Out of optimism I wanted to start with Python 3.10 and had to change the dependencies; the adjusted list is here: https://github.com/cwinkelmann/HerdNet/blob/initial_run/requirements_310_linux.txt

With that I ran the demo notebook, which finished training fine, but inference didn't succeed because of the changes in the config, which refer to this: https://github.com/cwinkelmann/HerdNet/blame/296d2e14dece507431edbf9ed5c64166bbc9311d/configs/train/herdnet.yaml#L45 . The infer script wants to read the classes from the checkpoint ( https://github.com/cwinkelmann/HerdNet/blob/296d2e14dece507431edbf9ed5c64166bbc9311d/tools/infer.py#L80 ), but the demo notebook does not persist them; only the train script does. I can't run the train script, however, because of issues with the "labels"/"species" columns. It occurred to me that the demo dataset doesn't match the code: species isn't part of the dataset, while labels are already in there.

I have some quick-fix ideas in mind (changing the column to species obviously doesn't work; it fails here: https://github.com/cwinkelmann/HerdNet/blob/296d2e14dece507431edbf9ed5c64166bbc9311d/animaloc/data/annotations.py#L268 ). Do you have a good solution in mind? It seems there are 6 classes, which include at least elephants for class id 6, but I am not sure about the others.

Best Christian

Aug 15 '24 17:08 cwinkelmann

Hi @cwinkelmann,

Interesting project, hope HerdNet can help! Thanks for reporting this issue. I should definitely update the demo notebook; I have just put it on my to-do list!

If you want to run the train.py tool, your CSV files need both the 'species' (name of the species, str) and 'labels' (corresponding id, int) columns. Once your model has completed training, the mapping between label ids and species names (see the example below) is automatically stored in your PTH file.

Here is the matching dict for the demo dataset:

class_dict = {
    1: "topi",
    2: "buffalo",
    3: "kob",
    4: "warthog",
    5: "waterbuck",
    6: "elephant",
}

If you add the species column to the demo's CSV files accordingly, it should work!
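A minimal sketch of that step, using only the standard library. It assumes the CSV files store the ids as integers in a 'labels' column; the function names and file paths are placeholders for illustration and are not part of HerdNet:

```python
import csv

# Mapping for the demo dataset, as given above.
CLASS_DICT = {
    1: "topi",
    2: "buffalo",
    3: "kob",
    4: "warthog",
    5: "waterbuck",
    6: "elephant",
}


def add_species(rows, class_dict=CLASS_DICT):
    """Return copies of the rows with a 'species' entry derived from 'labels'."""
    out = []
    for row in rows:
        label = int(row["labels"])  # csv.DictReader yields strings
        if label not in class_dict:
            raise ValueError(f"unknown label id: {label}")
        out.append({**row, "species": class_dict[label]})
    return out


def add_species_to_csv(src_path, dst_path, class_dict=CLASS_DICT):
    """Read src_path, add the 'species' column, and write the result to dst_path."""
    with open(src_path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        rows = add_species(reader, class_dict)
    if "species" not in fieldnames:
        fieldnames.append("species")
    with open(dst_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Calling `add_species_to_csv('path/to/in.csv', 'path/to/out.csv')` writes a copy with the extra column. With pandas, the same mapping is usually written as `df['species'] = df['labels'].map(class_dict)`.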

Let me know!

Best,

Alexandre

Aug 18 '24 14:08 Alexandre-Delplanque

Hi @Alexandre-Delplanque, thank you for the answer. In the end I managed to run the tools/train.py script the way you described. I added the "species" column by mapping the existing label ids through the dictionary ( https://github.com/cwinkelmann/HerdNet/blob/2db91c304cd34e1430f2ef9b6c52959925b00aa6/notebooks/run.py#L15 ). Some more changes were necessary, like setting the wandb entity to "null".

Unfortunately, persisting "classes", "mean" and "std" as implemented here (https://github.com/cwinkelmann/HerdNet/blob/2db91c304cd34e1430f2ef9b6c52959925b00aa6/tools/train.py#L399) was not successful, so I quick-fixed it again by hardcoding the values:
https://github.com/cwinkelmann/HerdNet/blob/2db91c304cd34e1430f2ef9b6c52959925b00aa6/tools/infer.py#L84

I will double check if I did something wrong there. I will first try running the training with my iguanas.

When I am done with that, I can have a look at updating the demo notebook and send you a pull request in the near future.

Regards Christian

Aug 19 '24 11:08 cwinkelmann

Hi @cwinkelmann,

Good news! Note that the wandb entity should be set to your Weights & Biases username.

That's odd. Did the training session launched with the train.py tool run to completion? At the end of training, 'mean', 'std' and 'classes' should have been updated in the .pth file (L.394-400).
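A quick way to check this is to look at the keys of the loaded checkpoint. The sketch below assumes the checkpoint is a plain dict, as in the snippet from the README; `missing_metadata` is a hypothetical helper name, not a HerdNet function:

```python
REQUIRED_KEYS = ("classes", "mean", "std")


def missing_metadata(checkpoint):
    """Return the required keys that are absent (or None) in a loaded checkpoint dict."""
    return [
        key for key in REQUIRED_KEYS
        if key not in checkpoint or checkpoint[key] is None
    ]
```

With PyTorch installed, `missing_metadata(torch.load('path/to/the/file.pth'))` returns an empty list when all three entries are present.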

If you want to update it manually, you can use the following code snippet, which already appears in the README (see here):

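import torch

# Load the checkpoint written by train.py
pth_file = torch.load('path/to/the/file.pth')
# Map label ids to species names (replace '...' with the remaining id: name pairs)
pth_file['classes'] = {1:'species_1', 2:'species_2', ...}
# Normalization statistics; these must match the ones used during training
pth_file['mean'] = [0.485, 0.456, 0.406]
pth_file['std'] = [0.229, 0.224, 0.225]
# Overwrite the checkpoint with the updated entries
torch.save(pth_file, 'path/to/the/file.pth')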

Have a look at the short doc for writing / updating the config files!

Hope this helps!

All the best,

Alexandre

Aug 23 '24 16:08 Alexandre-Delplanque

Hi @cwinkelmann,

Did you familiarise yourself with the code?

Is it still an issue?

Best,

Alexandre

Sep 24 '24 14:09 Alexandre-Delplanque

Hi @Alexandre-Delplanque, first of all, congratulations on defending your PhD thesis. It is a joy to read, and your work is a big pillar of mine. I spent a lot of time with DeepForest, which struggles with false positives.

Yes, I am good for now, thank you. I haven't managed to start a little refactoring yet. The aforementioned persisting of std and mean still failed because those values are supposed to be fetched from the Normalize transformation, but I had added some more transformations after Normalize, which made retrieving the values impossible.
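For illustration, a lookup that does not depend on where Normalize sits in the pipeline could look roughly like the sketch below. It uses small stand-in classes instead of the real transform library, and `find_normalize_stats` is a hypothetical helper; it is not how train.py currently performs the lookup:

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Normalize:
    """Stand-in for a real Normalize transform; it only holds the statistics."""
    mean: Tuple[float, ...]
    std: Tuple[float, ...]


@dataclass
class HorizontalFlip:
    """Stand-in for any other transform in the pipeline."""
    p: float = 0.5


def find_normalize_stats(transforms: Sequence[object]):
    """Return (mean, std) of the first Normalize found, wherever it sits."""
    for t in transforms:
        if isinstance(t, Normalize):
            return list(t.mean), list(t.std)
    raise ValueError("no Normalize transform in the pipeline")


# Normalize is not the last entry here, yet the lookup still succeeds.
pipeline = [
    HorizontalFlip(p=0.5),
    Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    HorizontalFlip(p=0.2),
]
mean, std = find_normalize_stats(pipeline)
```

Searching by type rather than by position keeps the values retrievable when further transformations are appended after Normalize.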

Best Christian

Sep 25 '24 09:09 cwinkelmann