MouseSIS: Space-Time Instance Segmentation of Mice
This is the official repository for MouseSIS: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice by Friedhelm Hamann, Hanxiong Li, Paul Mieske, Lars Lewejohann and Guillermo Gallego, accepted at the Workshop on Neuromorphic Vision held in conjunction with ECCV 2024.
👀 This dataset is the basis for the SIS Challenge hosted in conjunction with the CVPR 2025 Workshop on Event-based Vision.
🏆 View the complete challenge results and winning teams here!
Key Features
- Space-time instance segmentation dataset focused on mice tracking
- Combined frames and event data from a neuromorphic vision sensor
- 33 sequences (~20 seconds each, ~600 frames per sequence)
- YouTubeVIS-style annotations
- Baseline implementation and evaluation metrics included
Timeline
- SIS challenge results (June 2025): We released the results of the SIS challenge and presented them at the CVPR'25 Workshop on Event-based Vision.
- v1.0.0 (Current, February 2024): Major refactoring and updates, including improved documentation.
- v0.1.0 (September 2023): Initial release with basic functionality and dataset.
Table of Contents
- Quickstart
- Installation
- Data Preparation
- Data
- Pretrained Weights
- Preprocess Events
- Evaluation
- Quickstart Evaluation
- Evaluation on Full Validation Set
- Evaluation on Test Set
- Acknowledgements
- Citation
- Additional Resources
- License
Quickstart
If you want to work with the dataset, the quickest way to access the data and get an idea of its structure is to download one sequence plus the annotations of the corresponding split and visualize them, e.g. for seq12.h5:
python scripts/visualize_events_frames_and_masks.py --h5_path data/MouseSIS/top/val/seq12.h5 --annotation_path data/MouseSIS/val_annotations.json
This requires h5py, numpy, Pillow, and tqdm. The full dataset structure is explained in the Data Preparation section below.
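If you prefer to inspect a sequence file directly, the minimal sketch below lists the datasets it contains. It assumes only that the file is a standard HDF5 container; dataset names and shapes are printed as found rather than hard-coded (adjust the path to the sequence you downloaded).

# Minimal sketch: inspect the contents of one MouseSIS sequence file.
import h5py

with h5py.File("data/MouseSIS/top/val/seq12.h5", "r") as f:
    def describe(name, obj):
        # Print every dataset found in the file, with its shape and dtype.
        if isinstance(obj, h5py.Dataset):
            print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")
    f.visititems(describe)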
Installation
- Clone the repository:
  git clone git@github.com:tub-rip/MouseSIS.git
  cd MouseSIS
- Set up the environment:
  conda create --name MouseSIS python=3.8
  conda activate MouseSIS
- Install PyTorch (choose a command compatible with your CUDA version from the PyTorch website), e.g.:
  conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
- Install other dependencies:
  pip install -r requirements.txt
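To confirm that the PyTorch installation matches your CUDA setup, a quick check along these lines can help (a sanity-check snippet, not part of the repository):

# Sanity check of the Python environment (hypothetical helper, not part of the repo).
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))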
Data Preparation
Data
- Create a folder for the original data:
  cd <project-root>
  mkdir -p data/MouseSIS
- Download the data and annotations and save them in <project-root>/data/MouseSIS. You do not need to download the whole dataset; for example, you can download only the sequences you want to evaluate on. The data/MouseSIS folder should be organized as follows:

data/MouseSIS
│
├── top/
│   ├── train
│   │   ├── seq_02.hdf5
│   │   ├── seq_05.hdf5
│   │   ├── ...
│   │   └── seq_33.hdf5
│   ├── val
│   │   ├── seq_03.hdf5
│   │   ├── seq_04.hdf5
│   │   ├── ...
│   │   └── seq_25.hdf5
│   └── test
│       ├── seq_01.hdf5
│       ├── seq_07.hdf5
│       ├── ...
│       └── seq_32.hdf5
├── dataset_info.csv
├── val_annotations.json
└── train_annotations.json

- top/: Contains the frame and event data of the mouse dataset captured from the top view, stored as 33 individual .hdf5 files, each holding approximately 20 seconds of data (around 600 frames) along with temporally aligned events.
- dataset_info.csv: Contains metadata for each sequence, such as recording dates, providing additional context and details about the dataset.
- <split>_annotations.json: The annotation file of the top view for the respective split follows a structure similar to MSCOCO's JSON format, with some modifications. Note that the test annotations are not publicly available. The JSON files are defined as follows:
{ "info": { "description": "string", // Dataset description "version": "string", // Version identifier "date_created": "string" // Creation timestamp }, "videos": [ { "id": "string", // Video identifier (range: "01" to "33") "width": integer, // Frame width in pixels (1280) "height": integer, // Frame height in pixels (720) "length": integer // Total number of frames } ], "annotations": [ { "id": integer, // Unique instance identifier "video_id": "string", // Reference to parent video "category_id": integer, // Object category (1 = mouse) "segmentations": [ { "size": [height: integer, width: integer], // Mask dimensions "counts": "string" // RLE-encoded segmentation mask } ], "areas": [float], // Object area in pixels "bboxes": [ // Bounding box coordinates [x_min: float, y_min: float, width: float, height: float] ], "iscrowd": integer // Crowd annotation flag (0 or 1) } ], "categories": [ { "id": integer, // Category identifier "name": "string", // Category name "supercategory": "string" // Parent category } ] }
Pretrained Weights
Download the model weights:
cd <project-root>
mkdir models
# Download yolo_e2vid.pt, yolo_frame.pt, and XMem.pth from the provided link
# and place them in the models directory
Afterwards, the models folder should be organized as follows:
models
├── XMem.pth
├── yolo_e2vid.pt
└── yolo_frame.pt
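A small check like the following (a hypothetical helper, not part of the repository) verifies that all three weight files ended up in the right place:

# Hypothetical check that the expected model weights are present.
from pathlib import Path

expected = ["XMem.pth", "yolo_e2vid.pt", "yolo_frame.pt"]
missing = [name for name in expected if not (Path("models") / name).is_file()]
print("All model weights found." if not missing else f"Missing: {missing}")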
Preprocess Events
This preprocessing step is required only when evaluating the ModelMixSort method from the paper, which relies on E2VID images reconstructed at the grayscale frame timestamps.
python scripts/preprocess_events_to_e2vid_images.py --data_root data/MouseSIS
Evaluation
After downloading the data and model weights, proceed with the evaluation. First, run inference, e.g. using the provided inference script:
python3 scripts/inference.py --config <path-to-config-yaml>
This saves a file output/<tracker-name>/final_results.json. The file contains the predictions in this structure:
[
{
"video_id": int,
"score": float,
"instance_id": int,
"category_id": int,
"segmentations": [
null | {
"size": [int, int],
"counts": "RLE encoded string"
},
...
],
},
...
]
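To sanity-check a predictions file before evaluation, a small sketch along these lines (a hypothetical helper, not part of the repository) summarizes it; replace the tracker name with the one from your config:

# Hypothetical helper: summarize a final_results.json predictions file.
import json
from collections import Counter

with open("output/quickstart/final_results.json") as f:  # adjust <tracker-name>
    preds = json.load(f)

print(len(preds), "predicted instance tracks")
for video_id, n in sorted(Counter(p["video_id"] for p in preds).items()):
    print(f"video {video_id}: {n} tracks")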
Then run the evaluation script like this:
python scripts/eval.py --TRACKERS_TO_EVAL <tracker-name> --SPLIT_TO_EVAL <split-name>
The specific options are listed below.
Quickstart Evaluation
This section describes how to run a minimal evaluation workflow on one sequence of the validation set. Download only the sequence seq_25.hdf5 from the validation set and the corresponding annotations val_annotations.json. The resulting folder should look as follows:
data/MouseSIS
│
├── top/
│   ├── val
│   │   └── seq_25.hdf5
└── val_annotations.json
Now you can run inference as
python3 scripts/inference.py --config configs/predict/quickstart.yaml
and then evaluation as
python scripts/eval.py --TRACKERS_TO_EVAL quickstart --SPLIT_TO_EVAL val
This should return the following results:
| Sequence | HOTA | MOTA | IDF1 |
|---|---|---|---|
| 25 | 30.15 | 39.125 | 35.315 |
| Avg. | 30.15 | 39.125 | 35.315 |
Evaluation on Full Validation Set
Proceed as for the quickstart, but download all sequences of the validation set (sequences 3, 4, 12, and 25).
python3 scripts/inference.py --config configs/predict/combined_on_validation.yaml
python scripts/eval.py --TRACKERS_TO_EVAL combined_on_validation --SPLIT_TO_EVAL val
You should get the following results:
| Sequence | HOTA | MOTA | IDF1 |
|---|---|---|---|
| 3 | 54.679 | 72.432 | 60.212 |
| 4 | 51.717 | 64.942 | 58.36 |
| 12 | 39.497 | 66.049 | 45.431 |
| 25 | 30.15 | 39.125 | 35.315 |
| Avg. | 45.256 | 62.097 | 50.459 |
Evaluation on Test Set Without Sequences 1 & 7 (SIS Challenge)
In this case, download all test sequences and run
python3 scripts/inference.py --config configs/predict/sis_challenge_baseline.yaml
For evaluation, you can upload the final_results.json to the challenge/benchmark page, which yields the following combined metrics:
| Sequence | HOTA | MOTA | IDF1 |
|---|---|---|---|
| Avg. | 0.43 | 0.45 | 0.5 |
Please note that the results differ slightly from those reported in the paper, due to updates made for the challenge. Please refer to version v0.1.0 to reproduce the exact paper results.
Acknowledgements
We gratefully acknowledge the following repositories and thank the authors for their excellent work:
Citation
If you find this work useful in your research, please consider citing:
@inproceedings{hamann2024mousesis,
title={{MouseSIS}: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice},
author={Friedhelm Hamann and Hanxiong Li and Paul Mieske and Lars Lewejohann and Guillermo Gallego},
booktitle={European Conference on Computer Vision Workshops (ECCVW)},
year={2024}
}
Additional Resources
- Recording Software (CoCapture)
- Secrets of Event-Based Optical Flow (TPAMI 2024)
- EVILIP: Event-based Image Reconstruction as a Linear Inverse Problem (TPAMI 2022)
- Research page (TU Berlin, RIP lab)
- Science Of Intelligence Homepage
- Course at TU Berlin
- Survey paper
- List of Event-based Vision Resources
License
This project is licensed under the MIT License - see the LICENSE file for details.