MouseSIS: Space-Time Instance Segmentation of Mice
This is the official repository for MouseSIS: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice by Friedhelm Hamann, Hanxiong Li, Paul Mieske, Lars Lewejohann and Guillermo Gallego, accepted at the Workshop on Neuromorphic Vision held in conjunction with ECCV 2024.
👀 This dataset is the basis for the SIS Challenge hosted in conjunction with the CVPR 2025 Workshop on Event-based Vision.
🏆 View the complete challenge results and winning teams here!
Key Features
- Space-time instance segmentation dataset focused on mice tracking
- Combined frames and event data from a neuromorphic vision sensor
- 33 sequences (~20 seconds each, ~600 frames per sequence)
- YouTubeVIS-style annotations
- Baseline implementation and evaluation metrics included
Timeline
- SIS challenge results (June 2025): We released the results of the SIS challenge and presented them at the CVPR'25 Workshop on Event-based Vision.
- v1.0.0 (Current, February 2024): Major refactoring and updates, including improved documentation.
- v0.1.0 (September 2023): Initial release with basic functionality and dataset.
Table of Contents
- Quickstart
- Installation
- Data Preparation
- Data
- Pretrained Weights
- Preprocess Events
- Evaluation
- Quickstart Evaluation
- Evaluation on Full Validation Set
- Evaluation on Test Set
- Acknowledgements
- Citation
- Additional Resources
- License
Quickstart
If you want to work with the dataset, the quickest way to access the data and get an idea of its structure is to download one sequence plus the annotations of the corresponding split and visualize them, e.g. for seq12.h5:
python scripts/visualize_events_frames_and_masks.py --h5_path data/MouseSIS/top/val/seq12.h5 --annotation_path data/MouseSIS/val_annotations.json
This requires h5py, numpy, Pillow, and tqdm. The full dataset structure is explained in the Data Preparation section below.
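If you prefer to inspect a sequence file directly, the minimal sketch below lists the datasets it contains. It assumes only that the file is a standard HDF5 container; dataset names and shapes are printed as found rather than hard-coded (adjust the path to the sequence you downloaded).

# Minimal sketch: inspect the contents of one MouseSIS sequence file.
import h5py

with h5py.File("data/MouseSIS/top/val/seq12.h5", "r") as f:
    def describe(name, obj):
        # Print every dataset found in the file, with its shape and dtype.
        if isinstance(obj, h5py.Dataset):
            print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")
    f.visititems(describe)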
Installation
- Clone the repository:
  git clone git@github.com:tub-rip/MouseSIS.git
  cd MouseSIS
- Set up the environment:
  conda create --name MouseSIS python=3.8
  conda activate MouseSIS
- Install PyTorch (choose a command compatible with your CUDA version from the PyTorch website), e.g.:
  conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
- Install other dependencies:
  pip install -r requirements.txt
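To confirm that the PyTorch installation matches your CUDA setup, a quick check along these lines can help (a sanity-check snippet, not part of the repository):

# Sanity check of the Python environment (hypothetical helper, not part of the repo).
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))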
Data Preparation
Data
- Create a folder for the original data:
  cd <project-root>
  mkdir -p data/MouseSIS
- Download the data and annotations and save them in <project-root>/data/MouseSIS. You do not need to download the whole dataset; for example, you can download only the sequences you want to evaluate on. The data/MouseSIS folder should be organized as follows:

data/MouseSIS
│
├── top/
│   ├── train
│   │   ├── seq_02.hdf5
│   │   ├── seq_05.hdf5
│   │   ├── ...
│   │   └── seq_33.hdf5
│   ├── val
│   │   ├── seq_03.hdf5
│   │   ├── seq_04.hdf5
│   │   ├── ...
│   │   └── seq_25.hdf5
│   └── test
│       ├── seq_01.hdf5
│       ├── seq_07.hdf5
│       ├── ...
│       └── seq_32.hdf5
├── dataset_info.csv
├── val_annotations.json
└── train_annotations.json

- top/: Contains the frame and event data of the mouse dataset captured from the top view, stored as 33 individual .hdf5 files, each holding approximately 20 seconds of data (around 600 frames) along with temporally aligned events.
- dataset_info.csv: Contains metadata for each sequence, such as recording dates, providing additional context and details about the dataset.
- <split>_annotations.json: The annotation file of the top view for the respective split follows a structure similar to MSCOCO's JSON format, with some modifications. Note that the test annotations are not publicly available. The JSON files are defined as follows:
{ "info": { "description": "string", // Dataset description "version": "string", // Version identifier "date_created": "string" // Creation timestamp }, "videos": [ { "id": "string", // Video identifier (range: "01" to "33") "width": integer, // Frame width in pixels (1280) "height": integer, // Frame height in pixels (720) "length": integer // Total number of frames } ], "annotations": [ { "id": integer, // Unique instance identifier "video_id": "string", // Reference to parent video "category_id": integer, // Object category (1 = mouse) "segmentations": [ { "size": [height: integer, width: integer], // Mask dimensions "counts": "string" // RLE-encoded segmentation mask } ], "areas": [float], // Object area in pixels "bboxes": [ // Bounding box coordinates [x_min: float, y_min: float, width: float, height: float] ], "iscrowd": integer // Crowd annotation flag (0 or 1) } ], "categories": [ { "id": integer, // Category identifier "name": "string", // Category name "supercategory": "string" // Parent category } ] }
Pretrained Weights
Download the model weights:
cd <project-root>
mkdir models
# Download yolo_e2vid.pt, yolo_frame.pt, and XMem.pth from the provided link
# and place them in the models directory
Afterwards, the models folder should be organized as follows:
models
├── XMem.pth
├── yolo_e2vid.pt
└── yolo_frame.pt
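A small check like the following (a hypothetical helper, not part of the repository) verifies that all three weight files ended up in the right place:

# Hypothetical check that the expected model weights are present.
from pathlib import Path

expected = ["XMem.pth", "yolo_e2vid.pt", "yolo_frame.pt"]
missing = [name for name in expected if not (Path("models") / name).is_file()]
print("All model weights found." if not missing else f"Missing: {missing}")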
Preprocess Events
This preprocessing step is required only when evaluating the ModelMixSort method from the paper, which relies on E2VID images reconstructed at the grayscale frame timestamps.
python scripts/preprocess_events_to_e2vid_images.py --data_root data/MouseSIS
Evaluation
After downloading the data and model weights, proceed with the evaluation. First, run inference, e.g. using the provided inference script:
python3 scripts/inference.py --config <path-to-config-yaml>
This saves a file output/<tracker-name>/final_results.json. The file contains the predictions in this structure:
[
{
"video_id": int,
"score": float,
"instance_id": int,
"category_id": int,
"segmentations": [
null | {
"size": [int, int],
"counts": "RLE encoded string"
},
...
],
},
...
]
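To sanity-check a predictions file before evaluation, a small sketch along these lines (a hypothetical helper, not part of the repository) summarizes it; replace the tracker name with the one from your config:

# Hypothetical helper: summarize a final_results.json predictions file.
import json
from collections import Counter

with open("output/quickstart/final_results.json") as f:  # adjust <tracker-name>
    preds = json.load(f)

print(len(preds), "predicted instance tracks")
for video_id, n in sorted(Counter(p["video_id"] for p in preds).items()):
    print(f"video {video_id}: {n} tracks")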
Then run the evaluation script like this:
python scripts/eval.py --TRACKERS_TO_EVAL <tracker-name> --SPLIT_TO_EVAL <split-name>
The specific options are listed below.
Quickstart Evaluation
This section describes how to run a minimal evaluation workflow on one sequence of the validation set. Download only the sequence seq_25.hdf5 from the validation set and the corresponding annotations val_annotations.json. The resulting folder should look as follows:
data/MouseSIS
│
├── top/
│   ├── val
│   │   └── seq_25.hdf5
└── val_annotations.json
Now you can run inference as
python3 scripts/inference.py --config configs/predict/quickstart.yaml
and then evaluation as
python scripts/eval.py --TRACKERS_TO_EVAL quickstart --SPLIT_TO_EVAL val
This should return the following results:
| Sequence | HOTA | MOTA | IDF1 |
|---|---|---|---|
| 25 | 30.15 | 39.125 | 35.315 |
| Avg. | 30.15 | 39.125 | 35.315 |
Evaluation on Full Validation Set
Proceed as for the quickstart, but download all sequences of the validation set (sequences 3, 4, 12, and 25).
python3 scripts/inference.py --config configs/predict/combined_on_validation.yaml
python scripts/eval.py --TRACKERS_TO_EVAL combined_on_validation --SPLIT_TO_EVAL val
You should get the following results:
| Sequence | HOTA | MOTA | IDF1 |
|---|---|---|---|
| 3 | 54.679 | 72.432 | 60.212 |
| 4 | 51.717 | 64.942 | 58.36 |
| 12 | 39.497 | 66.049 | 45.431 |
| 25 | 30.15 | 39.125 | 35.315 |
| Avg. | 45.256 | 62.097 | 50.459 |
Evaluation on Test Set Without Sequences 1 & 7 (SIS Challenge)
In this case, download all test sequences and run
python3 scripts/inference.py --config configs/predict/sis_challenge_baseline.yaml
For evaluation, you can upload the final_results.json to the challenge/benchmark page, which yields the following combined metrics:
| Sequence | HOTA | MOTA | IDF1 |
|---|---|---|---|
| Avg. | 0.43 | 0.45 | 0.5 |
Please note that the results differ slightly from those reported in the paper, due to updates made for the challenge. Please refer to version v0.1.0 to reproduce the exact paper results.
Acknowledgements
We gratefully acknowledge the following repositories and thank the authors for their excellent work:
Citation
If you find this work useful in your research, please consider citing:
@inproceedings{hamann2024mousesis,
title={{MouseSIS}: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice},
author={Friedhelm Hamann and Hanxiong Li and Paul Mieske and Lars Lewejohann and Guillermo Gallego},
booktitle={European Conference on Computer Vision Workshops (ECCVW)},
year={2024}
}
Additional Resources
- Recording Software (CoCapture)
- Secrets of Event-Based Optical Flow (TPAMI 2024)
- EVILIP: Event-based Image Reconstruction as a Linear Inverse Problem (TPAMI 2022)
- Research page (TU Berlin, RIP lab)
- Science Of Intelligence Homepage
- Course at TU Berlin
- Survey paper
- List of Event-based Vision Resources
License
This project is licensed under the MIT License - see the LICENSE file for details.