Implements Feature Pyramid Network
For https://github.com/mapbox/robosat/issues/60.
This changeset implements a Feature Pyramid Network (FPN) on top of a (potentially pre-trained) ResNet.
- Feature Pyramid Networks for Object Detection
- A Unified Architecture for Instance and Semantic Segmentation
The implementation tries to follow these two resources carefully.

from http://presentations.cocodataset.org/COCO17-Stuff-FAIR.pdf
Here is the overall design for the full architecture:
- The left-most bottom-up pathway is the ResNet with its layers. Every time it downsamples the spatial resolution by a factor of two, it doubles the number of feature maps.
- The lateral pathways are using `1x1` convolutions to transform the ResNet feature maps (of sizes 256, 512, 1024, 2048) into a fixed number of feature maps (configurable, 256 by default).
- The top-down pathways are then upsampling the feature maps by a factor of two again to get the spatial resolutions in sync for merging (adding) the lateral and top-down feature maps.
- For segmentation we then add `3x3` convolutions on top of the FPN feature maps, concatenate their outputs, and add a final convolution with number of classes in its output.
- We need to upsample the final output by a factor of four since we are starting with the ResNet features which are already downsampled in resolution by a factor of four.
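
To make the shapes in the design above concrete, here is a small pure-Python sketch that traces channel counts and spatial sizes through the pyramid. It is not the code in this PR and does not use PyTorch. The function name `trace_fpn_shapes`, the level names `C2`..`C5` and `P2`..`P5`, and the `head_channels=128` default are assumptions for illustration. It also assumes that every head output is brought to the stride-4 resolution before concatenation, which the description above does not spell out.

```python
def trace_fpn_shapes(size, num_classes, fpn_channels=256, head_channels=128):
    """Trace (channels, height, width) for a square input of the given size.

    Hypothetical helper: only does shape arithmetic, no tensors involved.
    """
    if size % 32 != 0:
        raise ValueError("size must be divisible by 32")

    resnet_channels = [256, 512, 1024, 2048]  # ResNet feature map counts from the description
    strides = [4, 8, 16, 32]                  # each stage halves the resolution again

    shapes = {}

    # Bottom-up pathway: the ResNet stages.
    for level, (channels, stride) in enumerate(zip(resnet_channels, strides), start=2):
        shapes[f"C{level}"] = (channels, size // stride, size // stride)

    # Lateral 1x1 convolutions plus top-down addition: same resolution per level,
    # but a fixed number of feature maps everywhere.
    for level, stride in enumerate(strides, start=2):
        shapes[f"P{level}"] = (fpn_channels, size // stride, size // stride)

    # 3x3 heads on every level, resized to stride 4 (assumption), then concatenated.
    shapes["concat"] = (len(strides) * head_channels, size // 4, size // 4)

    # Final convolution maps to the number of classes, still at stride 4.
    shapes["logits"] = (num_classes, size // 4, size // 4)

    # Upsample by a factor of four to get back to the input resolution.
    shapes["output"] = (num_classes, size, size)
    return shapes


for name, shape in trace_fpn_shapes(512, num_classes=2).items():
    print(name, shape)
```

The trace makes it easy to see that only the channel dimension changes in the lateral pathway, and that the final factor-of-four upsampling is what restores the input resolution.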
Got this error:
./rs weights --dataset config/dataset-building.toml
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/data/robosat/robosat/tools/__main__.py", line 5, in <module>
from robosat.tools import (
File "/data/robosat/robosat/tools/predict.py", line 17, in <module>
from robosat.fpn import FPNSegmenation
ImportError: cannot import name 'FPNSegmenation'
Typo? There are 4 lines containing `FPNSegmenation` in this PR.
Ah, that's what you get from a quick Friday evening refactor: silly mistakes :sweat_smile:
I just fixed it, give it another go! Sorry for the noise here.
Per https://github.com/mapbox/robosat/pull/104#pullrequestreview-144960551
- [x] Assert that the image resolution is divisible by 32 for the ResNet in the FPN
- [x] Rebase branch with `master`
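
For context on the divisible-by-32 item, here is a rough sketch of why the check matters. It is not the assertion added in this PR. The function names are hypothetical, and the size formula assumes torchvision-style padding, where every stride-2 layer maps a size `n` to `(n - 1) // 2 + 1`. When the input size is divisible by 32, every halving is exact, so upsampling a coarser map by two lands exactly on the next finer map. Otherwise the sizes can drift apart and the lateral and top-down maps can no longer be added.

```python
def resnet_stage_sizes(size):
    """Spatial sizes of the four ResNet stages (strides 4, 8, 16, 32)."""

    def halve(n):
        # Stride-2 conv or pool with torchvision-style padding (assumption).
        return (n - 1) // 2 + 1

    c2 = halve(halve(size))  # stem convolution followed by max pooling
    c3 = halve(c2)
    c4 = halve(c3)
    c5 = halve(c4)
    return [c2, c3, c4, c5]


def topdown_matches(size):
    """True if upsampling each coarser stage by two matches the next finer stage."""
    sizes = resnet_stage_sizes(size)
    return all(2 * coarse == fine for fine, coarse in zip(sizes, sizes[1:]))


def check_fpn_input(size):
    """Hypothetical guard in the spirit of the checklist item above."""
    assert size % 32 == 0, f"image size {size} must be divisible by 32"


print(resnet_stage_sizes(512), topdown_matches(512))
print(resnet_stage_sizes(500), topdown_matches(500))
```

With a 500 pixel input the stride-8 stage ends up with an odd size, so doubling the stride-16 map overshoots it by one pixel; failing early with an assertion gives a clearer error than a shape mismatch deep inside the forward pass.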
Next actions here
- [ ] benchmark training
- [ ] benchmark prediction
- [ ] look into spatial dropout
- [ ] change classifier from a simple conv to e.g. `conv1x1, bn, relu, dropout2d, conv1x1`
- [ ] format with black
- [ ] merge into master
- [ ] tag new major release
By now there are pre-trained resnet50-fpns in torchvision. If we want to stay with semantic segmentation we should try them and later potentially extend to instance segmentation on top.