pnnet
pnnet copied to clipboard
Conjoined Triple Deep Network for Learning Local Image Descriptors
- NEW!! Check the [[https://github.com/vbalnt/tfeat][TFeat descriptor]] from BMVC 2016.
Results should be much better than PN-Net introduced below
- PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors
[[http://arxiv.org/abs/1601.05030][Arxiv Page]]
** Intro Code for the arxiv paper [[http://arxiv.org/pdf/1601.05030v1][PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors]].
The network extracts feature descriptors from grayscale local patches of size =32x32=.
** Architecture #+begin_src bash nn.Sequential { [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> output] (1): cudnn.SpatialConvolution(1 -> 32, 7x7) (2): cudnn.Tanh (3): cudnn.SpatialMaxPooling(2,2,2,2) (4): cudnn.SpatialConvolution(32 -> 64, 6x6) (5): cudnn.Tanh (6): nn.View (7): nn.Linear(4096 -> 128) (8): cudnn.Tanh } #+end_src
** Implementation details For optimization details refer to the arxiv publication. Training code is now also available.
** Getting the datasets in torch format
Get the Phototour datasets in .t7 format from [[https://github.com/vbalnt/UBC-Phototour-Patches-Torch][UBC-Phototour-Patches-Torch]] and extract =liberty= =yosemite= and =notredame=
** How to run the evaluation code
Run =th eval.lua=
The script will print a series of evaluation results for the patch pairs from the =notredame= =100k= dataset e.g.
#+begin_src bash 1 4.3915 0 10.6367 1 5.6122 0 10.5520 0 10.4561 0 10.1167 0 9.8624 1 3.2972 1 2.1507 1 2.9709 1 3.4437 1 7.6362 #+end_src
The first column contains the patch pair label (0 negative, 1 positive), the second row contains the L2 distance between the two patches of the pair, based on the features extracted from the last layer of our CNN.
From this output, one can compute the ROC curve (e.g. [[https://github.com/vbalnt/cppROC][cppROC]]).
** How to train
In the =train= folder, run =th run.lua=.
When training with =1.2M= triplets on a =GTX TITAN X=, each epoch takes approximately /2mins/
Examples of the training triplets used [[./triplets.png]]
** Positive matching examples Examples of positive nearest neighbour patch matching using the pnnet descriptor in the [[http://www.robots.ox.ac.uk/~vgg/research/affine/][Oxford matching dataset]].
[[./true_positives.png]]
** Efficiency Efficiency comparison with [[https://github.com/hanxf/matchnet][MatchNet]] and [[https://github.com/szagoruyko/cvpr15deepcompare][deepcompare]], both from CVPR 2015. For more results refer to the arxiv paper.
[[./efficiency.png]]