Add ProPainter
What does this PR do?
This PR adds ProPainter, a video inpainting model (5.4k stars, 635 forks). It fixes #26360 and supersedes the stale PR #26391 for that issue, rebuilding the model from scratch to follow the Transformers standards.
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [x] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [x] Did you write any new necessary tests?
Who can review?
@amyeroberts @ArthurZucker @NielsRogge (?) @rafaelpadilla (as he was the initial reviewer on the stale PR)
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
The PR is more than ready for a first pass of review!
TODO (will be done in a jiffy :)):
- [x] Fix all common test failures
- [x] Update weights conversion scripts with the working one on local machine
- [x] Review batching nits one more time in the applicable files
- [x] Update docs in corresponding files
- [x] Check for video 'outpainting' error
Results:
Here are GIFs of the original video, the original model's output for object removal via video inpainting, and this PR's HF-ported model's output for the same task:
Original video:
Original model output:
HF ported model output:
Example usage is provided in the doc file here
(Sorry about the CI failures; they should be fixed now.)
Thank you @RUFFY-369 @ArthurZucker @amyeroberts @NielsRogge all for your hard work in bringing our ProPainter model to the Hugging Face Transformers repository! I really appreciate your efforts to make it more accessible to the community. Cheers!🎉🎉🎉
cc @amyeroberts @NielsRogge, I have addressed all the suggested changes that were mentioned. Please check them out. ~~The only thing left to add is the checkpoint conversion file, which will also be added soon.~~ Update: this has also been done.
Update: All the files I've modified are ready for review.
Hi @amyeroberts and @NielsRogge! When you have some time, could you please take another look at this PR? I've resolved your previous remarks and left open the ones where I had questions (some are updated and closed from my side).
Thanks in advance!
@RUFFY-369 I can see that there's still commits being pushed. Could you address the failing tests first, and then ping when ready for review? Let us know if there's any areas of the PR you have specific questions about or need any help with resolving the tests
@amyeroberts Thank you for your reply. Yeah, I was working on solving the failing CI tests in these commits. Meanwhile, I also did some refactoring by moving the remaining hard-coded values into the config file wherever the code allows.
I tagged you in one of the review conversations in this PR where I asked about the VideoProcessor.
I will ping you as soon as the tests are green :+1: and may ask for help if I get stuck in it
Hi @amyeroberts, I need your help with the ~~`test_processors` test which is failing [here](https://circleci.com/gh/huggingface/transformers/1382859); I think this isn't related to the PR but to the `main` branch.~~
Update: I noticed that it was fixed when I pushed my latest commits. :+1:
Hi @amyeroberts. All the tests are green except one, and I could use your advice and help with it, as it's the only one remaining.
`tests_torch` is failing with: `worker 'gw6' crashed while running 'tests/models/propainter/test_modeling_propainter.py::ProPainterModelTest::test_attention_outputs'`. I have changed various config attributes in the test modeling file in the above commits to make the model as light as possible, but the crash still happens.
If you could take a look, this last test would go green and the PR would be ready for review.
Thank you :smile:
cc: @NielsRogge , @ydshieh
still failing on the same test
@RUFFY-369 Are you able to run the test locally?
@amyeroberts Yes, all the tests run locally, and all of them pass.
My previous experience suggests it's likely OOM, but the resource usage log doesn't point that way.
I have changed various config attributes in the test modeling file in the above commits to make the model as light as possible, but it still happens.
I suggest you set a breakpoint, print the content of the model's config, and compare it to the default config's values.
It would also be helpful to save the (created) model and check its size, and/or print the model with `print(model)`.
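That check can be sketched as follows. A tiny stand-in module is used so the snippet runs on its own; in the PR, `model` would be the ProPainter model instantiated from the test's config:

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in for the model under test; in the PR this would be the
# ProPainter model built from the test config.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 3, 3))

# Parameter count as a quick proxy for memory footprint.
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params:,} parameters")

# Saved size on disk is another useful signal.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pt")
    torch.save(model.state_dict(), path)
    size_mb = os.path.getsize(path) / 1e6
print(f"{size_mb:.3f} MB on disk")

print(model)  # layer-by-layer structure
```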
@ydshieh Thank you for your reply. Initially, I skipped the test that was crashing the worker, but then the next test in the run also failed; you can see that in the latest test failures. It seems to be OOM, but when I checked the analytics, they didn't point in that direction.
I had already set the config attributes as low as possible in the test file. What points to it being OOM, in my opinion, is that all the tests pass without any OOM on my own system.
You can still print the default config and the used config in full form, and let's see what the differences are.
@ydshieh By "default config", do you mean the config defined in the test file, and by "model's config" the one we get from `model.config`, right? For example: this and this, respectively.
PS: If I understand correctly, here they are, respectively:
```
ProPainterConfig {
  "adversarial_weight": 0.01,
  "channels": [64, 96, 128],
  "corr_levels": 4,
  "corr_radius": 4,
  "dropout": 0.0,
  "flow_weight_flow_complete_net": 0.25,
  "gan_loss": "hinge",
  "hidden_size": 512,
  "hole_weight": 1.0,
  "in_channels": [64, 64, 96],
  "initializer_range": 0.02,
  "interp_mode": "nearest",
  "kernel_size": [7, 7],
  "kernel_size_3d": [1, 3, 3],
  "kernel_size_3d_discriminator": [3, 5, 5],
  "model_type": "propainter",
  "neighbor_length": 10,
  "no_dis": false,
  "norm_fn": ["batch", "group", "instance", "none"],
  "num_attention_heads": 1,
  "num_channels": 128,
  "num_hidden_layers": 2,
  "num_local_frames_flow_complete_net": 8,
  "num_local_frames_propainter": 8,
  "padding": 1,
  "padding_inpaint_generator": [3, 3],
  "patch_size": 3,
  "perceptual_weight": 0.0,
  "pool_size": [4, 4],
  "raft_iter": 20,
  "ref_stride": 10,
  "stride": [3, 3],
  "stride_3d": [1, 1, 1],
  "strides": [1, 2, 2],
  "subvideo_length": 80,
  "transformers_version": "4.45.0.dev0",
  "valid_weight": 1.0,
  "window_size": [5, 9]
}
```
```
ProPainterConfig {
  "adversarial_weight": 0.01,
  "channels": [64, 96, 128],
  "corr_levels": 4,
  "corr_radius": 4,
  "dropout": 0.0,
  "flow_weight_flow_complete_net": 0.25,
  "gan_loss": "hinge",
  "hidden_size": 512,
  "hole_weight": 1.0,
  "in_channels": [64, 64, 96],
  "initializer_range": 0.02,
  "interp_mode": "nearest",
  "kernel_size": [7, 7],
  "kernel_size_3d": [1, 3, 3],
  "kernel_size_3d_discriminator": [3, 5, 5],
  "model_type": "propainter",
  "neighbor_length": 10,
  "no_dis": false,
  "norm_fn": ["batch", "group", "instance", "none"],
  "num_attention_heads": 1,
  "num_channels": 128,
  "num_hidden_layers": 2,
  "num_local_frames_flow_complete_net": 8,
  "num_local_frames_propainter": 8,
  "padding": 1,
  "padding_inpaint_generator": [3, 3],
  "patch_size": 3,
  "perceptual_weight": 0.0,
  "pool_size": [4, 4],
  "raft_iter": 20,
  "ref_stride": 10,
  "stride": [3, 3],
  "stride_3d": [1, 1, 1],
  "strides": [1, 2, 2],
  "subvideo_length": 80,
  "transformers_version": "4.45.0.dev0",
  "valid_weight": 1.0,
  "window_size": [5, 9]
}
```
and they are identical
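Instead of eyeballing the two dumps, the comparison can be done programmatically. A minimal sketch on plain dicts (the real ones would come from `model.config.to_dict()` and the test config's `to_dict()`; the trimmed-down dicts below are illustrative stand-ins):

```python
# Hypothetical trimmed-down config dicts standing in for the two
# ProPainterConfig dumps above.
default_cfg = {"hidden_size": 512, "num_hidden_layers": 2, "patch_size": 3}
tested_cfg = {"hidden_size": 512, "num_hidden_layers": 2, "patch_size": 3}

# Keys whose values differ, or that exist on only one side.
diff = {
    k: (default_cfg.get(k), tested_cfg.get(k))
    for k in default_cfg.keys() | tested_cfg.keys()
    if default_cfg.get(k) != tested_cfg.get(k)
}
print(diff or "configs are identical")
```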
I did CPU profiling with the `torch.profiler.profile` context manager while computing the outputs at this line, and this was the result:
Name                                                    Self CPU %    Self CPU      CPU total %   CPU total     CPU time avg  # of Calls
----------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------
aten::uniform_ 46.82% 1.077s 46.82% 1.077s 29.121ms 37
cudaLaunchKernel 23.84% 548.694ms 23.84% 548.694ms 44.264us 12396
aten::convolution 0.18% 4.196ms 16.86% 387.925ms 352.019us 1102
aten::_convolution 0.44% 10.180ms 16.68% 383.729ms 348.212us 1102
aten::conv2d 0.11% 2.452ms 16.13% 371.109ms 350.765us 1058
aten::cudnn_convolution 8.29% 190.862ms 15.27% 351.484ms 318.951us 1102
aten::copy_ 4.51% 103.848ms 6.37% 146.647ms 152.758us 960
aten::pow 0.12% 2.851ms 3.05% 70.158ms 449.729us 156
aten::nonzero 0.02% 467.298us 2.31% 53.172ms 6.646ms 8
cudaFuncGetAttributes 2.19% 50.410ms 2.19% 50.410ms 3.151ms 16
aten::to 0.03% 776.032us 2.18% 50.269ms 74.142us 678
aten::_to_copy 0.09% 2.185ms 2.15% 49.493ms 115.908us 427
cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFla... 1.57% 36.145ms 1.57% 36.145ms 488.446us 74
cudaStreamSynchronize 1.51% 34.656ms 1.51% 34.656ms 150.025us 231
aten::cat 0.88% 20.281ms 1.39% 31.962ms 28.873us 1107
aten::remainder 0.00% 113.288us 1.29% 29.800ms 7.450ms 4
aten::linalg_vector_norm 0.09% 2.159ms 1.24% 28.519ms 250.165us 114
aten::softmax 0.01% 153.429us 1.24% 28.500ms 593.750us 48
aten::_softmax 0.03% 732.581us 1.23% 28.347ms 590.554us 48
aten::sub 0.41% 9.529ms 1.15% 26.554ms 34.219us 776
aten::mul 0.66% 15.154ms 1.14% 26.187ms 22.712us 1153
torchvision::deform_conv2d 0.21% 4.721ms 1.02% 23.512ms 489.830us 48
aten::grid_sampler 0.05% 1.086ms 1.02% 23.425ms 92.226us 254
aten::div 0.42% 9.729ms 1.01% 23.292ms 31.390us 742
aten::batch_norm 0.01% 203.242us 0.90% 20.699ms 344.990us 60
aten::_batch_norm_impl_index 0.01% 277.477us 0.89% 20.496ms 341.602us 60
aten::conv3d 0.01% 189.311us 0.85% 19.635ms 446.239us 44
aten::add_ 0.54% 12.523ms 0.80% 18.451ms 17.116us 1078
aten::flip 0.02% 406.855us 0.77% 17.812ms 614.206us 29
aten::eq 0.01% 260.818us 0.77% 17.728ms 1.477ms 12
aten::neg 0.01% 139.777us 0.67% 15.494ms 1.937ms 8
aten::add 0.45% 10.315ms 0.63% 14.571ms 19.454us 749
cudaOccupancyMaxActiveBlocksPerMultiprocessor 0.61% 14.033ms 0.61% 14.033ms 57.512us 244
aten::relu_ 0.04% 822.113us 0.60% 13.698ms 87.805us 156
aten::arange 0.09% 1.987ms 0.58% 13.420ms 32.893us 408
aten::instance_norm 0.01% 151.675us 0.57% 13.158ms 438.599us 30
aten::clamp_min_ 0.06% 1.343ms 0.56% 12.875ms 82.535us 156
aten::reshape 0.13% 3.072ms 0.55% 12.681ms 8.387us 1512
aten::native_batch_norm 0.04% 929.592us 0.54% 12.315ms 410.515us 30
aten::cudnn_grid_sampler 0.29% 6.769ms 0.53% 12.197ms 50.821us 240
aten::stack 0.09% 1.973ms 0.51% 11.804ms 32.340us 365
aten::any 0.01% 169.477us 0.51% 11.746ms 1.958ms 6
aten::round 0.00% 51.830us 0.49% 11.327ms 5.664ms 2
aten::empty 0.48% 11.011ms 0.48% 11.011ms 5.533us 1990
aten::clone 0.05% 1.213ms 0.46% 10.543ms 32.441us 325
aten::sum 0.06% 1.441ms 0.45% 10.242ms 160.027us 64
aten::min 0.00% 92.426us 0.44% 10.239ms 5.119ms 2
aten::grid_sampler_2d 0.01% 184.281us 0.44% 10.142ms 724.442us 14
aten::threshold 0.00% 61.058us 0.43% 9.929ms 4.964ms 2
aten::max 0.00% 73.069us 0.41% 9.524ms 4.762ms 2
aten::addmm_ 0.07% 1.546ms 0.41% 9.408ms 196.010us 48
aten::sigmoid 0.11% 2.637ms 0.40% 9.183ms 43.730us 210
aten::mean 0.02% 499.139us 0.39% 9.010ms 391.741us 23
aten::l1_loss 0.00% 48.459us 0.39% 8.907ms 1.485ms 6
aten::atan2 0.00% 58.627us 0.38% 8.689ms 4.345ms 2
aten::lt 0.02% 574.884us 0.33% 7.690ms 192.260us 40
aten::cudnn_batch_norm 0.14% 3.218ms 0.33% 7.652ms 255.079us 30
aten::leaky_relu_ 0.13% 3.002ms 0.31% 7.177ms 25.542us 281
aten::relu 0.04% 920.344us 0.30% 6.848ms 33.567us 204
aten::abs 0.01% 203.165us 0.29% 6.692ms 278.846us 24
aten::view 0.28% 6.346ms 0.28% 6.346ms 1.873us 3388
aten::clamp_min 0.21% 4.835ms 0.26% 5.927ms 29.056us 204
aten::empty_like 0.05% 1.259ms 0.26% 5.913ms 9.988us 592
aten::im2col 0.07% 1.716ms 0.25% 5.829ms 126.719us 46
aten::gelu 0.00% 111.354us 0.25% 5.818ms 1.455ms 4
aten::slice 0.18% 4.206ms 0.24% 5.575ms 2.796us 1994
aten::gather 0.01% 130.534us 0.24% 5.482ms 1.371ms 4
cudaMemcpyAsync 0.23% 5.251ms 0.23% 5.251ms 15.675us 335
aten::empty_strided 0.21% 4.737ms 0.22% 5.019ms 9.159us 548
aten::as_strided 0.20% 4.667ms 0.20% 4.667ms 0.763us 6115
aten::meshgrid 0.08% 1.845ms 0.20% 4.634ms 17.963us 258
cudaEventRecord 0.19% 4.471ms 0.19% 4.471ms 1.716us 2606
aten::upsample_nearest2d 0.01% 123.315us 0.19% 4.438ms 1.110ms 4
aten::pad 0.00% 113.255us 0.19% 4.429ms 170.363us 26
aten::select 0.14% 3.197ms 0.18% 4.136ms 2.631us 1572
cudaFree 0.18% 4.106ms 0.18% 4.106ms 513.260us 8
aten::tanh 0.08% 1.838ms 0.17% 4.018ms 30.436us 132
aten::where 0.03% 766.533us 0.17% 3.956ms 47.099us 84
aten::linear 0.01% 294.243us 0.17% 3.813ms 105.929us 36
aten::rsub 0.04% 906.408us 0.17% 3.802ms 23.324us 163
aten::contiguous 0.01% 152.726us 0.17% 3.800ms 32.478us 117
cudaMalloc 0.16% 3.783ms 0.16% 3.783ms 199.119us 19
aten::zeros 0.02% 379.029us 0.16% 3.779ms 26.240us 144
aten::fill_ 0.08% 1.895ms 0.15% 3.549ms 10.754us 330
aten::matmul 0.06% 1.395ms 0.15% 3.422ms 142.576us 24
aten::linspace 0.08% 1.903ms 0.14% 3.312ms 5.175us 640
aten::resize_ 0.11% 2.634ms 0.14% 3.222ms 5.692us 566
aten::roll 0.02% 538.465us 0.14% 3.185ms 33.179us 96
aten::zero_ 0.03% 693.011us 0.13% 3.042ms 15.212us 200
aten::narrow 0.06% 1.453ms 0.13% 2.973ms 4.504us 660
aten::type_as 0.01% 150.860us 0.13% 2.937ms 31.242us 94
aten::repeat 0.03% 640.731us 0.11% 2.458ms 53.427us 46
aten::index 0.05% 1.151ms 0.10% 2.210ms 42.503us 52
aten::expand 0.07% 1.670ms 0.10% 2.195ms 3.346us 656
aten::addmm 0.06% 1.447ms 0.09% 2.126ms 70.873us 30
aten::binary_cross_entropy_with_logits 0.00% 62.545us 0.08% 1.851ms 462.803us 4
aten::upsample_bilinear2d 0.01% 333.709us 0.08% 1.841ms 131.469us 14
aten::replication_pad3d 0.00% 71.759us 0.08% 1.832ms 915.865us 2
aten::permute 0.05% 1.233ms 0.07% 1.670ms 5.353us 312
aten::avg_pool2d 0.01% 130.910us 0.07% 1.666ms 277.674us 6
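For reference, a profile like the one above can be collected with a pattern roughly like the following (a stand-in module is used here so the snippet is self-contained; the real run wrapped the ProPainter forward pass and would also include CUDA activity):

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Stand-in for the model's forward pass.
model = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
x = torch.randn(1, 3, 32, 32)

# Profile CPU activity around a single inference call.
with profile(activities=[ProfilerActivity.CPU]) as prof:
    with torch.no_grad():
        model(x)

# Summarize the heaviest ops, as in the table above.
table = prof.key_averages().table(sort_by="cpu_time_total", row_limit=5)
print(table)
```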
I made some changes and pushed them above; they reduced CPU usage somewhat, but the worker still crashes :disappointed_relieved:
The CPU profile above was taken after the latest attempt at a fix. Before that, the main consumer, `aten::uniform_`, had the following share:
Name Self CPU % Self CPU CPU total % CPU total CPU time avg CPU Mem Self CPU Mem # of Calls
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
aten::uniform_ 53.62% 2.068s 53.62% 2.068s 55.886ms 0 b 0 b 37
For benchmarking purposes I checked the same with other models; for example, ViLT peaked at just 11.64% CPU usage, unlike the above.
cc @amyeroberts @ydshieh
Hi @RUFFY-369, thank you for trying ❤️ (and for running with `torch.profiler.profile` too!)
> with default config you mean the config which is defined in the test file and model's config is what we will get with model.config, right?
No. What I suggest to compare is:
- load the model from the Hub repository checkpoint (so the real model), and print its config: `real_model.config`
- the model created during a test, and print its config: `tested_model.config`
I can take a closer look if you can provide the above.
Hi @ydshieh, thank you for your reply, your help, and the solution you proposed ❤️. We got it done, cheers for that first of all :smiling_face_with_tear: :smile:. Secondly, it was indeed OOM causing the worker crash. I sorted it out by digging a little deeper, pinpointing the memory gobbler with `torch.profiler.profile`, and fixing it. This fix will help with general inference with the model as well.
cc @amyeroberts @NielsRogge
> @RUFFY-369 I can see that there's still commits being pushed. Could you address the failing tests first, and then ping when ready for review? Let us know if there's any areas of the PR you have specific questions about or need any help with resolving the tests
@amyeroberts All the tests are GREEN :green_circle: and the PR is ready for review :+1:
Glad it works!
Could you share what change fixed it?
Also we need to trigger the slow CI
Before merging this pull request, slow tests CI should be triggered. To enable this:
- Add the `run-slow` label to the PR (I did it now)
- When your PR is ready for merge and all reviewers' comments have been addressed, push an empty commit with the command `[run-slow]` followed by a comma-separated list of all the models to be tested, i.e. `[run-slow] model_to_test_1, model_to_test_2`
- A `transformers` maintainer will then approve the workflow to start the tests
> Glad it works!
>
> Could you share what change fixed it?
Yeah, so basically, when I looked back at the code with the profiler, I noticed all the consumption was happening in just one class: the one where the perceptual metric is created. In the original code, as here, pretrained VGG16 features from torchvision are used to calculate the perceptual loss. The worker crashed because in every mode, whether training or eval, those pretrained features were loaded, and that's 138M parameters :smiling_face_with_tear:. So I made changes here so that the pretrained features are only loaded when the model is used for training and the perceptual loss is actually needed. :heart:
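The shape of that fix can be sketched like this, with a tiny stand-in extractor instead of the real torchvision VGG16 (class and argument names here are illustrative, not the actual PR code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LazyPerceptualLoss(nn.Module):
    """Defer building the heavy feature extractor until a loss is
    actually requested, so pure inference never pays for it."""

    def __init__(self, build_extractor):
        super().__init__()
        self._build = build_extractor  # factory, called lazily
        self.extractor = None

    def forward(self, pred, target):
        if self.extractor is None:
            # In the PR, this is where the pretrained VGG16 features
            # (~138M parameters) would be instantiated.
            self.extractor = self._build()
        return F.l1_loss(self.extractor(pred), self.extractor(target))


# Tiny stand-in extractor for illustration.
loss_mod = LazyPerceptualLoss(lambda: nn.Conv2d(3, 8, 3, padding=1).eval())

assert loss_mod.extractor is None  # nothing heavy is loaded at init
x = torch.randn(1, 3, 16, 16)
loss = loss_mod(x, x)
assert loss_mod.extractor is not None  # built only on first use
```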
> Also we need to trigger the slow CI
>
> Before merging this pull request, slow tests CI should be triggered. To enable this:
>
> - Add the `run-slow` label to the PR (I did it now)
Thank you for adding.
> Before merging this pull request, slow tests CI should be triggered. To enable this:
>
> - When your PR is ready for merge and all reviewers' comments have been addressed, push an empty commit with the command `[run-slow]` followed by a comma-separated list of all the models to be tested, i.e. `[run-slow] model_to_test_1, model_to_test_2`
> - A `transformers` maintainer will then approve the workflow to start the tests
Okay I will get this done when the PR is ready to merge and the reviews have been addressed completely. Thank you for mentioning :smile:
@ydshieh I think you triggered a build run for the PR docs and it failed. I addressed that in the latest commit, though. So, is that PR docs build also meant to be run only once all the reviews are addressed?
> So, is that trigger run regarding PR docs also meant to be run when all the reviews are addressed?
We prefer to trigger (some) CI/build jobs when the PR is (almost) ready, but it's not 100% strict :-). I can trigger the PR doc-building job again.
(Also, since this PR is about a new model, we don't need the 2nd step, a commit with the command `[run-slow] ...`.)
> (also since this PR is about a new model, we don't need the 2nd step, a commit with the command `[run-slow] ...`)
Okay, noted :+1:
> > So, is that trigger run regarding PR docs also meant to be run when all the reviews are addressed?
>
> We prefer to trigger (some) CIs/building jobs when the PR is (almost) ready, but it's not 100% strict :-). I can trigger the PR doc building job again.
@ydshieh Thank you for the triggered build run. Just one question: right now, two of the jobs are failing, so can we address those build failures, after triggering the jobs again, once the final review has been done and addressed? Because even if I push fixes now, you'd have to re-trigger repeatedly to check them, which would be time-consuming and inefficient for you. Also, the code may change depending on the review.
> can we address these build fails after triggering the jobs when the final review has been done and addressed

For sure!
soft ping @molbap Thank you