Manuel Alejandro Díaz Zapata
I think maybe another network architecture would suit road segmentation tasks better. [This PyTorch repository](https://github.com/meetshah1995/pytorch-semseg) has some interesting code; I've worked with it and it is really intuitive. [This is...
Maybe to add robustness to the system, so it does not depend on all cameras for each segment of the final grid? That way you can also see how much the other...
Hello @VeeranjaneyuluToka. The idea here is that by using the intrinsics, extrinsics, predicted categorical depth, and sum pooling, you could theoretically project the features coming from any number...
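For the sum-pooling part, here is a minimal sketch of the idea (not the repo's exact implementation; the grid size, cell size, and function name are made up for the example): every lifted feature already has a 3D position, so you just compute which BEV cell it falls into and accumulate the features from all cameras into that cell with a scatter-add.

```python
import torch

def splat_to_bev(geom, feats, bev_size=200, cell=0.5, bound=50.0):
    """Sum-pool lifted camera features into a BEV grid.

    geom:  (N, 3) 3D points (x, y, z) in the ego frame, one per feature vector
    feats: (N, C) feature vectors coming from any number of cameras
    Returns a (C, bev_size, bev_size) BEV feature map.
    """
    C = feats.shape[1]
    # Convert metric x/y coordinates to integer grid indices
    ix = ((geom[:, 0] + bound) / cell).long()
    iy = ((geom[:, 1] + bound) / cell).long()

    # Keep only the points that land inside the grid
    keep = (ix >= 0) & (ix < bev_size) & (iy >= 0) & (iy < bev_size)
    ix, iy, feats = ix[keep], iy[keep], feats[keep]

    # Sum pooling: features from every camera that sees a cell are added together,
    # so losing one camera only leaves some cells emptier, it does not break the grid
    bev = torch.zeros(bev_size * bev_size, C)
    bev.index_add_(0, ix * bev_size + iy, feats)
    return bev.view(bev_size, bev_size, C).permute(2, 0, 1)

# e.g. 6 cameras * 10k lifted points each, with 64-dim features
geom = (torch.rand(60000, 3) - 0.5) * 100
feats = torch.rand(60000, 64)
print(splat_to_bev(geom, feats).shape)  # torch.Size([64, 200, 200])
```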
First, let me start from what I (and the authors, I think) mean by categorical depth. Imagine that you would like to predict what objects are in front of your...
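To make that concrete, here is a tiny example of what "categorical" means here (the bin range is just an example): instead of regressing one metric depth value per pixel, the network outputs a score for each of D predefined depth bins, and a softmax turns those scores into a distribution over depths.

```python
import torch

D = 41                                      # number of discrete depth bins
depth_bins = torch.linspace(4.0, 45.0, D)   # example bins: 4 m to 45 m

# Pretend these are the per-pixel depth logits coming from the network for one pixel
logits = torch.randn(D)
depth_dist = logits.softmax(dim=0)          # categorical depth: one probability per bin

# You can still read off an expected depth, but the whole distribution is kept,
# which lets the model hedge between "this pixel is at 10 m" and "at 30 m"
expected_depth = (depth_dist * depth_bins).sum()
print(expected_depth)
```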
> Hi man, have you understood the code for projecting 2D images to 3D BEV, because I have the same questions. How to make frustum shape reconstruction rather...
Because this is the BEV representation, not a camera image, so it represents the space around the vehicle.
The `get_geometry` function uses the intrinsic and extrinsic matrices of the camera together with the different defined depths. This way we can see where each pixel is looking using...
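As a rough sketch of what that projection does (a simplified version, not the exact `get_geometry` code; the camera matrices below are made up for the example): each pixel is turned into a viewing ray with the inverse intrinsics, placed at every candidate depth, and then moved into the ego frame with the extrinsics.

```python
import torch

def pixel_to_ego(u, v, depths, K, R, t):
    """Project one pixel to 3D ego-frame points, once per candidate depth.

    u, v:    pixel coordinates
    depths:  (D,) tensor of candidate depths in metres
    K:       (3, 3) camera intrinsics
    R, t:    (3, 3) rotation and (3,) translation of the camera in the ego frame
    """
    pix = torch.tensor([u, v, 1.0])
    ray = torch.linalg.inv(K) @ pix              # viewing ray in camera coordinates
    cam_pts = depths[:, None] * ray[None, :]     # one 3D point per depth bin
    return cam_pts @ R.T + t                     # rotate/translate into the ego frame

# Toy intrinsics/extrinsics just for the example
K = torch.tensor([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])
R, t = torch.eye(3), torch.tensor([1.5, 0.0, 1.6])
pts = pixel_to_ego(320.0, 240.0, torch.linspace(4.0, 45.0, 41), K, R, t)
print(pts.shape)  # torch.Size([41, 3])
```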
Take a look at this [paper](https://arxiv.org/pdf/1903.08960.pdf), where in equation 2 they describe how the projection is done from camera to 3D. In their case, they get the depth from a...
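For anyone skimming, the general pinhole back-projection that this kind of equation builds on looks roughly like this (written from memory in my own notation, so please check the paper for their exact formulation):

```latex
\begin{align}
  \mathbf{p}_{\text{cam}} &= z \, K^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}, \\
  \mathbf{p}_{\text{ego}} &= R \, \mathbf{p}_{\text{cam}} + \mathbf{t},
\end{align}
```

where $(u, v)$ is the pixel, $z$ the depth, $K$ the intrinsics, and $(R, \mathbf{t})$ the extrinsics.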
The context vector contains the features. Both of them are generated in the last layer of the camera encoder. Here you can see the context vector, called [`depth`](https://github.com/nv-tlabs/lift-splat-shoot/blob/master/src/models.py#L56), and the features...
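In case it helps, a minimal sketch of that split (shapes and layer names are just for illustration, not the repo's exact code): one conv head predicts `D + C` channels, the first `D` become the categorical depth after a softmax, the remaining `C` are the context/features, and the outer product of the two does the lifting.

```python
import torch
import torch.nn as nn

D, C = 41, 64                     # depth bins and context/feature channels
x = torch.rand(1, 512, 8, 22)     # pretend feature map from the camera encoder

# One conv head predicts depth logits and the context vector together
head = nn.Conv2d(512, D + C, kernel_size=1)
out = head(x)

depth = out[:, :D].softmax(dim=1)     # categorical depth distribution per pixel
context = out[:, D:D + C]             # the "context vector" = per-pixel features

# Lifting: the outer product places a weighted copy of the features at every depth bin
frustum_feats = depth.unsqueeze(1) * context.unsqueeze(2)
print(frustum_feats.shape)  # torch.Size([1, 64, 41, 8, 22])
```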
> Extract features from each input image sample by considering the last 2 blocks of efficientnet (reduction_5 and reduction_4), let's say they are x1 and x2.

Yes, that is what...
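Roughly, fusing those two blocks looks like this (a sketch with example channel counts and shapes, mirroring the idea rather than the exact layers): upsample the deeper block, concatenate it with the shallower one, and fuse them with a couple of convolutions.

```python
import torch
import torch.nn as nn

# Pretend outputs of the last two EfficientNet blocks for a 128x352 input:
x2 = torch.rand(1, 112, 8, 22)   # reduction_4 (higher resolution)
x1 = torch.rand(1, 320, 4, 11)   # reduction_5 (deeper, lower resolution)

# Upsample the deeper block, concatenate with the shallower one, then fuse with convs
up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
fuse = nn.Sequential(
    nn.Conv2d(320 + 112, 512, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(512, 512, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
)

x = fuse(torch.cat([up(x1), x2], dim=1))
print(x.shape)  # torch.Size([1, 512, 8, 22])
```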