chenrui17 issues

Results: 7 issues of chenrui17

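The nested for loop over a large batch_size and multiple detection classes makes the ATSS algorithm slow and GPU utilization low. How can performance and efficiency be improved?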

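As the title says: with batch_size = 15 and 6 detection classes, the ATSS algorithm is called 90 times in a row. Profiling shows that this part has low GPU utilization and accounts for about 30% of training time. How can it be optimized to improve performance and GPU utilization?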

Target assignment costs too much and GPU utilization is low. How can it be improved?

How can the target assignment operation be improved when batch_size = N (for example `15`) and the number of classes is 6 or more? The function `assign_target_single` will be called 90 times, and it's...

stale
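The 90 calls in the target-assignment question above come from looping over 15 samples × 6 classes. A minimal sketch of one possible direction, batching the per-(sample, class) work into a single call, is shown below. It covers only an IoU step, not a full ATSS assigner. Everything in it is an assumption for illustration: `pairwise_iou` is a hypothetical helper, boxes are 2D axis-aligned `(x1, y1, x2, y2)` rather than 3D boxes, NumPy stands in for GPU tensors, and each group is assumed to be padded to the same number of ground-truth boxes.

```python
import numpy as np


def pairwise_iou(anchors, gts):
    """IoU between anchors (A, 4) and padded ground-truth boxes (G, M, 4).

    Hypothetical helper. Boxes are (x1, y1, x2, y2). G is the number of
    (sample, class) groups and M the padded box count per group.
    Returns an array of shape (G, A, M).
    """
    a = anchors[None, :, None, :]  # (1, A, 1, 4)
    g = gts[:, None, :, :]         # (G, 1, M, 4)
    # Width and height of the intersection, clipped to zero when disjoint.
    wh = np.clip(
        np.minimum(a[..., 2:], g[..., 2:]) - np.maximum(a[..., :2], g[..., :2]),
        0,
        None,
    )
    inter = wh[..., 0] * wh[..., 1]
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_g = (g[..., 2] - g[..., 0]) * (g[..., 3] - g[..., 1])
    union = area_a + area_g - inter
    return inter / np.maximum(union, 1e-9)


batch_size, num_classes = 15, 6            # values from the issue
num_groups = batch_size * num_classes      # 90 (sample, class) pairs
num_anchors, max_gt = 100, 8               # assumed sizes for the demo

rng = np.random.default_rng(0)
a_xy = rng.uniform(0, 50, size=(num_anchors, 2))
a_wh = rng.uniform(1, 20, size=(num_anchors, 2))
anchors = np.concatenate([a_xy, a_xy + a_wh], axis=1)

g_xy = rng.uniform(0, 50, size=(num_groups, max_gt, 2))
g_wh = rng.uniform(1, 20, size=(num_groups, max_gt, 2))
gts = np.concatenate([g_xy, g_xy + g_wh], axis=2)

# Looped version: one call per (sample, class) pair, 90 calls in total.
looped = np.stack(
    [pairwise_iou(anchors, gts[i:i + 1])[0] for i in range(num_groups)]
)

# Batched version: a single call over all 90 groups at once.
batched = pairwise_iou(anchors, gts)

assert np.allclose(looped, batched)
```

Both versions produce the same IoU values. The batched call replaces many small operations with one large one, which is the usual way to raise GPU utilization, at the cost of memory for the padded `(G, A, M)` result.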

Why does the pointpillars net not have voxelization operation at the beginning?

I am a beginner in 3D object detection. I compared some PointPillars implementations from det3d, mmdetection3D, and PointPillars (https://github.com/zhulf0804), such as: mmdetection3d: **voxelization** -> PillarFeatureNet -> PointPillarsScatter -> SECONDFPN...

Does it currently support distributed multi-card training?

Will it be supported in the future? Currently, single-card training takes too much time.

Multi-GPU training result is incorrect. How can it be corrected?

My training result with a single GPU looks like this: ![image](https://user-images.githubusercontent.com/33319780/201667141-575cf44f-d8d7-4cab-8b8d-9ee258b82e8a.png) but the multi-GPU training result looks like this: ![image](https://user-images.githubusercontent.com/33319780/201667354-e2e249f0-5b63-48c9-9e47-f7291d56caa6.png) Can you help me see what's wrong in multi-GPU mode...

fuzzy_dedup OOM issue

**Describe the bug** Used 5*A100 GPUs to run a fuzzy_dedup task and encountered OOM issues. Here is the error info: ``` 2024-12-31 05:02:43,370 - distributed.worker - ERROR - Could not serialize object...

bug

jira

TRT inference performs poorly vs. PyTorch with DINO model

Trained model: [dino link](https://github.com/open-mmlab/mmdetection/blob/main/configs/dino/dino-5scale_swin-l_8xb2-12e_coco.py). First, use [mmdeploy](https://github.com/open-mmlab/mmdeploy) to convert the PyTorch model to ONNX format. Second, use the TRT builder to generate an engine. Finally, use the `execute_async_v2` method for inference, but result performance...

triaged