OpenPCDet custom dataset: batch dimension problem during training

Hello,

I’ve set up a custom point cloud dataset in KITTI format by following the OpenPCDet Custom Dataset Tutorial. However, I am experiencing issues related to the batch dimension during training.

Steps taken: I initially suspected that the issue might be caused by uneven batch sizes, so I added the drop_last parameter to the DataLoader, but that didn't resolve the problem. I also considered that it might be caused by an uneven number of objects in each scene, so I filtered my dataset to ensure even numbers of objects per scene. Finally, I turned off shuffling in the DataLoader and added filename printing to log the specific file causing the error.

Observations: The error occurs seemingly at random after training has been running for a while. It is not tied to a specific file or scene: the same files that trigger the error had already passed through several times without issue earlier in the run.

Question: Could anyone explain why this might be happening? Are there any suggestions on how to fix or debug this problem further?
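As a quick sanity check on the drop_last step, here is a minimal sketch of the batch arithmetic, assuming the loader batches samples sequentially the way PyTorch's DataLoader does. The helper names `num_batches` and `last_batch_size` are hypothetical and only for illustration; the numbers 18 and 2 are taken from the log below ("Total samples for CUSTOM dataset: 18", "batch_size 2").

```python
import math


def num_batches(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Batches per epoch for a sequential loader (hypothetical helper)."""
    if drop_last:
        # Incomplete trailing batch is discarded.
        return num_samples // batch_size
    # Incomplete trailing batch is kept.
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Size of the final batch of an epoch, or 0 if there are no batches."""
    if num_batches(num_samples, batch_size, drop_last) == 0:
        return 0
    remainder = num_samples % batch_size
    if drop_last or remainder == 0:
        return batch_size
    return remainder


# Values from the log: 18 samples, batch size 2.
for drop_last in (False, True):
    print(
        f"drop_last={drop_last}: "
        f"{num_batches(18, 2, drop_last)} batches, "
        f"last batch has {last_batch_size(18, 2, drop_last)} samples"
    )
# drop_last=False: 9 batches, last batch has 2 samples
# drop_last=True: 9 batches, last batch has 2 samples
```

Both settings give 9 full batches of 2, which matches the 0/9 to 8/9 iteration counters in the log. Under this assumption, drop_last would not change the shape of any batch for this dataset, which is consistent with it having no effect on the error.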

Thank you for your assistance!

2024-09-10 14:15:20,896 INFO **********************Start logging********************** 2024-09-10 14:15:20,896 INFO CUDA_VISIBLE_DEVICES=ALL 2024-09-10 14:15:20,896 INFO Training with a single process 2024-09-10 14:15:20,896 INFO cfg_file cfgs/custom_models/pv_rcnn.yaml 2024-09-10 14:15:20,896 INFO batch_size 2 2024-09-10 14:15:20,896 INFO epochs 80 2024-09-10 14:15:20,896 INFO workers 4 2024-09-10 14:15:20,896 INFO extra_tag default 2024-09-10 14:15:20,896 INFO ckpt None 2024-09-10 14:15:20,896 INFO pretrained_model None 2024-09-10 14:15:20,896 INFO launcher none 2024-09-10 14:15:20,896 INFO tcp_port 18888 2024-09-10 14:15:20,897 INFO sync_bn False 2024-09-10 14:15:20,897 INFO fix_random_seed False 2024-09-10 14:15:20,897 INFO ckpt_save_interval 1 2024-09-10 14:15:20,897 INFO local_rank None 2024-09-10 14:15:20,897 INFO max_ckpt_save_num 30 2024-09-10 14:15:20,897 INFO merge_all_iters_to_one_epoch False 2024-09-10 14:15:20,897 INFO set_cfgs None 2024-09-10 14:15:20,897 INFO max_waiting_mins 0 2024-09-10 14:15:20,897 INFO start_epoch 0 2024-09-10 14:15:20,897 INFO num_epochs_to_eval 0 2024-09-10 14:15:20,897 INFO save_to_file False 2024-09-10 14:15:20,897 INFO use_tqdm_to_record False 2024-09-10 14:15:20,897 INFO logger_iter_interval 50 2024-09-10 14:15:20,897 INFO ckpt_save_time_interval 300 2024-09-10 14:15:20,897 INFO wo_gpu_stat False 2024-09-10 14:15:20,897 INFO use_amp False 2024-09-10 14:15:20,897 INFO cfg.ROOT_DIR: /home/ubuntu/v-detr/OpenPCDet 2024-09-10 14:15:20,897 INFO cfg.LOCAL_RANK: 0 2024-09-10 14:15:20,897 INFO cfg.CLASS_NAMES: ['Antenne4G', 'Antenne5G'] 2024-09-10 14:15:20,897 INFO ----------- MAP_CLASS_TO_KITTI ----------- 2024-09-10 14:15:20,897 INFO cfg.MAP_CLASS_TO_KITTI.Vehicle: Antenne4G 2024-09-10 14:15:20,897 INFO cfg.MAP_CLASS_TO_KITTI.Pedestrian: Antenne5G 2024-09-10 14:15:20,897 INFO ----------- DATA_CONFIG ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATASET: CustomDataset 2024-09-10 14:15:20,897 INFO 
cfg.DATA_CONFIG.DATA_PATH: ../data/custom 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.BASE_CONFIG: tools/cfgs/dataset_configs/custom_dataset.yaml 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.POINT_CLOUD_RANGE: [-8880, -16, 0, 96, 2684, 480] 2024-09-10 14:15:20,897 INFO ----------- DATA_SPLIT ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_SPLIT.train: train 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_SPLIT.test: val 2024-09-10 14:15:20,897 INFO ----------- INFO_PATH ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.INFO_PATH.train: ['custom_infos_train.pkl'] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.INFO_PATH.test: ['custom_infos_val.pkl'] 2024-09-10 14:15:20,897 INFO ----------- POINT_FEATURE_ENCODING ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.POINT_FEATURE_ENCODING.encoding_type: absolute_coordinates_encoding 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.POINT_FEATURE_ENCODING.used_feature_list: ['x', 'y', 'z'] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.POINT_FEATURE_ENCODING.src_feature_list: ['x', 'y', 'z'] 2024-09-10 14:15:20,897 INFO ----------- DATA_AUGMENTOR ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_AUGMENTOR.DISABLE_AUG_LIST: ['placeholder'] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_AUGMENTOR.AUG_CONFIG_LIST: [{'NAME': 'gt_sampling', 'USE_ROAD_PLANE': False, 'DB_INFO_PATH': ['custom_dbinfos_train.pkl'], 'PREPARE': {}, 'SAMPLE_GROUPS': ['Antenne4G:5', 'Antenne5G:15'], 'NUM_POINT_FEATURES': 3, 'DATABASE_WITH_FAKELIDAR': False, 'REMOVE_EXTRA_WIDTH': [0.0, 0.0, 0.0], 'LIMIT_WHOLE_SCENE': True}, {'NAME': 'random_world_flip', 'ALONG_AXIS_LIST': ['x', 'y']}, {'NAME': 'random_world_rotation', 'WORLD_ROT_ANGLE': [-0.78539816, 0.78539816]}, {'NAME': 'random_world_scaling', 'WORLD_SCALE_RANGE': [0.95, 1.05]}] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_PROCESSOR: [{'NAME': 'mask_points_and_boxes_outside_range', 'REMOVE_OUTSIDE_BOXES': True}, {'NAME': 'shuffle_points', 'SHUFFLE_ENABLED': 
{'train': True, 'test': False}}, {'NAME': 'transform_points_to_voxels', 'VOXEL_SIZE': [560, 168.75, 12], 'MAX_POINTS_PER_VOXEL': 5, 'MAX_NUMBER_OF_VOXELS': {'train': 5000, 'test': 1000}}] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.CLASS_NAMES: ['Antenne4G', 'Antenne5G'] 2024-09-10 14:15:20,897 INFO ----------- OPTIMIZATION ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.BATCH_SIZE_PER_GPU: 2 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.NUM_EPOCHS: 80 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.OPTIMIZER: adam_onecycle 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.LR: 0.003 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.WEIGHT_DECAY: 0.01 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.MOMENTUM: 0.9 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.MOMS: [0.95, 0.85] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.PCT_START: 0.4 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.DIV_FACTOR: 10 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.DECAY_STEP_LIST: [35, 45] 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.LR_DECAY: 0.1 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.LR_CLIP: 1e-07 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.LR_WARMUP: False 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.WARMUP_EPOCH: 1 2024-09-10 14:15:20,898 INFO ----------- MAP_CLASS_TO_KITTI ----------- 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.MAP_CLASS_TO_KITTI.Vehicle: Antenne4G 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.MAP_CLASS_TO_KITTI.Pedestrian: Antenne5G 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG._BASE_CONFIG_: ../tools/cfgs/dataset_configs/custom_dataset.yaml 2024-09-10 14:15:20,898 INFO ----------- MODEL ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.NAME: PVRCNN 2024-09-10 14:15:20,898 INFO ----------- VFE ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.VFE.NAME: MeanVFE 2024-09-10 14:15:20,898 INFO 
----------- BACKBONE_3D ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_3D.NAME: VoxelBackBone8x 2024-09-10 14:15:20,898 INFO ----------- MAP_TO_BEV ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.MAP_TO_BEV.NAME: HeightCompression 2024-09-10 14:15:20,898 INFO cfg.MODEL.MAP_TO_BEV.NUM_BEV_FEATURES: 256 2024-09-10 14:15:20,898 INFO ----------- BACKBONE_2D ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.NAME: BaseBEVBackbone 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.LAYER_NUMS: [5, 5] 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.LAYER_STRIDES: [1, 2] 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.NUM_FILTERS: [64, 64] 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.UPSAMPLE_STRIDES: [1, 2] 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.NUM_UPSAMPLE_FILTERS: [256, 256] 2024-09-10 14:15:20,898 INFO ----------- DENSE_HEAD ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.NAME: AnchorHeadSingle 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.CLASS_AGNOSTIC: False 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.USE_DIRECTION_CLASSIFIER: True 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.DIR_OFFSET: 0.78539 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.DIR_LIMIT_OFFSET: 0.0 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.NUM_DIR_BINS: 2 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.ANCHOR_GENERATOR_CONFIG: [{'class_name': 'Antenne4G', 'anchor_sizes': [[0.5, 0.36, 2.64]], 'anchor_rotations': [0, 1.57], 'anchor_bottom_heights': [0], 'align_center': False, 'feature_map_stride': 8, 'matched_threshold': 0.55, 'unmatched_threshold': 0.4}, {'class_name': 'Antenne5G', 'anchor_sizes': [[0.4, 0.3, 1]], 'anchor_rotations': [0, 1.57], 'anchor_bottom_heights': [0], 'align_center': False, 'feature_map_stride': 8, 'matched_threshold': 0.5, 'unmatched_threshold': 0.35}] 2024-09-10 14:15:20,898 INFO ----------- TARGET_ASSIGNER_CONFIG ----------- 2024-09-10 14:15:20,898 INFO 
cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.NAME: AxisAlignedTargetAssigner 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.POS_FRACTION: -1.0 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.SAMPLE_SIZE: 512 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.NORM_BY_NUM_EXAMPLES: False 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.MATCH_HEIGHT: False 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.BOX_CODER: ResidualCoder 2024-09-10 14:15:20,898 INFO ----------- LOSS_CONFIG ----------- 2024-09-10 14:15:20,898 INFO ----------- LOSS_WEIGHTS ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.cls_weight: 1.0 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.loc_weight: 2.0 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.dir_weight: 0.2 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.code_weights: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] 2024-09-10 14:15:20,898 INFO ----------- PFE ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.PFE.NAME: VoxelSetAbstraction 2024-09-10 14:15:20,898 INFO cfg.MODEL.PFE.POINT_SOURCE: raw_points 2024-09-10 14:15:20,898 INFO cfg.MODEL.PFE.NUM_KEYPOINTS: 4096 2024-09-10 14:15:20,898 INFO cfg.MODEL.PFE.NUM_OUTPUT_FEATURES: 128 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SAMPLE_METHOD: FPS 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.FEATURES_SOURCE: ['bev', 'x_conv3', 'x_conv4'] 2024-09-10 14:15:20,899 INFO ----------- SA_LAYER ----------- 2024-09-10 14:15:20,899 INFO ----------- raw_points ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.raw_points.MLPS: [[16, 16], [16, 16]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.raw_points.POOL_RADIUS: [0.4, 0.8] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.raw_points.NSAMPLE: [16, 16] 2024-09-10 14:15:20,899 INFO ----------- x_conv1 
----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv1.DOWNSAMPLE_FACTOR: 1 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv1.MLPS: [[16, 16], [16, 16]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv1.POOL_RADIUS: [0.4, 0.8] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv1.NSAMPLE: [16, 16] 2024-09-10 14:15:20,899 INFO ----------- x_conv2 ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv2.DOWNSAMPLE_FACTOR: 2 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv2.MLPS: [[32, 32], [32, 32]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv2.POOL_RADIUS: [0.8, 1.2] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv2.NSAMPLE: [16, 32] 2024-09-10 14:15:20,899 INFO ----------- x_conv3 ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv3.DOWNSAMPLE_FACTOR: 4 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv3.MLPS: [[64, 64], [64, 64]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv3.POOL_RADIUS: [1.2, 2.4] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv3.NSAMPLE: [16, 32] 2024-09-10 14:15:20,899 INFO ----------- x_conv4 ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv4.DOWNSAMPLE_FACTOR: 8 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv4.MLPS: [[64, 64], [64, 64]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv4.POOL_RADIUS: [2.4, 4.8] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv4.NSAMPLE: [16, 32] 2024-09-10 14:15:20,899 INFO ----------- POINT_HEAD ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.NAME: PointHeadSimple 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.CLS_FC: [256, 256] 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.CLASS_AGNOSTIC: True 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.USE_POINT_FEATURES_BEFORE_FUSION: True 2024-09-10 14:15:20,899 INFO ----------- TARGET_CONFIG ----------- 2024-09-10 14:15:20,899 INFO 
cfg.MODEL.POINT_HEAD.TARGET_CONFIG.GT_EXTRA_WIDTH: [0.2, 0.2, 0.2] 2024-09-10 14:15:20,899 INFO ----------- LOSS_CONFIG ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.LOSS_CONFIG.LOSS_REG: smooth-l1 2024-09-10 14:15:20,899 INFO ----------- LOSS_WEIGHTS ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.point_cls_weight: 1.0 2024-09-10 14:15:20,899 INFO ----------- ROI_HEAD ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NAME: PVRCNNHead 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.CLASS_AGNOSTIC: True 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.SHARED_FC: [256, 256] 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.CLS_FC: [256, 256] 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.REG_FC: [256, 256] 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.DP_RATIO: 0.3 2024-09-10 14:15:20,899 INFO ----------- NMS_CONFIG ----------- 2024-09-10 14:15:20,899 INFO ----------- TRAIN ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.NMS_TYPE: nms_gpu 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.MULTI_CLASSES_NMS: False 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.NMS_PRE_MAXSIZE: 9000 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.NMS_POST_MAXSIZE: 512 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.NMS_THRESH: 0.8 2024-09-10 14:15:20,899 INFO ----------- TEST ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.NMS_TYPE: nms_gpu 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.MULTI_CLASSES_NMS: False 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.NMS_PRE_MAXSIZE: 4096 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.NMS_POST_MAXSIZE: 300 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.NMS_THRESH: 0.85 2024-09-10 14:15:20,900 INFO ----------- ROI_GRID_POOL ----------- 2024-09-10 14:15:20,900 INFO 
cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.GRID_SIZE: 6 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.MLPS: [[64, 64], [64, 64]] 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.POOL_RADIUS: [0.8, 1.6] 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.NSAMPLE: [16, 16] 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.POOL_METHOD: max_pool 2024-09-10 14:15:20,900 INFO ----------- TARGET_CONFIG ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.BOX_CODER: ResidualCoder 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.ROI_PER_IMAGE: 128 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.FG_RATIO: 0.5 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.SAMPLE_ROI_BY_EACH_CLASS: True 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.CLS_SCORE_TYPE: roi_iou 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.CLS_FG_THRESH: 0.75 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.CLS_BG_THRESH: 0.25 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.CLS_BG_THRESH_LO: 0.1 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.HARD_BG_RATIO: 0.8 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.REG_FG_THRESH: 0.55 2024-09-10 14:15:20,900 INFO ----------- LOSS_CONFIG ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.CLS_LOSS: BinaryCrossEntropy 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.REG_LOSS: smooth-l1 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.CORNER_LOSS_REGULARIZATION: True 2024-09-10 14:15:20,900 INFO ----------- LOSS_WEIGHTS ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.rcnn_cls_weight: 1.0 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.rcnn_reg_weight: 1.0 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.rcnn_corner_weight: 1.0 2024-09-10 14:15:20,900 INFO 
cfg.MODEL.ROI_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.code_weights: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] 2024-09-10 14:15:20,900 INFO ----------- POST_PROCESSING ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.RECALL_THRESH_LIST: [0.3, 0.5, 0.7] 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.SCORE_THRESH: 0.1 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.OUTPUT_RAW_SCORE: False 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.EVAL_METRIC: kitti 2024-09-10 14:15:20,900 INFO ----------- NMS_CONFIG ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.MULTI_CLASSES_NMS: False 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.NMS_TYPE: nms_gpu 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.NMS_THRESH: 0.1 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.NMS_PRE_MAXSIZE: 4096 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.NMS_POST_MAXSIZE: 500 2024-09-10 14:15:20,900 INFO ----------- OPTIMIZATION ----------- 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.BATCH_SIZE_PER_GPU: 2 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.NUM_EPOCHS: 80 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.OPTIMIZER: adam_onecycle 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.LR: 0.01 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.WEIGHT_DECAY: 0.01 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.MOMENTUM: 0.9 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.MOMS: [0.95, 0.85] 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.PCT_START: 0.4 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.DIV_FACTOR: 10 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.DECAY_STEP_LIST: [35, 45] 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.LR_DECAY: 0.1 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.LR_CLIP: 1e-07 2024-09-10 14:15:20,901 INFO cfg.OPTIMIZATION.LR_WARMUP: False 2024-09-10 14:15:20,901 INFO cfg.OPTIMIZATION.WARMUP_EPOCH: 1 2024-09-10 14:15:20,901 INFO cfg.OPTIMIZATION.GRAD_NORM_CLIP: 10 
2024-09-10 14:15:20,901 INFO cfg.TAG: pv_rcnn 2024-09-10 14:15:20,901 INFO cfg.EXP_GROUP_PATH: custom_models 2 My batch size 2024-09-10 14:15:20,903 INFO ----------- Create dataloader & network & optimizer ----------- 2024-09-10 14:15:20,904 INFO Loading Custom dataset. 2024-09-10 14:15:20,904 INFO Total samples for CUSTOM dataset: 18 /home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2894.) return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined] 2024-09-10 14:15:23,546 INFO ----------- Model PVRCNN created, param count: 9105861 ----------- 2024-09-10 14:15:23,546 INFO PVRCNN( (vfe): MeanVFE() (backbone_3d): VoxelBackBone8x( (conv_input): SparseSequential( (0): SubMConv3d(3, 16, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[1, 1, 1], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(16, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (conv1): SparseSequential( (0): SparseSequential( (0): SubMConv3d(16, 16, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(16, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (conv2): SparseSequential( (0): SparseSequential( (0): SparseConv3d(16, 32, kernel_size=[3, 3, 3], stride=[2, 2, 2], padding=[1, 1, 1], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(32, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (1): SparseSequential( (0): SubMConv3d(32, 32, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, 
algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(32, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (2): SparseSequential( (0): SubMConv3d(32, 32, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(32, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (conv3): SparseSequential( (0): SparseSequential( (0): SparseConv3d(32, 64, kernel_size=[3, 3, 3], stride=[2, 2, 2], padding=[1, 1, 1], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (1): SparseSequential( (0): SubMConv3d(64, 64, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (2): SparseSequential( (0): SubMConv3d(64, 64, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (conv4): SparseSequential( (0): SparseSequential( (0): SparseConv3d(64, 64, kernel_size=[3, 3, 3], stride=[2, 2, 2], padding=[0, 1, 1], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (1): SparseSequential( (0): SubMConv3d(64, 64, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (2): SparseSequential( 
(0): SubMConv3d(64, 64, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (conv_out): SparseSequential( (0): SparseConv3d(64, 128, kernel_size=[3, 1, 1], stride=[2, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(128, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (map_to_bev_module): HeightCompression() (pfe): VoxelSetAbstraction( (SA_layers): ModuleList( (0): StackSAModuleMSG( (groupers): ModuleList( (0): QueryAndGroup() (1): QueryAndGroup() ) (mlps): ModuleList( (0): Sequential( (0): Conv2d(67, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) (1): Sequential( (0): Conv2d(67, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) ) ) (1): StackSAModuleMSG( (groupers): ModuleList( (0): QueryAndGroup() (1): QueryAndGroup() ) (mlps): ModuleList( (0): Sequential( (0): Conv2d(67, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) (1): Sequential( (0): Conv2d(67, 64, kernel_size=(1, 
1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) ) ) ) (vsa_point_feature_fusion): Sequential( (0): Linear(in_features=512, out_features=128, bias=False) (1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() ) ) (backbone_2d): BaseBEVBackbone( (blocks): ModuleList( (0): Sequential( (0): ZeroPad2d((1, 1, 1, 1)) (1): Conv2d(256, 64, kernel_size=(3, 3), stride=(1, 1), bias=False) (2): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (3): ReLU() (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (5): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (6): ReLU() (7): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (8): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (9): ReLU() (10): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (11): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (12): ReLU() (13): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (14): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (15): ReLU() (16): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (17): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (18): ReLU() ) (1): Sequential( (0): ZeroPad2d((1, 1, 1, 1)) (1): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2), bias=False) (2): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (3): ReLU() (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (5): BatchNorm2d(64, 
eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (6): ReLU() (7): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (8): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (9): ReLU() (10): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (11): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (12): ReLU() (13): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (14): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (15): ReLU() (16): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (17): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (18): ReLU() ) ) (deblocks): ModuleList( (0): Sequential( (0): ConvTranspose2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(256, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (1): Sequential( (0): ConvTranspose2d(64, 256, kernel_size=(2, 2), stride=(2, 2), bias=False) (1): BatchNorm2d(256, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) ) (dense_head): AnchorHeadSingle( (cls_loss_func): SigmoidFocalClassificationLoss() (reg_loss_func): WeightedSmoothL1Loss() (dir_loss_func): WeightedCrossEntropyLoss() (conv_cls): Conv2d(512, 8, kernel_size=(1, 1), stride=(1, 1)) (conv_box): Conv2d(512, 28, kernel_size=(1, 1), stride=(1, 1)) (conv_dir_cls): Conv2d(512, 8, kernel_size=(1, 1), stride=(1, 1)) ) (point_head): PointHeadSimple( (cls_loss_func): SigmoidFocalClassificationLoss() (cls_layers): Sequential( (0): Linear(in_features=512, out_features=256, bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Linear(in_features=256, out_features=256, bias=False) (4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 
(5): ReLU() (6): Linear(in_features=256, out_features=1, bias=True) ) ) (roi_head): PVRCNNHead( (proposal_target_layer): ProposalTargetLayer() (reg_loss_func): WeightedSmoothL1Loss() (roi_grid_pool_layer): StackSAModuleMSG( (groupers): ModuleList( (0): QueryAndGroup() (1): QueryAndGroup() ) (mlps): ModuleList( (0): Sequential( (0): Conv2d(131, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) (1): Sequential( (0): Conv2d(131, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) ) ) (shared_fc_layer): Sequential( (0): Conv1d(27648, 256, kernel_size=(1,), stride=(1,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Dropout(p=0.3, inplace=False) (4): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (5): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (6): ReLU() ) (cls_layers): Sequential( (0): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Dropout(p=0.3, inplace=False) (4): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (5): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (6): ReLU() (7): Conv1d(256, 1, kernel_size=(1,), stride=(1,)) ) (reg_layers): Sequential( (0): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, 
track_running_stats=True) (2): ReLU() (3): Dropout(p=0.3, inplace=False) (4): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (5): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (6): ReLU() (7): Conv1d(256, 7, kernel_size=(1,), stride=(1,)) ) ) )
2024-09-10 14:15:23,548 INFO **********************Start training custom_models/pv_rcnn(default)**********************
epochs: 0%| | 0/80 [00:00<?, ?it/sCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name
2024-09-10 14:15:25,937 INFO Train: 1/80 ( 1%) [ 0/9 ( 0%)] Loss: 342.2 (342.) LR: 1.000e-03 Time cost: 00:02/00:20 [00:02/26:45] Acc_iter 1 Data time: 0.04(0.04) Forward time: 2.19(2.19) Batch time: 2.23(2.23)
Current_batch size 2 ['000002' '000004'] current batch file name
Current_batch size 2 ['000005' '000006'] current batch file name
Current_batch size 2 ['000008' '000009'] current batch file name
Current_batch size 2 ['000011' '000012'] current batch file name
Current_batch size 2 ['000013' '000015'] current batch file name
Current_batch size 2 ['000016' '000017'] current batch file name
Current_batch size 2 ['000018' '000019'] current batch file name
Current_batch size 2 ['000020' '000021'] current batch file name
2024-09-10 14:15:30,571 INFO Train: 1/80 ( 1%) [ 8/9 ( 89%)] Loss: 505.2 (801.) LR: 1.017e-03 Time cost: 00:06/00:00 [00:07/09:03] Acc_iter 9 Data time: 0.00(0.01) Forward time: 0.60(0.75) Batch time: 0.60(0.76)
epochs: 1%|█▉ | 1/80 [00:07<09:29, 7.20s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name
2024-09-10 14:15:31,543 INFO Train: 2/80 ( 2%) [ 0/9 ( 0%)] Loss: 252.3 (252.) LR: 1.022e-03 Time cost: 00:00/00:06 [00:07/08:21] Acc_iter 10 Data time: 0.08(0.08) Forward time: 0.62(0.62) Batch time: 0.70(0.70)
Current_batch size 2 ['000002' '000004'] current batch file name
Current_batch size 2 ['000005' '000006'] current batch file name
Current_batch size 2 ['000008' '000009'] current batch file name
Current_batch size 2 ['000011' '000012'] current batch file name
Current_batch size 2 ['000013' '000015'] current batch file name
Current_batch size 2 ['000016' '000017'] current batch file name
Current_batch size 2 ['000018' '000019'] current batch file name
Current_batch size 2 ['000020' '000021'] current batch file name
2024-09-10 14:15:36,277 INFO Train: 2/80 ( 2%) [ 8/9 ( 89%)] Loss: 331.3 (682.) LR: 1.077e-03 Time cost: 00:05/00:00 [00:12/07:04] Acc_iter 18 Data time: 0.00(0.01) Forward time: 0.60(0.59) Batch time: 0.60(0.60)
epochs: 2%|███▊ | 2/80 [00:13<08:20, 6.41s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name
2024-09-10 14:15:37,409 INFO Train: 3/80 ( 4%) [ 0/9 ( 0%)] Loss: 99.24 (99.2) LR: 1.086e-03 Time cost: 00:00/00:06 [00:13/08:24] Acc_iter 19 Data time: 0.09(0.09) Forward time: 0.63(0.63) Batch time: 0.72(0.72)
Current_batch size 2 ['000002' '000004'] current batch file name
Current_batch size 2 ['000005' '000006'] current batch file name
Current_batch size 2 ['000008' '000009'] current batch file name
Current_batch size 2 ['000011' '000012'] current batch file name
Current_batch size 2 ['000013' '000015'] current batch file name
Current_batch size 2 ['000016' '000017'] current batch file name
Current_batch size 2 ['000018' '000019'] current batch file name
Current_batch size 2 ['000020' '000021'] current batch file name
2024-09-10 14:15:42,148 INFO Train: 3/80 ( 4%) [ 8/9 ( 89%)] Loss: 244.1 (804.) LR: 1.180e-03 Time cost: 00:05/00:00 [00:18/07:00] Acc_iter 27 Data time: 0.00(0.01) Forward time: 0.60(0.59) Batch time: 0.60(0.61)
epochs: 4%|█████▋ | 3/80 [00:18<07:50, 6.12s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name
2024-09-10 14:15:43,158 INFO Train: 4/80 ( 5%) [ 0/9 ( 0%)] Loss: 487.5 (487.) LR: 1.194e-03 Time cost: 00:00/00:06 [00:19/08:07] Acc_iter 28 Data time: 0.07(0.07) Forward time: 0.63(0.63) Batch time: 0.70(0.70)
Current_batch size 2 ['000002' '000004'] current batch file name
Current_batch size 2 ['000005' '000006'] current batch file name
Current_batch size 2 ['000008' '000009'] current batch file name
Current_batch size 2 ['000011' '000012'] current batch file name
Current_batch size 2 ['000013' '000015'] current batch file name
Current_batch size 2 ['000016' '000017'] current batch file name
Current_batch size 2 ['000018' '000019'] current batch file name
Current_batch size 2 ['000020' '000021'] current batch file name
2024-09-10 14:15:47,780 INFO Train: 4/80 ( 5%) [ 8/9 ( 89%)] Loss: 478.9 (698.) LR: 1.324e-03 Time cost: 00:05/00:00 [00:24/06:45] Acc_iter 36 Data time: 0.00(0.01) Forward time: 0.50(0.58) Batch time: 0.50(0.59)
epochs: 5%|███████▋ | 4/80 [00:24<07:34, 5.98s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name
2024-09-10 14:15:48,960 INFO Train: 5/80 ( 6%) [ 0/9 ( 0%)] Loss: 239.0 (239.)
LR: 1.343e-03 Time cost: 00:00/00:06 [00:25/08:16] Acc_iter 37 Data time: 0.08(0.08) Forward time: 0.64(0.64) Batch time: 0.72(0.72) Current_batch size 2 ['000002' '000004'] current batch file name Current_batch size 2 ['000005' '000006'] current batch file name Current_batch size 2 ['000008' '000009'] current batch file name Current_batch size 2 ['000011' '000012'] current batch file name Current_batch size 2 ['000013' '000015'] current batch file name Current_batch size 2 ['000016' '000017'] current batch file name Current_batch size 2 ['000018' '000019'] current batch file name Current_batch size 2 ['000020' '000021'] current batch file name 2024-09-10 14:15:53,661 INFO Train: 5/80 ( 6%) [ 8/9 ( 89%)] Loss: 312.4 (641.) LR: 1.508e-03 Time cost: 00:05/00:00 [00:30/06:47] Acc_iter 45 Data time: 0.00(0.01) Forward time: 0.60(0.59) Batch time: 0.61(0.60) epochs: 6%|█████████▌ | 5/80 [00:30<07:24, 5.93s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name 2024-09-10 14:15:54,762 INFO Train: 6/80 ( 8%) [ 0/9 ( 0%)] Loss: 311.6 (312.) LR: 1.531e-03 Time cost: 00:00/00:06 [00:31/07:50] Acc_iter 46 Data time: 0.08(0.08) Forward time: 0.62(0.62) Batch time: 0.70(0.70) Current_batch size 2 ['000002' '000004'] current batch file name Current_batch size 2 ['000005' '000006'] current batch file name Current_batch size 2 ['000008' '000009'] current batch file name Current_batch size 2 ['000011' '000012'] current batch file name 2024-09-10 14:15:57,071 INFO Train: 6/80 ( 8%) [ 4/9 ( 44%)] Loss: 925.3 (666.) 
LR: 1.628e-03 Time cost: 00:03/00:03 [00:33/06:43] Acc_iter 50 Data time: 0.00(0.02) Forward time: 0.59(0.58) Batch time: 0.60(0.60) Current_batch size 2 ['000013' '000015'] current batch file name Current_batch size 2 ['000016' '000017'] current batch file name Current_batch size 2 ['000018' '000019'] current batch file name Current_batch size 2 ['000020' '000021'] current batch file name 2024-09-10 14:15:59,269 INFO Train: 6/80 ( 8%) [ 8/9 ( 89%)] Loss: 0.1672 (677.) LR: 1.731e-03 Time cost: 00:05/00:00 [00:35/06:25] Acc_iter 54 Data time: 0.00(0.01) Forward time: 0.50(0.57) Batch time: 0.50(0.58) epochs: 8%|███████████▍ | 6/80 [00:36<07:11, 5.83s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name 2024-09-10 14:16:00,407 INFO Train: 7/80 ( 9%) [ 0/9 ( 0%)] Loss: 303.7 (304.) LR: 1.758e-03 Time cost: 00:00/00:06 [00:36/07:51] Acc_iter 55 Data time: 0.08(0.08) Forward time: 0.62(0.62) Batch time: 0.71(0.71) Current_batch size 2 ['000002' '000004'] current batch file name Current_batch size 2 ['000005' '000006'] current batch file name Current_batch size 2 ['000008' '000009'] current batch file name Current_batch size 2 ['000011' '000012'] current batch file name Current_batch size 2 ['000013' '000015'] current batch file name Current_batch size 2 ['000016' '000017'] current batch file name Current_batch size 2 ['000018' '000019'] current batch file name Current_batch size 2 ['000020' '000021'] current batch file name epochs: 8%|███████████▍ | 6/80 [00:40<08:24, 6.82s/it] Traceback (most recent call last): File "/home/ubuntu/v-detr/OpenPCDet/tools/train.py", line 234, in <module> main() File "/home/ubuntu/v-detr/OpenPCDet/tools/train.py", line 178, in main train_model( File "/home/ubuntu/v-detr/OpenPCDet/tools/train_utils/train_utils.py", line 189, in train_model accumulated_iter = train_one_epoch( File "/home/ubuntu/v-detr/OpenPCDet/tools/train_utils/train_utils.py", line 65, in train_one_epoch loss, tb_dict, disp_dict = 
model_func(model, batch) File "/home/ubuntu/v-detr/OpenPCDet/tools/../pcdet/models/__init__.py", line 46, in model_func ret_dict, tb_dict, disp_dict = model(batch_dict) File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl return forward_call(*input, **kwargs) File "/home/ubuntu/v-detr/OpenPCDet/tools/../pcdet/models/detectors/pv_rcnn.py", line 19, in forward batch_dict = cur_module(batch_dict) File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl return forward_call(*input, **kwargs) File "/home/ubuntu/v-detr/OpenPCDet/tools/../pcdet/models/backbones_3d/spconv_backbone.py", line 159, in forward x = self.conv_input(input_sp_tensor) File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl return forward_call(*input, **kwargs) File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/spconv/pytorch/modules.py", line 142, in forward input = input.replace_feature(module(input.features)) File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl return forward_call(*input, **kwargs) File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward return F.batch_norm( File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/functional.py", line 2436, in batch_norm _verify_batch_size(input.size()) File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/functional.py", line 2404, in _verify_batch_size raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size)) ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 16])

Sep 10 '24 14:09 Serzhanov

This issue is stale because it has been open for 30 days with no activity.

Oct 11 '24 01:10 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

Oct 25 '24 01:10 github-actions[bot]

@Serzhanov did you ever manage to fix this? I am experiencing the same issue.

Feb 26 '25 13:02 MarvinKlemp
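
One possible reading of the traceback above: the batch norm inside `conv_input` received features of shape `[1, 16]`, which would mean the whole batch contained a single active voxel, and batch norm rejects that in training mode. Below is a minimal debugging sketch for catching such a batch before the forward pass. It assumes the collated batch is a dict with a `voxels`, `voxel_features`, or `voxel_coords` entry whose first dimension is the voxel count, plus an optional `frame_id` entry. The helper names `count_voxels` and `describe_degenerate_batch` are hypothetical and not part of OpenPCDet.

```python
def count_voxels(batch_dict):
    """Return the voxel count of a collated batch, or None if no voxel entry exists.

    Assumption: one of these keys is present and its first dimension is the
    number of voxels. len() works for lists, NumPy arrays and torch tensors.
    """
    for key in ("voxels", "voxel_features", "voxel_coords"):
        if key in batch_dict:
            return len(batch_dict[key])
    return None


def describe_degenerate_batch(batch_dict, min_voxels=2):
    """Return a message if the batch has fewer than min_voxels voxels, else None.

    BatchNorm in training mode needs more than one value per channel, so a
    batch with a single voxel triggers the ValueError seen in the traceback.
    """
    num_voxels = count_voxels(batch_dict)
    if num_voxels is not None and num_voxels < min_voxels:
        frames = list(batch_dict.get("frame_id", []))
        return f"only {num_voxels} voxel(s) in batch, frames={frames}"
    return None


if __name__ == "__main__":
    # Hypothetical batches using plain lists in place of arrays or tensors.
    healthy = {"voxels": [[0.0] * 4] * 100, "frame_id": ["000000", "000001"]}
    nearly_empty = {"voxels": [[0.0] * 4], "frame_id": ["000020", "000021"]}
    print(describe_degenerate_batch(healthy))
    print(describe_degenerate_batch(nearly_empty))
```

Calling a check like this just before `model_func(model, batch)` in the training loop, and logging or skipping flagged batches, would show which frames end up nearly empty after preprocessing. Whether random augmentation or point cloud range cropping is the cause here is a guess that the log alone does not confirm.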