Custom dataset batch dimension problem during training.
Hello,
I’ve set up a custom point cloud dataset in KITTI format by following the OpenPCDet Custom Dataset Tutorial. However, I am experiencing issues related to the batch dimension during training.
Steps Taken:
I initially suspected uneven batch sizes, so I added the drop_last parameter to the DataLoader, but that didn’t resolve the problem.
I also considered that an uneven number of objects per scene might be the cause, so I filtered my dataset so that every scene has an even number of objects.
I turned off shuffling in the DataLoader and added filename printing to log the specific files in the batch that triggers the error.

Observations:
The error appears at a seemingly random point after training has run for a while. The files in the failing batch had already passed through several earlier epochs without any issue, so the error is not consistently tied to a specific file or scene.

Question:
Could anyone explain why this might be happening? Are there any suggestions on how to fix or debug this problem further?
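One possible explanation, offered as a guess rather than a diagnosis: the log below reports fix_random_seed False, with gt_sampling, random_world_flip, random_world_rotation, random_world_scaling and shuffle_points all enabled for training. If so, the same file can produce different points and boxes each time it is loaded, even with DataLoader shuffling turned off. The sketch below is plain Python, not OpenPCDet code; `augment_point` and the sample point are hypothetical and mimic only the logged rotation and scaling ranges.

```python
import math
import random

# Ranges copied from the logged DATA_AUGMENTOR config.
ROT_RANGE = (-0.78539816, 0.78539816)
SCALE_RANGE = (0.95, 1.05)


def augment_point(point, rng):
    """Hypothetical stand-in: random rotation about z plus random scaling of one point."""
    x, y, z = point
    angle = rng.uniform(*ROT_RANGE)
    scale = rng.uniform(*SCALE_RANGE)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (
        scale * (cos_a * x - sin_a * y),
        scale * (sin_a * x + cos_a * y),
        scale * z,
    )


point = (100.0, 50.0, 10.0)  # made-up point standing in for one file's data

# Two generators with the same seed give a repeatable result.
seeded_a = augment_point(point, random.Random(0))
seeded_b = augment_point(point, random.Random(0))

# One shared generator gives a fresh draw on every pass over the same file.
shared = random.Random(0)
epoch_1 = augment_point(point, shared)
epoch_2 = augment_point(point, shared)

print("same seed, same result:", seeded_a == seeded_b)
print("same file, same result across epochs:", epoch_1 == epoch_2)
```

If this turns out to be relevant, enabling the fix_random_seed option shown in the log, or temporarily disabling the augmentations, might make the failure reproducible for a given file.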
Thank you for your assistance!
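As a sanity check on the drop_last idea: the log below reports 18 samples and a batch size of 2, and each epoch shows 9 iterations. The sketch below is plain Python with a hypothetical `make_batches` helper (it is not the PyTorch DataLoader) that reproduces this arithmetic. Under these numbers no batch is ever short, so drop_last would not be expected to change anything.

```python
def make_batches(sample_ids, batch_size, drop_last):
    """Hypothetical helper: consecutive batches, optionally dropping a short final batch."""
    batches = [
        sample_ids[i:i + batch_size]
        for i in range(0, len(sample_ids), batch_size)
    ]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


# Frame ids as printed in the training log (18 samples).
frame_ids = [
    "000000", "000001", "000002", "000004", "000005", "000006",
    "000008", "000009", "000011", "000012", "000013", "000015",
    "000016", "000017", "000018", "000019", "000020", "000021",
]

with_drop = make_batches(frame_ids, batch_size=2, drop_last=True)
without_drop = make_batches(frame_ids, batch_size=2, drop_last=False)

print("batches per epoch:", len(with_drop), len(without_drop))
print("drop_last changes the batches:", with_drop != without_drop)
```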
2024-09-10 14:15:20,896 INFO **********************Start logging********************** 2024-09-10 14:15:20,896 INFO CUDA_VISIBLE_DEVICES=ALL 2024-09-10 14:15:20,896 INFO Training with a single process 2024-09-10 14:15:20,896 INFO cfg_file cfgs/custom_models/pv_rcnn.yaml 2024-09-10 14:15:20,896 INFO batch_size 2 2024-09-10 14:15:20,896 INFO epochs 80 2024-09-10 14:15:20,896 INFO workers 4 2024-09-10 14:15:20,896 INFO extra_tag default 2024-09-10 14:15:20,896 INFO ckpt None 2024-09-10 14:15:20,896 INFO pretrained_model None 2024-09-10 14:15:20,896 INFO launcher none 2024-09-10 14:15:20,896 INFO tcp_port 18888 2024-09-10 14:15:20,897 INFO sync_bn False 2024-09-10 14:15:20,897 INFO fix_random_seed False 2024-09-10 14:15:20,897 INFO ckpt_save_interval 1 2024-09-10 14:15:20,897 INFO local_rank None 2024-09-10 14:15:20,897 INFO max_ckpt_save_num 30 2024-09-10 14:15:20,897 INFO merge_all_iters_to_one_epoch False 2024-09-10 14:15:20,897 INFO set_cfgs None 2024-09-10 14:15:20,897 INFO max_waiting_mins 0 2024-09-10 14:15:20,897 INFO start_epoch 0 2024-09-10 14:15:20,897 INFO num_epochs_to_eval 0 2024-09-10 14:15:20,897 INFO save_to_file False 2024-09-10 14:15:20,897 INFO use_tqdm_to_record False 2024-09-10 14:15:20,897 INFO logger_iter_interval 50 2024-09-10 14:15:20,897 INFO ckpt_save_time_interval 300 2024-09-10 14:15:20,897 INFO wo_gpu_stat False 2024-09-10 14:15:20,897 INFO use_amp False 2024-09-10 14:15:20,897 INFO cfg.ROOT_DIR: /home/ubuntu/v-detr/OpenPCDet 2024-09-10 14:15:20,897 INFO cfg.LOCAL_RANK: 0 2024-09-10 14:15:20,897 INFO cfg.CLASS_NAMES: ['Antenne4G', 'Antenne5G'] 2024-09-10 14:15:20,897 INFO ----------- MAP_CLASS_TO_KITTI ----------- 2024-09-10 14:15:20,897 INFO cfg.MAP_CLASS_TO_KITTI.Vehicle: Antenne4G 2024-09-10 14:15:20,897 INFO cfg.MAP_CLASS_TO_KITTI.Pedestrian: Antenne5G 2024-09-10 14:15:20,897 INFO ----------- DATA_CONFIG ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATASET: CustomDataset 2024-09-10 14:15:20,897 INFO 
cfg.DATA_CONFIG.DATA_PATH: ../data/custom 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.BASE_CONFIG: tools/cfgs/dataset_configs/custom_dataset.yaml 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.POINT_CLOUD_RANGE: [-8880, -16, 0, 96, 2684, 480] 2024-09-10 14:15:20,897 INFO ----------- DATA_SPLIT ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_SPLIT.train: train 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_SPLIT.test: val 2024-09-10 14:15:20,897 INFO ----------- INFO_PATH ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.INFO_PATH.train: ['custom_infos_train.pkl'] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.INFO_PATH.test: ['custom_infos_val.pkl'] 2024-09-10 14:15:20,897 INFO ----------- POINT_FEATURE_ENCODING ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.POINT_FEATURE_ENCODING.encoding_type: absolute_coordinates_encoding 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.POINT_FEATURE_ENCODING.used_feature_list: ['x', 'y', 'z'] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.POINT_FEATURE_ENCODING.src_feature_list: ['x', 'y', 'z'] 2024-09-10 14:15:20,897 INFO ----------- DATA_AUGMENTOR ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_AUGMENTOR.DISABLE_AUG_LIST: ['placeholder'] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_AUGMENTOR.AUG_CONFIG_LIST: [{'NAME': 'gt_sampling', 'USE_ROAD_PLANE': False, 'DB_INFO_PATH': ['custom_dbinfos_train.pkl'], 'PREPARE': {}, 'SAMPLE_GROUPS': ['Antenne4G:5', 'Antenne5G:15'], 'NUM_POINT_FEATURES': 3, 'DATABASE_WITH_FAKELIDAR': False, 'REMOVE_EXTRA_WIDTH': [0.0, 0.0, 0.0], 'LIMIT_WHOLE_SCENE': True}, {'NAME': 'random_world_flip', 'ALONG_AXIS_LIST': ['x', 'y']}, {'NAME': 'random_world_rotation', 'WORLD_ROT_ANGLE': [-0.78539816, 0.78539816]}, {'NAME': 'random_world_scaling', 'WORLD_SCALE_RANGE': [0.95, 1.05]}] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.DATA_PROCESSOR: [{'NAME': 'mask_points_and_boxes_outside_range', 'REMOVE_OUTSIDE_BOXES': True}, {'NAME': 'shuffle_points', 'SHUFFLE_ENABLED': 
{'train': True, 'test': False}}, {'NAME': 'transform_points_to_voxels', 'VOXEL_SIZE': [560, 168.75, 12], 'MAX_POINTS_PER_VOXEL': 5, 'MAX_NUMBER_OF_VOXELS': {'train': 5000, 'test': 1000}}] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.CLASS_NAMES: ['Antenne4G', 'Antenne5G'] 2024-09-10 14:15:20,897 INFO ----------- OPTIMIZATION ----------- 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.BATCH_SIZE_PER_GPU: 2 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.NUM_EPOCHS: 80 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.OPTIMIZER: adam_onecycle 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.LR: 0.003 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.WEIGHT_DECAY: 0.01 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.MOMENTUM: 0.9 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.MOMS: [0.95, 0.85] 2024-09-10 14:15:20,897 INFO cfg.DATA_CONFIG.OPTIMIZATION.PCT_START: 0.4 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.DIV_FACTOR: 10 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.DECAY_STEP_LIST: [35, 45] 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.LR_DECAY: 0.1 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.LR_CLIP: 1e-07 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.LR_WARMUP: False 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.OPTIMIZATION.WARMUP_EPOCH: 1 2024-09-10 14:15:20,898 INFO ----------- MAP_CLASS_TO_KITTI ----------- 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.MAP_CLASS_TO_KITTI.Vehicle: Antenne4G 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG.MAP_CLASS_TO_KITTI.Pedestrian: Antenne5G 2024-09-10 14:15:20,898 INFO cfg.DATA_CONFIG._BASE_CONFIG_: ../tools/cfgs/dataset_configs/custom_dataset.yaml 2024-09-10 14:15:20,898 INFO ----------- MODEL ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.NAME: PVRCNN 2024-09-10 14:15:20,898 INFO ----------- VFE ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.VFE.NAME: MeanVFE 2024-09-10 14:15:20,898 INFO 
----------- BACKBONE_3D ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_3D.NAME: VoxelBackBone8x 2024-09-10 14:15:20,898 INFO ----------- MAP_TO_BEV ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.MAP_TO_BEV.NAME: HeightCompression 2024-09-10 14:15:20,898 INFO cfg.MODEL.MAP_TO_BEV.NUM_BEV_FEATURES: 256 2024-09-10 14:15:20,898 INFO ----------- BACKBONE_2D ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.NAME: BaseBEVBackbone 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.LAYER_NUMS: [5, 5] 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.LAYER_STRIDES: [1, 2] 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.NUM_FILTERS: [64, 64] 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.UPSAMPLE_STRIDES: [1, 2] 2024-09-10 14:15:20,898 INFO cfg.MODEL.BACKBONE_2D.NUM_UPSAMPLE_FILTERS: [256, 256] 2024-09-10 14:15:20,898 INFO ----------- DENSE_HEAD ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.NAME: AnchorHeadSingle 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.CLASS_AGNOSTIC: False 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.USE_DIRECTION_CLASSIFIER: True 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.DIR_OFFSET: 0.78539 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.DIR_LIMIT_OFFSET: 0.0 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.NUM_DIR_BINS: 2 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.ANCHOR_GENERATOR_CONFIG: [{'class_name': 'Antenne4G', 'anchor_sizes': [[0.5, 0.36, 2.64]], 'anchor_rotations': [0, 1.57], 'anchor_bottom_heights': [0], 'align_center': False, 'feature_map_stride': 8, 'matched_threshold': 0.55, 'unmatched_threshold': 0.4}, {'class_name': 'Antenne5G', 'anchor_sizes': [[0.4, 0.3, 1]], 'anchor_rotations': [0, 1.57], 'anchor_bottom_heights': [0], 'align_center': False, 'feature_map_stride': 8, 'matched_threshold': 0.5, 'unmatched_threshold': 0.35}] 2024-09-10 14:15:20,898 INFO ----------- TARGET_ASSIGNER_CONFIG ----------- 2024-09-10 14:15:20,898 INFO 
cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.NAME: AxisAlignedTargetAssigner 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.POS_FRACTION: -1.0 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.SAMPLE_SIZE: 512 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.NORM_BY_NUM_EXAMPLES: False 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.MATCH_HEIGHT: False 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.TARGET_ASSIGNER_CONFIG.BOX_CODER: ResidualCoder 2024-09-10 14:15:20,898 INFO ----------- LOSS_CONFIG ----------- 2024-09-10 14:15:20,898 INFO ----------- LOSS_WEIGHTS ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.cls_weight: 1.0 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.loc_weight: 2.0 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.dir_weight: 0.2 2024-09-10 14:15:20,898 INFO cfg.MODEL.DENSE_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.code_weights: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] 2024-09-10 14:15:20,898 INFO ----------- PFE ----------- 2024-09-10 14:15:20,898 INFO cfg.MODEL.PFE.NAME: VoxelSetAbstraction 2024-09-10 14:15:20,898 INFO cfg.MODEL.PFE.POINT_SOURCE: raw_points 2024-09-10 14:15:20,898 INFO cfg.MODEL.PFE.NUM_KEYPOINTS: 4096 2024-09-10 14:15:20,898 INFO cfg.MODEL.PFE.NUM_OUTPUT_FEATURES: 128 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SAMPLE_METHOD: FPS 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.FEATURES_SOURCE: ['bev', 'x_conv3', 'x_conv4'] 2024-09-10 14:15:20,899 INFO ----------- SA_LAYER ----------- 2024-09-10 14:15:20,899 INFO ----------- raw_points ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.raw_points.MLPS: [[16, 16], [16, 16]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.raw_points.POOL_RADIUS: [0.4, 0.8] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.raw_points.NSAMPLE: [16, 16] 2024-09-10 14:15:20,899 INFO ----------- x_conv1 
----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv1.DOWNSAMPLE_FACTOR: 1 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv1.MLPS: [[16, 16], [16, 16]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv1.POOL_RADIUS: [0.4, 0.8] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv1.NSAMPLE: [16, 16] 2024-09-10 14:15:20,899 INFO ----------- x_conv2 ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv2.DOWNSAMPLE_FACTOR: 2 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv2.MLPS: [[32, 32], [32, 32]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv2.POOL_RADIUS: [0.8, 1.2] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv2.NSAMPLE: [16, 32] 2024-09-10 14:15:20,899 INFO ----------- x_conv3 ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv3.DOWNSAMPLE_FACTOR: 4 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv3.MLPS: [[64, 64], [64, 64]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv3.POOL_RADIUS: [1.2, 2.4] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv3.NSAMPLE: [16, 32] 2024-09-10 14:15:20,899 INFO ----------- x_conv4 ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv4.DOWNSAMPLE_FACTOR: 8 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv4.MLPS: [[64, 64], [64, 64]] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv4.POOL_RADIUS: [2.4, 4.8] 2024-09-10 14:15:20,899 INFO cfg.MODEL.PFE.SA_LAYER.x_conv4.NSAMPLE: [16, 32] 2024-09-10 14:15:20,899 INFO ----------- POINT_HEAD ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.NAME: PointHeadSimple 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.CLS_FC: [256, 256] 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.CLASS_AGNOSTIC: True 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.USE_POINT_FEATURES_BEFORE_FUSION: True 2024-09-10 14:15:20,899 INFO ----------- TARGET_CONFIG ----------- 2024-09-10 14:15:20,899 INFO 
cfg.MODEL.POINT_HEAD.TARGET_CONFIG.GT_EXTRA_WIDTH: [0.2, 0.2, 0.2] 2024-09-10 14:15:20,899 INFO ----------- LOSS_CONFIG ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.LOSS_CONFIG.LOSS_REG: smooth-l1 2024-09-10 14:15:20,899 INFO ----------- LOSS_WEIGHTS ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.POINT_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.point_cls_weight: 1.0 2024-09-10 14:15:20,899 INFO ----------- ROI_HEAD ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NAME: PVRCNNHead 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.CLASS_AGNOSTIC: True 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.SHARED_FC: [256, 256] 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.CLS_FC: [256, 256] 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.REG_FC: [256, 256] 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.DP_RATIO: 0.3 2024-09-10 14:15:20,899 INFO ----------- NMS_CONFIG ----------- 2024-09-10 14:15:20,899 INFO ----------- TRAIN ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.NMS_TYPE: nms_gpu 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.MULTI_CLASSES_NMS: False 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.NMS_PRE_MAXSIZE: 9000 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.NMS_POST_MAXSIZE: 512 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TRAIN.NMS_THRESH: 0.8 2024-09-10 14:15:20,899 INFO ----------- TEST ----------- 2024-09-10 14:15:20,899 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.NMS_TYPE: nms_gpu 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.MULTI_CLASSES_NMS: False 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.NMS_PRE_MAXSIZE: 4096 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.NMS_POST_MAXSIZE: 300 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.NMS_CONFIG.TEST.NMS_THRESH: 0.85 2024-09-10 14:15:20,900 INFO ----------- ROI_GRID_POOL ----------- 2024-09-10 14:15:20,900 INFO 
cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.GRID_SIZE: 6 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.MLPS: [[64, 64], [64, 64]] 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.POOL_RADIUS: [0.8, 1.6] 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.NSAMPLE: [16, 16] 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.ROI_GRID_POOL.POOL_METHOD: max_pool 2024-09-10 14:15:20,900 INFO ----------- TARGET_CONFIG ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.BOX_CODER: ResidualCoder 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.ROI_PER_IMAGE: 128 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.FG_RATIO: 0.5 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.SAMPLE_ROI_BY_EACH_CLASS: True 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.CLS_SCORE_TYPE: roi_iou 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.CLS_FG_THRESH: 0.75 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.CLS_BG_THRESH: 0.25 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.CLS_BG_THRESH_LO: 0.1 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.HARD_BG_RATIO: 0.8 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.TARGET_CONFIG.REG_FG_THRESH: 0.55 2024-09-10 14:15:20,900 INFO ----------- LOSS_CONFIG ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.CLS_LOSS: BinaryCrossEntropy 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.REG_LOSS: smooth-l1 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.CORNER_LOSS_REGULARIZATION: True 2024-09-10 14:15:20,900 INFO ----------- LOSS_WEIGHTS ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.rcnn_cls_weight: 1.0 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.rcnn_reg_weight: 1.0 2024-09-10 14:15:20,900 INFO cfg.MODEL.ROI_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.rcnn_corner_weight: 1.0 2024-09-10 14:15:20,900 INFO 
cfg.MODEL.ROI_HEAD.LOSS_CONFIG.LOSS_WEIGHTS.code_weights: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] 2024-09-10 14:15:20,900 INFO ----------- POST_PROCESSING ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.RECALL_THRESH_LIST: [0.3, 0.5, 0.7] 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.SCORE_THRESH: 0.1 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.OUTPUT_RAW_SCORE: False 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.EVAL_METRIC: kitti 2024-09-10 14:15:20,900 INFO ----------- NMS_CONFIG ----------- 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.MULTI_CLASSES_NMS: False 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.NMS_TYPE: nms_gpu 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.NMS_THRESH: 0.1 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.NMS_PRE_MAXSIZE: 4096 2024-09-10 14:15:20,900 INFO cfg.MODEL.POST_PROCESSING.NMS_CONFIG.NMS_POST_MAXSIZE: 500 2024-09-10 14:15:20,900 INFO ----------- OPTIMIZATION ----------- 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.BATCH_SIZE_PER_GPU: 2 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.NUM_EPOCHS: 80 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.OPTIMIZER: adam_onecycle 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.LR: 0.01 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.WEIGHT_DECAY: 0.01 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.MOMENTUM: 0.9 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.MOMS: [0.95, 0.85] 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.PCT_START: 0.4 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.DIV_FACTOR: 10 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.DECAY_STEP_LIST: [35, 45] 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.LR_DECAY: 0.1 2024-09-10 14:15:20,900 INFO cfg.OPTIMIZATION.LR_CLIP: 1e-07 2024-09-10 14:15:20,901 INFO cfg.OPTIMIZATION.LR_WARMUP: False 2024-09-10 14:15:20,901 INFO cfg.OPTIMIZATION.WARMUP_EPOCH: 1 2024-09-10 14:15:20,901 INFO cfg.OPTIMIZATION.GRAD_NORM_CLIP: 10 
2024-09-10 14:15:20,901 INFO cfg.TAG: pv_rcnn 2024-09-10 14:15:20,901 INFO cfg.EXP_GROUP_PATH: custom_models 2 My batch size 2024-09-10 14:15:20,903 INFO ----------- Create dataloader & network & optimizer ----------- 2024-09-10 14:15:20,904 INFO Loading Custom dataset. 2024-09-10 14:15:20,904 INFO Total samples for CUSTOM dataset: 18 /home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2894.) return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined] 2024-09-10 14:15:23,546 INFO ----------- Model PVRCNN created, param count: 9105861 ----------- 2024-09-10 14:15:23,546 INFO PVRCNN( (vfe): MeanVFE() (backbone_3d): VoxelBackBone8x( (conv_input): SparseSequential( (0): SubMConv3d(3, 16, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[1, 1, 1], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(16, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (conv1): SparseSequential( (0): SparseSequential( (0): SubMConv3d(16, 16, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(16, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (conv2): SparseSequential( (0): SparseSequential( (0): SparseConv3d(16, 32, kernel_size=[3, 3, 3], stride=[2, 2, 2], padding=[1, 1, 1], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(32, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (1): SparseSequential( (0): SubMConv3d(32, 32, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, 
algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(32, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (2): SparseSequential( (0): SubMConv3d(32, 32, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(32, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (conv3): SparseSequential( (0): SparseSequential( (0): SparseConv3d(32, 64, kernel_size=[3, 3, 3], stride=[2, 2, 2], padding=[1, 1, 1], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (1): SparseSequential( (0): SubMConv3d(64, 64, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (2): SparseSequential( (0): SubMConv3d(64, 64, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (conv4): SparseSequential( (0): SparseSequential( (0): SparseConv3d(64, 64, kernel_size=[3, 3, 3], stride=[2, 2, 2], padding=[0, 1, 1], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (1): SparseSequential( (0): SubMConv3d(64, 64, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (2): SparseSequential( 
(0): SubMConv3d(64, 64, kernel_size=[3, 3, 3], stride=[1, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (conv_out): SparseSequential( (0): SparseConv3d(64, 128, kernel_size=[3, 1, 1], stride=[2, 1, 1], padding=[0, 0, 0], dilation=[1, 1, 1], output_padding=[0, 0, 0], bias=False, algo=ConvAlgo.MaskImplicitGemm) (1): BatchNorm1d(128, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) (map_to_bev_module): HeightCompression() (pfe): VoxelSetAbstraction( (SA_layers): ModuleList( (0): StackSAModuleMSG( (groupers): ModuleList( (0): QueryAndGroup() (1): QueryAndGroup() ) (mlps): ModuleList( (0): Sequential( (0): Conv2d(67, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) (1): Sequential( (0): Conv2d(67, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) ) ) (1): StackSAModuleMSG( (groupers): ModuleList( (0): QueryAndGroup() (1): QueryAndGroup() ) (mlps): ModuleList( (0): Sequential( (0): Conv2d(67, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) (1): Sequential( (0): Conv2d(67, 64, kernel_size=(1, 
1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) ) ) ) (vsa_point_feature_fusion): Sequential( (0): Linear(in_features=512, out_features=128, bias=False) (1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() ) ) (backbone_2d): BaseBEVBackbone( (blocks): ModuleList( (0): Sequential( (0): ZeroPad2d((1, 1, 1, 1)) (1): Conv2d(256, 64, kernel_size=(3, 3), stride=(1, 1), bias=False) (2): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (3): ReLU() (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (5): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (6): ReLU() (7): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (8): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (9): ReLU() (10): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (11): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (12): ReLU() (13): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (14): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (15): ReLU() (16): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (17): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (18): ReLU() ) (1): Sequential( (0): ZeroPad2d((1, 1, 1, 1)) (1): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2), bias=False) (2): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (3): ReLU() (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (5): BatchNorm2d(64, 
eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (6): ReLU() (7): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (8): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (9): ReLU() (10): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (11): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (12): ReLU() (13): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (14): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (15): ReLU() (16): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (17): BatchNorm2d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (18): ReLU() ) ) (deblocks): ModuleList( (0): Sequential( (0): ConvTranspose2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(256, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) (1): Sequential( (0): ConvTranspose2d(64, 256, kernel_size=(2, 2), stride=(2, 2), bias=False) (1): BatchNorm2d(256, eps=0.001, momentum=0.01, affine=True, track_running_stats=True) (2): ReLU() ) ) ) (dense_head): AnchorHeadSingle( (cls_loss_func): SigmoidFocalClassificationLoss() (reg_loss_func): WeightedSmoothL1Loss() (dir_loss_func): WeightedCrossEntropyLoss() (conv_cls): Conv2d(512, 8, kernel_size=(1, 1), stride=(1, 1)) (conv_box): Conv2d(512, 28, kernel_size=(1, 1), stride=(1, 1)) (conv_dir_cls): Conv2d(512, 8, kernel_size=(1, 1), stride=(1, 1)) ) (point_head): PointHeadSimple( (cls_loss_func): SigmoidFocalClassificationLoss() (cls_layers): Sequential( (0): Linear(in_features=512, out_features=256, bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Linear(in_features=256, out_features=256, bias=False) (4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 
(5): ReLU() (6): Linear(in_features=256, out_features=1, bias=True) ) ) (roi_head): PVRCNNHead( (proposal_target_layer): ProposalTargetLayer() (reg_loss_func): WeightedSmoothL1Loss() (roi_grid_pool_layer): StackSAModuleMSG( (groupers): ModuleList( (0): QueryAndGroup() (1): QueryAndGroup() ) (mlps): ModuleList( (0): Sequential( (0): Conv2d(131, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) (1): Sequential( (0): Conv2d(131, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): ReLU() ) ) ) (shared_fc_layer): Sequential( (0): Conv1d(27648, 256, kernel_size=(1,), stride=(1,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Dropout(p=0.3, inplace=False) (4): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (5): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (6): ReLU() ) (cls_layers): Sequential( (0): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU() (3): Dropout(p=0.3, inplace=False) (4): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (5): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (6): ReLU() (7): Conv1d(256, 1, kernel_size=(1,), stride=(1,)) ) (reg_layers): Sequential( (0): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, 
track_running_stats=True) (2): ReLU() (3): Dropout(p=0.3, inplace=False) (4): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False) (5): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (6): ReLU() (7): Conv1d(256, 7, kernel_size=(1,), stride=(1,)) ) ) ) 2024-09-10 14:15:23,548 INFO **********************Start training custom_models/pv_rcnn(default)********************** epochs: 0%| | 0/80 [00:00<?, ?it/sCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name 2024-09-10 14:15:25,937 INFO Train: 1/80 ( 1%) [ 0/9 ( 0%)] Loss: 342.2 (342.) LR: 1.000e-03 Time cost: 00:02/00:20 [00:02/26:45] Acc_iter 1 Data time: 0.04(0.04) Forward time: 2.19(2.19) Batch time: 2.23(2.23) Current_batch size 2 ['000002' '000004'] current batch file name Current_batch size 2 ['000005' '000006'] current batch file name Current_batch size 2 ['000008' '000009'] current batch file name Current_batch size 2 ['000011' '000012'] current batch file name Current_batch size 2 ['000013' '000015'] current batch file name Current_batch size 2 ['000016' '000017'] current batch file name Current_batch size 2 ['000018' '000019'] current batch file name Current_batch size 2 ['000020' '000021'] current batch file name 2024-09-10 14:15:30,571 INFO Train: 1/80 ( 1%) [ 8/9 ( 89%)] Loss: 505.2 (801.) LR: 1.017e-03 Time cost: 00:06/00:00 [00:07/09:03] Acc_iter 9 Data time: 0.00(0.01) Forward time: 0.60(0.75) Batch time: 0.60(0.76) epochs: 1%|█▉ | 1/80 [00:07<09:29, 7.20s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name 2024-09-10 14:15:31,543 INFO Train: 2/80 ( 2%) [ 0/9 ( 0%)] Loss: 252.3 (252.) 
LR: 1.022e-03 Time cost: 00:00/00:06 [00:07/08:21] Acc_iter 10 Data time: 0.08(0.08) Forward time: 0.62(0.62) Batch time: 0.70(0.70) Current_batch size 2 ['000002' '000004'] current batch file name Current_batch size 2 ['000005' '000006'] current batch file name Current_batch size 2 ['000008' '000009'] current batch file name Current_batch size 2 ['000011' '000012'] current batch file name Current_batch size 2 ['000013' '000015'] current batch file name Current_batch size 2 ['000016' '000017'] current batch file name Current_batch size 2 ['000018' '000019'] current batch file name Current_batch size 2 ['000020' '000021'] current batch file name 2024-09-10 14:15:36,277 INFO Train: 2/80 ( 2%) [ 8/9 ( 89%)] Loss: 331.3 (682.) LR: 1.077e-03 Time cost: 00:05/00:00 [00:12/07:04] Acc_iter 18 Data time: 0.00(0.01) Forward time: 0.60(0.59) Batch time: 0.60(0.60) epochs: 2%|███▊ | 2/80 [00:13<08:20, 6.41s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name 2024-09-10 14:15:37,409 INFO Train: 3/80 ( 4%) [ 0/9 ( 0%)] Loss: 99.24 (99.2) LR: 1.086e-03 Time cost: 00:00/00:06 [00:13/08:24] Acc_iter 19 Data time: 0.09(0.09) Forward time: 0.63(0.63) Batch time: 0.72(0.72) Current_batch size 2 ['000002' '000004'] current batch file name Current_batch size 2 ['000005' '000006'] current batch file name Current_batch size 2 ['000008' '000009'] current batch file name Current_batch size 2 ['000011' '000012'] current batch file name Current_batch size 2 ['000013' '000015'] current batch file name Current_batch size 2 ['000016' '000017'] current batch file name Current_batch size 2 ['000018' '000019'] current batch file name Current_batch size 2 ['000020' '000021'] current batch file name 2024-09-10 14:15:42,148 INFO Train: 3/80 ( 4%) [ 8/9 ( 89%)] Loss: 244.1 (804.) 
LR: 1.180e-03 Time cost: 00:05/00:00 [00:18/07:00] Acc_iter 27 Data time: 0.00(0.01) Forward time: 0.60(0.59) Batch time: 0.60(0.61) epochs: 4%|█████▋ | 3/80 [00:18<07:50, 6.12s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name 2024-09-10 14:15:43,158 INFO Train: 4/80 ( 5%) [ 0/9 ( 0%)] Loss: 487.5 (487.) LR: 1.194e-03 Time cost: 00:00/00:06 [00:19/08:07] Acc_iter 28 Data time: 0.07(0.07) Forward time: 0.63(0.63) Batch time: 0.70(0.70) Current_batch size 2 ['000002' '000004'] current batch file name Current_batch size 2 ['000005' '000006'] current batch file name Current_batch size 2 ['000008' '000009'] current batch file name Current_batch size 2 ['000011' '000012'] current batch file name Current_batch size 2 ['000013' '000015'] current batch file name Current_batch size 2 ['000016' '000017'] current batch file name Current_batch size 2 ['000018' '000019'] current batch file name Current_batch size 2 ['000020' '000021'] current batch file name 2024-09-10 14:15:47,780 INFO Train: 4/80 ( 5%) [ 8/9 ( 89%)] Loss: 478.9 (698.) LR: 1.324e-03 Time cost: 00:05/00:00 [00:24/06:45] Acc_iter 36 Data time: 0.00(0.01) Forward time: 0.50(0.58) Batch time: 0.50(0.59) epochs: 5%|███████▋ | 4/80 [00:24<07:34, 5.98s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name 2024-09-10 14:15:48,960 INFO Train: 5/80 ( 6%) [ 0/9 ( 0%)] Loss: 239.0 (239.) 
LR: 1.343e-03 Time cost: 00:00/00:06 [00:25/08:16] Acc_iter 37 Data time: 0.08(0.08) Forward time: 0.64(0.64) Batch time: 0.72(0.72) Current_batch size 2 ['000002' '000004'] current batch file name Current_batch size 2 ['000005' '000006'] current batch file name Current_batch size 2 ['000008' '000009'] current batch file name Current_batch size 2 ['000011' '000012'] current batch file name Current_batch size 2 ['000013' '000015'] current batch file name Current_batch size 2 ['000016' '000017'] current batch file name Current_batch size 2 ['000018' '000019'] current batch file name Current_batch size 2 ['000020' '000021'] current batch file name 2024-09-10 14:15:53,661 INFO Train: 5/80 ( 6%) [ 8/9 ( 89%)] Loss: 312.4 (641.) LR: 1.508e-03 Time cost: 00:05/00:00 [00:30/06:47] Acc_iter 45 Data time: 0.00(0.01) Forward time: 0.60(0.59) Batch time: 0.61(0.60) epochs: 6%|█████████▌ | 5/80 [00:30<07:24, 5.93s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name 2024-09-10 14:15:54,762 INFO Train: 6/80 ( 8%) [ 0/9 ( 0%)] Loss: 311.6 (312.) LR: 1.531e-03 Time cost: 00:00/00:06 [00:31/07:50] Acc_iter 46 Data time: 0.08(0.08) Forward time: 0.62(0.62) Batch time: 0.70(0.70) Current_batch size 2 ['000002' '000004'] current batch file name Current_batch size 2 ['000005' '000006'] current batch file name Current_batch size 2 ['000008' '000009'] current batch file name Current_batch size 2 ['000011' '000012'] current batch file name 2024-09-10 14:15:57,071 INFO Train: 6/80 ( 8%) [ 4/9 ( 44%)] Loss: 925.3 (666.) 
LR: 1.628e-03 Time cost: 00:03/00:03 [00:33/06:43] Acc_iter 50 Data time: 0.00(0.02) Forward time: 0.59(0.58) Batch time: 0.60(0.60)
Current_batch size 2 ['000013' '000015'] current batch file name
Current_batch size 2 ['000016' '000017'] current batch file name
Current_batch size 2 ['000018' '000019'] current batch file name
Current_batch size 2 ['000020' '000021'] current batch file name
2024-09-10 14:15:59,269 INFO Train: 6/80 ( 8%) [ 8/9 ( 89%)] Loss: 0.1672 (677.) LR: 1.731e-03 Time cost: 00:05/00:00 [00:35/06:25] Acc_iter 54 Data time: 0.00(0.01) Forward time: 0.50(0.57) Batch time: 0.50(0.58)
epochs: 8%|███████████▍ | 6/80 [00:36<07:11, 5.83s/itCurrent_batch size 2 | 0/9 [00:00<?, ?it/s] ['000000' '000001'] current batch file name
2024-09-10 14:16:00,407 INFO Train: 7/80 ( 9%) [ 0/9 ( 0%)] Loss: 303.7 (304.) LR: 1.758e-03 Time cost: 00:00/00:06 [00:36/07:51] Acc_iter 55 Data time: 0.08(0.08) Forward time: 0.62(0.62) Batch time: 0.71(0.71)
Current_batch size 2 ['000002' '000004'] current batch file name
Current_batch size 2 ['000005' '000006'] current batch file name
Current_batch size 2 ['000008' '000009'] current batch file name
Current_batch size 2 ['000011' '000012'] current batch file name
Current_batch size 2 ['000013' '000015'] current batch file name
Current_batch size 2 ['000016' '000017'] current batch file name
Current_batch size 2 ['000018' '000019'] current batch file name
Current_batch size 2 ['000020' '000021'] current batch file name
epochs: 8%|███████████▍ | 6/80 [00:40<08:24, 6.82s/it]
Traceback (most recent call last):
  File "/home/ubuntu/v-detr/OpenPCDet/tools/train.py", line 234, in <module>
    main()
  File "/home/ubuntu/v-detr/OpenPCDet/tools/train.py", line 178, in main
    train_model(
  File "/home/ubuntu/v-detr/OpenPCDet/tools/train_utils/train_utils.py", line 189, in train_model
    accumulated_iter = train_one_epoch(
  File "/home/ubuntu/v-detr/OpenPCDet/tools/train_utils/train_utils.py", line 65, in train_one_epoch
    loss, tb_dict, disp_dict = model_func(model, batch)
  File "/home/ubuntu/v-detr/OpenPCDet/tools/../pcdet/models/__init__.py", line 46, in model_func
    ret_dict, tb_dict, disp_dict = model(batch_dict)
  File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/v-detr/OpenPCDet/tools/../pcdet/models/detectors/pv_rcnn.py", line 19, in forward
    batch_dict = cur_module(batch_dict)
  File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/v-detr/OpenPCDet/tools/../pcdet/models/backbones_3d/spconv_backbone.py", line 159, in forward
    x = self.conv_input(input_sp_tensor)
  File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/spconv/pytorch/modules.py", line 142, in forward
    input = input.replace_feature(module(input.features))
  File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward
    return F.batch_norm(
  File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/functional.py", line 2436, in batch_norm
    _verify_batch_size(input.size())
  File "/home/ubuntu/miniconda3/envs/pcd/lib/python3.9/site-packages/torch/nn/functional.py", line 2404, in _verify_batch_size
    raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 16])
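A note on reading the traceback above: the failing BatchNorm is called on `input.features` of the sparse tensor, which is a 2-D tensor whose first dimension is the number of active voxels in the whole batch, not the DataLoader batch size. A size of `[1, 16]` therefore suggests that this batch was voxelized down to a single voxel, which would explain why `drop_last` made no difference. One possible (unconfirmed) cause of the randomness is data augmentation or point sampling moving almost all points outside the configured point cloud range on some iterations. The sketch below is plain Python with no PyTorch dependency; `verify_batch_size` mirrors the check that raised the error, and `has_enough_voxels` is a hypothetical guard, not part of OpenPCDet.

```python
def verify_batch_size(size):
    """Pure-Python mirror of the check that raised the ValueError above.

    `size` is the shape of the BatchNorm input: (N, C) or (N, C, *spatial).
    """
    # Values per channel = first dim times every dim after the channel dim.
    values_per_channel = size[0]
    for dim in size[2:]:
        values_per_channel *= dim
    if values_per_channel == 1:
        raise ValueError(
            "Expected more than 1 value per channel when training, "
            "got input size {}".format(size)
        )


def has_enough_voxels(num_voxels, min_voxels=2):
    """Hypothetical guard: True if a batch has enough voxels for BatchNorm.

    `min_voxels=2` is the bare minimum to pass the check above; it is an
    assumption for illustration, not a recommended training threshold.
    """
    return num_voxels >= min_voxels


if __name__ == "__main__":
    for shape in [(1, 16), (2, 16)]:
        try:
            verify_batch_size(shape)
            print(shape, "accepted")
        except ValueError as err:
            print(shape, "rejected:", err)
```

To debug further, it may help to print the voxel count of each batch next to the file names (assuming the batch dict exposes the voxel tensor under a key such as `voxels`) and to check whether the count collapses right before the crash. If it does, comparing the raw point coordinates of those scenes against the configured point cloud range and voxel size would be a reasonable next step.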
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
@Serzhanov did you ever manage to fix this? I am experiencing the same issue.