PV-RCNN dimension mismatch when training on my KITTI dataset #1746

@PowerBearMeng

The concatenated point features have 768 channels, but the fusion layer expects 640, so training fails. Can anyone help? I am training the PV-RCNN model on the KITTI dataset.

DEBUG VSA branch 0 shape: torch.Size([4096, 384])
DEBUG VSA branch 1 shape: torch.Size([4096, 32])
DEBUG VSA branch 2 shape: torch.Size([4096, 32])
DEBUG VSA branch 3 shape: torch.Size([4096, 64])
DEBUG VSA branch 4 shape: torch.Size([4096, 128])
DEBUG VSA branch 5 shape: torch.Size([4096, 128])
DEBUG VSA sum channels: 768
epochs: 0%| | 0/80 [00:01<?, ?it/s]

Traceback (most recent call last):
  File "train.py", line 233, in <module>
    main()
  File "train.py", line 178, in main
    train_model(
  File "/home/yty/mfh/code/OpenPCDet/tools/train_utils/train_utils.py", line 180, in train_model
    accumulated_iter = train_one_epoch(
  File "/home/yty/mfh/code/OpenPCDet/tools/train_utils/train_utils.py", line 56, in train_one_epoch
    loss, tb_dict, disp_dict = model_func(model, batch)
  File "/home/yty/mfh/code/OpenPCDet/tools/../pcdet/models/__init__.py", line 44, in model_func
    ret_dict, tb_dict, disp_dict = model(batch_dict)
  File "/home/yty/miniconda3/envs/pcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yty/mfh/code/OpenPCDet/tools/../pcdet/models/detectors/pv_rcnn.py", line 11, in forward
    batch_dict = cur_module(batch_dict)
  File "/home/yty/miniconda3/envs/pcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yty/mfh/code/OpenPCDet/tools/../pcdet/models/backbones_3d/pfe/voxel_set_abstraction.py", line 409, in forward
    point_features = self.vsa_point_feature_fusion(point_features.view(-1, point_features.shape[-1]))
  File "/home/yty/miniconda3/envs/pcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yty/miniconda3/envs/pcdet/lib/python3.8/site-packages/torch/nn/modules/container.py", line 204, in forward
    input = module(input)
  File "/home/yty/miniconda3/envs/pcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yty/miniconda3/envs/pcdet/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (4096x768 and 640x128)
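
To make the mismatch concrete, the numbers in the log can be tallied as in the sketch below. It uses only values copied from the DEBUG lines and the error message; the variable names are illustrative and do not come from OpenPCDet.

```python
# Channel widths per VSA branch, copied from the DEBUG lines in the log.
branch_channels = [384, 32, 32, 64, 128, 128]

# Input width of the Linear layer that failed, read off the error message:
# mat2 is reported as (640 x 128), i.e. 640 inputs and 128 outputs.
fusion_in_features = 640

actual = sum(branch_channels)
surplus = actual - fusion_in_features

print(f"concatenated width: {actual}")
print(f"layer expects:      {fusion_in_features}")
print(f"surplus channels:   {surplus}")
```

The concatenated features are 128 channels wider than the layer was built for, so the layer size and the runtime feature widths were derived from different numbers.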
My pv_rcnn.yaml:
CLASS_NAMES: ['Car', 'Bicycle', 'Pedestrian']

DATA_CONFIG:
    BASE_CONFIG: cfgs/dataset_configs/custom_pvrcnn.yaml

MODEL:
    NAME: PVRCNN
    VFE:
        NAME: MeanVFE

    BACKBONE_3D:
        NAME: VoxelBackBone8x

    MAP_TO_BEV:
        NAME: HeightCompression
        NUM_BEV_FEATURES: 256

    BACKBONE_2D:
        NAME: BaseBEVBackbone

        LAYER_NUMS: [5, 5]
        LAYER_STRIDES: [1, 2]
        NUM_FILTERS: [128, 256]
        UPSAMPLE_STRIDES: [1, 2]
        NUM_UPSAMPLE_FILTERS: [256, 256]

    DENSE_HEAD:
        NAME: AnchorHeadSingle
        CLASS_AGNOSTIC: False

        USE_DIRECTION_CLASSIFIER: True
        DIR_OFFSET: 0.78539
        DIR_LIMIT_OFFSET: 0.0
        NUM_DIR_BINS: 2

        ANCHOR_GENERATOR_CONFIG: [
            {
                'class_name': 'Car',
                'anchor_sizes': [[4.49, 1.80, 1.54]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-1.71],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.55,
                'unmatched_threshold': 0.4
            },
            {
                'class_name': 'Pedestrian',
                'anchor_sizes': [[0.47, 0.60, 1.58]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-1.82],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            },
            {
                'class_name': 'Bicycle',
                'anchor_sizes': [[1.39, 0.70, 1.46]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-1.95],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            }
        ]

        TARGET_ASSIGNER_CONFIG:
            NAME: AxisAlignedTargetAssigner
            POS_FRACTION: -1.0
            SAMPLE_SIZE: 512
            NORM_BY_NUM_EXAMPLES: False
            MATCH_HEIGHT: False
            BOX_CODER: ResidualCoder

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'dir_weight': 0.1,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }

    PFE:
        NAME: VoxelSetAbstraction
        POINT_SOURCE: raw_points
        NUM_KEYPOINTS: 2048
        NUM_OUTPUT_FEATURES: 128
        SAMPLE_METHOD: FPS

        FEATURES_SOURCE: ['bev', 'x_conv1', 'x_conv2', 'x_conv3', 'x_conv4', 'raw_points']
        SA_LAYER:
            raw_points:
                MLPS: [[16, 16], [16, 16]]
                POOL_RADIUS: [0.4, 0.8]
                NSAMPLE: [16, 16]
            x_conv1:
                DOWNSAMPLE_FACTOR: 1
                MLPS: [[16, 16], [16, 16]]
                POOL_RADIUS: [0.4, 0.8]
                NSAMPLE: [16, 16]
            x_conv2:
                DOWNSAMPLE_FACTOR: 2
                MLPS: [[32, 32], [32, 32]]
                POOL_RADIUS: [0.8, 1.2]
                NSAMPLE: [16, 32]
            x_conv3:
                DOWNSAMPLE_FACTOR: 4
                MLPS: [[64, 64], [64, 64]]
                POOL_RADIUS: [1.2, 2.4]
                NSAMPLE: [16, 32]
            x_conv4:
                DOWNSAMPLE_FACTOR: 8
                MLPS: [[64, 64], [64, 64]]
                POOL_RADIUS: [2.4, 4.8]
                NSAMPLE: [16, 32]

    POINT_HEAD:
        NAME: PointHeadSimple
        CLS_FC: [256, 256]
        CLASS_AGNOSTIC: True
        USE_POINT_FEATURES_BEFORE_FUSION: True
        TARGET_CONFIG:
            GT_EXTRA_WIDTH: [0.2, 0.2, 0.2]
        LOSS_CONFIG:
            LOSS_REG: smooth-l1
            LOSS_WEIGHTS: {
                'point_cls_weight': 1.0,
            }

    ROI_HEAD:
        NAME: PVRCNNHead
        CLASS_AGNOSTIC: True

        SHARED_FC: [256, 256]
        CLS_FC: [256, 256]
        REG_FC: [256, 256]
        DP_RATIO: 0.3

        NMS_CONFIG:
            TRAIN:
                NMS_TYPE: nms_gpu
                MULTI_CLASSES_NMS: False
                NMS_PRE_MAXSIZE: 9000
                NMS_POST_MAXSIZE: 512
                NMS_THRESH: 0.8
            TEST:
                NMS_TYPE: nms_gpu
                MULTI_CLASSES_NMS: False
                # NMS_PRE_MAXSIZE: 1024
                # NMS_POST_MAXSIZE: 100
                # NMS_THRESH: 0.7
                NMS_PRE_MAXSIZE: 4096
                NMS_POST_MAXSIZE: 300
                NMS_THRESH: 0.7

        ROI_GRID_POOL:
            GRID_SIZE: 6
            MLPS: [[64, 64], [64, 64]]
            POOL_RADIUS: [0.8, 1.6]
            NSAMPLE: [16, 16]
            POOL_METHOD: max_pool

        TARGET_CONFIG:
            BOX_CODER: ResidualCoder
            ROI_PER_IMAGE: 128
            FG_RATIO: 0.5

            SAMPLE_ROI_BY_EACH_CLASS: True
            CLS_SCORE_TYPE: roi_iou

            CLS_FG_THRESH: 0.75
            CLS_BG_THRESH: 0.25
            CLS_BG_THRESH_LO: 0.1
            HARD_BG_RATIO: 0.8

            REG_FG_THRESH: 0.55

        LOSS_CONFIG:
            CLS_LOSS: BinaryCrossEntropy
            REG_LOSS: smooth-l1
            CORNER_LOSS_REGULARIZATION: True
            LOSS_WEIGHTS: {
                'rcnn_cls_weight': 1.0,
                'rcnn_reg_weight': 1.0,
                'rcnn_corner_weight': 1.2,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False

        EVAL_METRIC: kitti

        NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.1
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 2
    NUM_EPOCHS: 80

    OPTIMIZER: adam_onecycle
    LR: 0.01
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10
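
For comparison with the log, here is a rough sketch of the width this config appears to imply for the fusion layer. It rests on two assumptions that the post does not confirm: that the layer input is sized as NUM_BEV_FEATURES plus, for each SA_LAYER entry, the sum of the last width of every MLP group, and that the debug branches are ordered bev, raw_points, x_conv1 to x_conv4. The helper `branch_width` is hypothetical.

```python
# Values copied from the config in this post.
num_bev_features = 256  # MAP_TO_BEV.NUM_BEV_FEATURES
sa_layer_mlps = [
    ("raw_points", [[16, 16], [16, 16]]),
    ("x_conv1", [[16, 16], [16, 16]]),
    ("x_conv2", [[32, 32], [32, 32]]),
    ("x_conv3", [[64, 64], [64, 64]]),
    ("x_conv4", [[64, 64], [64, 64]]),
]

def branch_width(mlps):
    # Assumption: each radius group contributes its last MLP width,
    # and the groups of one branch are concatenated.
    return sum(mlp[-1] for mlp in mlps)

# Assumed branch order: bev first, then the SA layers as listed above.
names = ["bev"] + [name for name, _ in sa_layer_mlps]
expected = [num_bev_features] + [branch_width(m) for _, m in sa_layer_mlps]

# Widths reported by the DEBUG lines in the log, in branch order 0..5.
observed = [384, 32, 32, 64, 128, 128]

mismatched = [i for i, (e, o) in enumerate(zip(expected, observed)) if e != o]

print("expected total:", sum(expected))
print("observed total:", sum(observed))
for i in mismatched:
    print(f"branch {i} ({names[i]}): config implies {expected[i]}, log shows {observed[i]}")
```

Under these assumptions the config accounts for exactly the 640 inputs the layer was built with, and the only branch that disagrees with the log is the first one (256 configured, 384 observed). That would point at the BEV feature width rather than the SA_LAYER settings. The dataset config `custom_pvrcnn.yaml` is not shown here, so the reason for the wider BEV map cannot be confirmed from this post alone.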
