Zhaozhong Wang, Dian Shao, Lei Zhang, Zuowei Zhang, Binglu Wang. SAMDistill: SAM-based Spatial-temporal Distillation for Robust 3D Object Detection[J]. Machine Intelligence Research. DOI: 10.1007/s11633-025-1586-9

SAMDistill: SAM-based Spatial-temporal Distillation for Robust 3D Object Detection

  • Multicamera 3D object detection has emerged as a research focus because of its cost-effectiveness. Recent methods perform well on clean datasets but fail in complex environments, where adverse weather introduces detection challenges (low foreground-background contrast, occlusion) similar to those in camouflage scenarios. Because it is trained on large-scale datasets, the segment anything model (SAM) has strong generalizability and robustness, but it lacks the ability to capture spatial structure and depth information, which are critical in 3D tasks. To address this issue, we propose SAMDistill, a distillation framework that uses a pretrained LiDAR detector as the teacher and three carefully designed distillation losses: 1) a spatial-temporal feature alignment loss, which ensures geometric consistency in static scenes while propagating cross-frame semantic context; 2) a relation distillation loss extended to multiscale layers; and 3) an instance distillation loss, which combines regression distillation and classification distillation to guide the model toward areas that are difficult to learn. Experiments show that our method achieves state-of-the-art performance on nuScenes and the noisy dataset nuScenes-C, and that it generalizes across multiple teacher-student configurations.
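
The abstract gives no formulas for the losses, so the following is only a rough illustration of two of the ideas it names, feature alignment and multiscale relation distillation, and not the SAMDistill implementation. The choice of mean squared error for alignment, cosine affinity with an L1 penalty for relations, and all function names are assumptions made for this sketch. Features are plain lists of vectors (one vector per spatial location) to keep the example dependency-free.

```python
import math

def feature_alignment_loss(student, teacher):
    """Assumed form: mean squared error between student and teacher features.
    Both inputs are lists of equal-length vectors, one per location."""
    total, count = 0.0, 0
    for s_vec, t_vec in zip(student, teacher):
        for s, t in zip(s_vec, t_vec):
            total += (s - t) ** 2
            count += 1
    return total / count

def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def affinity(features):
    """Pairwise cosine similarity between all locations (an N x N matrix)."""
    return [[cosine(u, v) for v in features] for u in features]

def relation_loss(student, teacher):
    """Assumed form: mean absolute difference between affinity matrices.
    Only the number of locations must match, not the channel width."""
    a_s, a_t = affinity(student), affinity(teacher)
    n = len(student)
    return sum(abs(a_s[i][j] - a_t[i][j])
               for i in range(n) for j in range(n)) / (n * n)

def multiscale_relation_loss(student_scales, teacher_scales):
    """Average the relation loss over several feature scales (layers)."""
    losses = [relation_loss(s, t)
              for s, t in zip(student_scales, teacher_scales)]
    return sum(losses) / len(losses)

# Toy example: the student features are the teacher features scaled by 2.
teacher_feats = [[1.0, 0.0], [0.0, 1.0]]
student_feats = [[2.0, 0.0], [0.0, 2.0]]
align = feature_alignment_loss(student_feats, teacher_feats)
relation = multiscale_relation_loss([student_feats], [teacher_feats])
```

In this toy case the alignment term penalizes the scale mismatch, while the relation term is essentially zero because the pairwise structure between locations is unchanged. That difference is one reason the two kinds of loss are often combined in distillation setups.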