Xiuwen Chen, Li Fang, Long Ye, Qin Zhang. Deep Video Harmonization by Improving Spatial-temporal Consistency[J]. Machine Intelligence Research, 2024, 21(1): 46-54. DOI: 10.1007/s11633-023-1447-3

Deep Video Harmonization by Improving Spatial-temporal Consistency

  • Video harmonization is an important step in video editing that achieves visual consistency by adjusting foreground appearances in both the spatial and temporal dimensions. Previous methods typically harmonize on only a single scale or ignore the inaccuracy of flow estimation, which limits harmonization performance. In this work, we propose a novel architecture for video harmonization that makes full use of spatiotemporal features and yields temporally consistent harmonized results. We introduce multiscale harmonization, exploiting nonlocal similarity on each scale to make the foreground more consistent with the background. We also propose a foreground temporal aggregator that dynamically aggregates neighboring frames at the feature level, alleviating the effect of inaccurately estimated flow and ensuring temporal consistency. The experimental results demonstrate the superiority of our method over other state-of-the-art methods in both quantitative and visual comparisons.
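The foreground temporal aggregator described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes neighboring-frame features have already been warped toward the current frame by optical flow, and uses a per-pixel softmax over cosine similarity as a hypothetical dynamic weighting, so that regions where the flow is poorly estimated (low similarity) contribute less to the aggregate.

```python
import numpy as np

def aggregate_foreground_features(current, neighbors):
    """Hedged sketch of dynamic feature-level temporal aggregation.

    current:   (C, H, W) feature map of the current frame.
    neighbors: list of (C, H, W) feature maps from neighboring frames,
               assumed to be already flow-warped toward the current frame.

    Each frame (including the current one) is weighted per pixel by a
    softmax over its cosine similarity to the current-frame features,
    so misaligned regions caused by inaccurate flow are down-weighted.
    """
    stacked = np.stack([current] + list(neighbors))            # (T, C, H, W)
    # Normalize the current frame's features along the channel axis.
    cur_n = current / (np.linalg.norm(current, axis=0, keepdims=True) + 1e-8)
    sims = []
    for feat in stacked:
        feat_n = feat / (np.linalg.norm(feat, axis=0, keepdims=True) + 1e-8)
        sims.append((cur_n * feat_n).sum(axis=0))              # (H, W) cosine sim
    sims = np.stack(sims)                                      # (T, H, W)
    # Per-pixel softmax over the temporal axis gives dynamic weights.
    weights = np.exp(sims) / np.exp(sims).sum(axis=0, keepdims=True)
    return (weights[:, None] * stacked).sum(axis=0)            # (C, H, W)
```

When every neighbor is perfectly aligned with the current frame, the weights are uniform and the aggregate reduces to the current features; as alignment degrades, the aggregation shifts weight back toward the current frame.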
