Deep Gradient Learning for Efficient Camouflaged Object Detection

This paper introduces the deep gradient network (DGNet), a novel deep framework that exploits object gradient supervision for camouflaged object detection (COD). It decouples the task into two connected branches, i.e., a context encoder and a texture encoder. The essential connection between them is the gradient-induced transition, which represents a soft grouping between context and texture features. Benefiting from this simple but efficient framework, DGNet outperforms existing state-of-the-art COD models by a large margin. Notably, our efficient version, DGNet-S, runs in real time (80 fps) and achieves results comparable to the cutting-edge model JCSOD-CVPR21 with only 6.82% of its parameters. Application results further show that the proposed DGNet performs well in polyp segmentation, defect detection, and transparent object segmentation. The code will be made available.



Camouflaged object detection (COD) aims to segment objects with artificial or natural patterns that “perfectly” blend into the background to avoid being discovered. Several successful applications, such as medical image analysis (e.g., polyp and lung infection segmentation), video analysis (e.g., motion segmentation, surveillance, and autonomous driving), and recreational art, have demonstrated COD's scientific and practical value.


Recent studies present compelling results based on supervision from the whole object-level ground-truth mask. Later, various sophisticated techniques, e.g., boundary-based and uncertainty-guided methods, were developed to augment COD's underlying representations. However, features learned from boundary-supervised or uncertainty-based models usually respond to the sparse edges of camouflaged objects, thereby introducing noisy features, especially in complex scenes (see Fig. 1(a)). Besides, the boundaries of camouflaged objects are often “indefinable” or “fuzzy”; thus, they do not pop out during a quick visual scan. We notice that despite the object's camouflage, some clues remain, shown in the first column of Fig. 1 (white speckles). Instead of extracting only boundary or uncertainty regions, we are interested in how the network mines these “discriminative patterns” inside the object.

Fig. 1 Feature visualization of learned texture. We observe that the proposed DGNet-S under object boundary supervision (a) produces diffuse noise in the background. By contrast, object gradient supervision (b) forces the network to focus on regions where the intensity changes dramatically.

From this perspective, we present our deep gradient network (DGNet), trained via explicit supervision of the object-level gradient map. The underlying hypothesis is that there are some intensity changes inside camouflaged objects. To ease the learning task, we decouple DGNet into two connected branches, i.e., a context encoder and a texture encoder. The former can be viewed as a contextual semantics learner, while the latter acts as a structural texture extractor. In this way, we alleviate the feature ambiguity between the high-level and low-level features extracted by the individual branches. To sufficiently aggregate the discriminative features generated by the two branches, we further design a gradient-induced transition (GIT) module that collaboratively ensembles the multi-source feature space at different group scales (i.e., soft grouping). Fig. 1(b) shows that our DGNet can detect texture patterns while suppressing background noise via an intensity-sensitive strategy that focuses on the interior of a camouflaged object.
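To make the supervision signal concrete, the object-level gradient map can be thought of as an intensity-gradient magnitude restricted to the ground-truth object region. The following is a minimal illustrative sketch of such a target, assuming a grayscale image and a binary mask; the function name and the use of simple finite differences (rather than the paper's exact gradient operator) are our own assumptions.

```python
import numpy as np

def object_gradient_map(image, mask):
    """Illustrative sketch: build an object-level gradient supervision target.

    image: (H, W) grayscale array in [0, 1]
    mask:  (H, W) binary ground-truth object mask
    Returns a gradient-magnitude map restricted to the object region.
    """
    # finite-difference gradients along both axes
    gy, gx = np.gradient(image.astype(np.float64))
    grad = np.sqrt(gx ** 2 + gy ** 2)
    # keep only intensity changes inside the camouflaged object
    grad = grad * (mask > 0)
    # normalize to [0, 1] so it can serve as a regression target
    if grad.max() > 0:
        grad = grad / grad.max()
    return grad

# toy example: a textured patch inside a flat background
img = np.zeros((8, 8))
img[2:6, 2:6] = np.random.default_rng(0).random((4, 4))
msk = np.zeros((8, 8))
msk[2:6, 2:6] = 1
target = object_gradient_map(img, msk)
```

Under this construction, background pixels contribute nothing to the loss, which matches the motivation in Fig. 1(b): supervision concentrates on intra-object intensity changes rather than on sparse boundaries.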


Extensive experiments on three challenging COD benchmarks illustrate that the proposed DGNet achieves state-of-the-art (SOTA) performance without introducing any complicated structures. Furthermore, we implement an efficient version, DGNet-S, with 8.3 M parameters, which achieves the fastest inference speed (80 fps) among COD-related baselines. Notably, it has only 6.82% of the parameters of the cutting-edge model JCSOD-CVPR21 while achieving comparable performance. These results show that DGNet significantly narrows the gap between scientific research and practical application. Three downstream applications (see Section 5) of our DGNet also support this conclusion. The major contributions of this paper are summarized as follows:


1) We introduce a novel deep gradient-based framework, dubbed DGNet, for addressing the camouflaged object detection task.

2) We propose a gradient-induced transition to automatically group features from the context and texture branches according to the soft grouping strategy.

3) We present three applications, polyp segmentation, defect detection, and transparent object segmentation, and achieve good performance on each.
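The soft-grouping idea in contribution 2 can be sketched as a group-wise interaction between context and texture features: channels from both branches are split into groups, and each group is fused before the groups are re-assembled. The sketch below is a single-scale toy version in numpy; the function name is hypothetical, and a simple average stands in for the learned convolutions the actual GIT module would use across multiple group scales.

```python
import numpy as np

def soft_group_fuse(f_context, f_texture, num_groups):
    """Illustrative sketch of GIT-style soft grouping (single group scale).

    f_context, f_texture: (C, H, W) feature maps, C divisible by num_groups
    Returns a fused (C, H, W) feature map built group by group.
    """
    C = f_context.shape[0]
    g = C // num_groups
    fused = []
    for i in range(num_groups):
        ctx = f_context[i * g:(i + 1) * g]  # context channels of group i
        tex = f_texture[i * g:(i + 1) * g]  # texture channels of group i
        # per-group interaction; a learned conv would replace this average
        fused.append(0.5 * (ctx + tex))
    return np.concatenate(fused, axis=0)

# toy features: 8 channels, 4x4 spatial resolution, 4 groups
ctx = np.ones((8, 4, 4))
tex = np.zeros((8, 4, 4))
out = soft_group_fuse(ctx, tex, num_groups=4)
```

In the paper's formulation, such fusion is ensembled over several group scales, letting the module trade off coarse (few large groups) against fine (many small groups) context-texture interactions.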



Ge-Peng Ji, Deng-Ping Fan, Yu-Cheng Chou, Dengxin Dai, Alexander Liniger, Luc Van Gool

Release Date: 2023-03-15