Small Target Detection Combining Regional Stability and Saliency in a Color Image
Jing Lou,   Wei Zhu,   Huan Wang,   Mingwu Ren
Nanjing University of Science and Technology

Fig. 1   Framework of the proposed RSS model

In this paper, we address the problem of detecting small targets in a color image from the perspectives of both stability and saliency. First, we treat small target detection as a stable region extraction problem. Several stability criteria are applied to generate a stability map, which consists of a set of locally stable regions derived from sequential Boolean maps. Second, considering the local contrast between a small target and its surroundings, we obtain a saliency map by comparing the color vector of each pixel with its Gaussian-blurred version. Finally, the stability and saliency maps are integrated by pixel-wise multiplication to remove false alarms. In addition, we introduce a set of integration models that combine several existing stability and saliency methods, and use them to demonstrate the validity of the proposed framework. Experimental results show that our model adapts to target size variations and performs favorably in terms of precision, recall and F-measure on three challenging datasets.
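
The saliency and integration steps described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the Euclidean distance, the kernel width (`sigma`, `radius`), the border handling, and all function names are assumptions made for illustration, and the stability map is supplied by hand rather than derived from Boolean maps.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_channel(channel, kernel):
    """Separable Gaussian blur of a 2-D list, replicating border pixels."""
    radius = len(kernel) // 2
    offsets = range(-radius, radius + 1)
    height, width = len(channel), len(channel[0])

    def clamp(i, upper):
        return min(max(i, 0), upper - 1)

    # Horizontal pass over every row.
    rows = [[sum(k * row[clamp(x + d, width)] for d, k in zip(offsets, kernel))
             for x in range(width)]
            for row in channel]
    # Vertical pass over the horizontally blurred rows.
    return [[sum(k * rows[clamp(y + d, height)][x] for d, k in zip(offsets, kernel))
             for x in range(width)]
            for y in range(height)]


def saliency_map(image, sigma=2.0, radius=4):
    """Distance between each pixel's color vector and its blurred version.

    `image` is a list of channels, each a 2-D list of floats.
    The Euclidean distance is an assumption of this sketch.
    """
    kernel = gaussian_kernel(sigma, radius)
    blurred = [blur_channel(c, kernel) for c in image]
    height, width = len(image[0]), len(image[0][0])
    return [[math.sqrt(sum((c[y][x] - b[y][x]) ** 2
                           for c, b in zip(image, blurred)))
             for x in range(width)]
            for y in range(height)]


def fuse(stability, saliency):
    """Integrate the two maps by pixel-wise multiplication."""
    return [[s * t for s, t in zip(row_s, row_t)]
            for row_s, row_t in zip(stability, saliency)]


# Toy input: a 15 x 15 three-channel image, flat background, one bright pixel.
size = 15
image = [[[0.2] * size for _ in range(size)] for _ in range(3)]
for channel in image:
    channel[7][7] = 1.0

# Hand-made stability map (assumption): the true target plus a false alarm
# in the corner, where the image is locally uniform.
stability = [[0.0] * size for _ in range(size)]
stability[7][7] = 1.0
stability[0][0] = 1.0

fused = fuse(stability, saliency_map(image))
```

In this toy case the fused map keeps the response at the bright pixel and suppresses the corner candidate, because the saliency of a locally uniform region is essentially zero.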

  • Jing Lou, Wei Zhu, Huan Wang, Mingwu Ren*, “Small Target Detection Combining Regional Stability and Saliency in a Color Image,” Multimedia Tools and Applications, vol. 76, no. 13, pp. 14781–14798, 2017.  doi:10.1007/s11042-016-4025-7
    PDF Bib MATLAB Code Slides (in Chinese)
  • For MSER [18], we extract covariant regions directly with the VL_MSER function of the open-source VLFeat library [21]. In the above paper, we discuss the main parameters of VL_MSER and present the statistical metrics and results. The MATLAB code can be downloaded from GitHub or Baidu Cloud.
  • We provide a benchmark database for evaluating small target detection. The database contains three datasets (1,093 color images in total) and provides a pixel-level ground-truth mask for each color image. Some statistics and features of these datasets are summarized in the following table. We also provide the MATLAB code we developed for evaluating detection results on this database. The code and data can be downloaded from GitHub or Baidu Cloud.

    Dataset  Scene    Images  Description
    1        Sky      805     Frames #001~#752: single target; frames #753~#805: no target
    2        Sea-Sky  208     Single target
    3 (a)    Ground   80      Single target

    (a) Dataset 3 is available at [8]; each image is shrunk to 20% of its original size for demonstration in our experiments.
  • Note that although each demo is shown as a video, every video file is assembled from sequential frames, and each frame is obtained by applying the proposed RSS model to that image independently.
(upper left) – input;     (upper right) – result of the proposed RSS model;     (lower left) – stability map;     (lower right) – saliency map.
Demo 1  (805 frames)  Download
Demo 2  (208 frames)  Download
Demo 3  (80 frames)  Download
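
The precision, recall and F-measure scores used to evaluate detections against the pixel-level ground truth masks can be computed as in the following Python sketch. The function name, the `beta_sq` weighting parameter and its default of 1.0, and the convention of returning 0.0 when a score is undefined (for example, on frames with no target) are assumptions for illustration; the released MATLAB evaluation code may differ.

```python
def precision_recall_f(detection, ground_truth, beta_sq=1.0):
    """Pixel-level precision, recall and F-measure for two binary masks.

    Both masks are 2-D lists of 0/1 values with the same shape.
    `beta_sq` weights precision against recall; 1.0 gives the balanced
    F1 score (an assumption, not necessarily the paper's setting).
    """
    tp = fp = fn = 0
    for det_row, gt_row in zip(detection, ground_truth):
        for det, gt in zip(det_row, gt_row):
            if det and gt:
                tp += 1      # detected target pixel
            elif det:
                fp += 1      # false alarm
            elif gt:
                fn += 1      # missed target pixel

    # Undefined ratios are reported as 0.0 (assumed convention).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta_sq * precision + recall
    f_measure = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Toy masks: four target pixels, two of which are detected, no false alarms.
ground_truth = [[0, 1, 1],
                [0, 1, 1],
                [0, 0, 0]]
detection = [[0, 1, 1],
             [0, 0, 0],
             [0, 0, 0]]

precision, recall, f_measure = precision_recall_f(detection, ground_truth)
```

Salient object detection benchmarks often set `beta_sq` to 0.3 to emphasize precision; passing that value changes only the F-measure, not precision or recall.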

The authors thank all of the anonymous reviewers for their insights and suggestions, which were very helpful in improving this manuscript. They thank Haiyang Zhang for useful discussions. They also thank Mei Zhang and Huaiping Zhang for their kind proofreading of the manuscript. This work is supported by the National Natural Science Foundation of China under Grant 61231014.

Achanta R, Hemami S, Estrada F, Süsstrunk S (2009) Frequency-tuned salient region detection. In: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 1597–1604
Borji A, Cheng M-M, Jiang H, Li J (2015) Salient object detection: a benchmark. IEEE Trans Image Process 24(12): 5706–5722
Dragon R, Ostermann J, Van Gool L (2013) Robust realtime motion-split-and-merge for motion segmentation. In: Proc. Ger. Conf. Pattern Recognit., 425–434
Matas J, Chum O, Urban M, Pajdla T (2004) Robust wide-baseline stereo from maximally stable extremal regions. Image Vis Comput 22(10): 761–767
Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9(1): 62–66
Vedaldi A, Fulkerson B (2008) VLFeat: An open and portable library of computer vision algorithms, version 0.9.19.
Latest update:  Jun 8, 2017