Small Target Detection Combining Regional Stability and Saliency in a Color Image

Jing Lou, Wei Zhu, Huan Wang, Mingwu Ren

Nanjing University of Science and Technology


Fig. 1 Framework of the proposed RSS model

Abstract

In this paper, we address the problem of detecting small targets in a color image from the perspectives of both stability and saliency. First, we treat small target detection as a stable region extraction problem. Several stability criteria are applied to generate a stability map, which consists of a set of locally stable regions derived from sequential boolean maps. Second, considering the local contrast between a small target and its surroundings, we obtain a saliency map by comparing the color vector of each pixel with its Gaussian-blurred version. Finally, the stability and saliency maps are integrated by pixel-wise multiplication to remove false alarms. In addition, we introduce a set of integration models that combine several existing stability and saliency methods, and use them to demonstrate the validity of the proposed framework. Experimental results show that our model adapts to target size variations and performs favorably in terms of precision, recall, and F-measure on three challenging datasets.
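The saliency and integration steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: the Euclidean color distance, the blur parameters (`sigma`, `radius`), the border handling, the toy 7×7 image, and the hand-made `stability` map are all assumptions chosen for illustration, and the stability-map construction from boolean maps is not reproduced here.

```python
import math

def gaussian_kernel(sigma, radius):
    """1-D Gaussian weights normalized to sum to 1."""
    w = [math.exp(-(i * i) / (2.0 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]

def blur_channel(channel, sigma=1.0, radius=2):
    """Separable Gaussian blur of a 2-D list; border pixels are replicated."""
    k = gaussian_kernel(sigma, radius)
    h, w = len(channel), len(channel[0])

    def clamp(v, hi):
        return max(0, min(hi, v))

    # Horizontal pass, then vertical pass.
    tmp = [[sum(k[j + radius] * channel[y][clamp(x + j, w - 1)]
                for j in range(-radius, radius + 1))
            for x in range(w)] for y in range(h)]
    return [[sum(k[j + radius] * tmp[clamp(y + j, h - 1)][x]
                 for j in range(-radius, radius + 1))
             for x in range(w)] for y in range(h)]

def saliency_map(image, sigma=1.0, radius=2):
    """Distance between each pixel's color vector and its blurred version.

    `image` is a list of channels, each a 2-D list of floats.
    The Euclidean distance is an assumption of this sketch.
    """
    blurred = [blur_channel(c, sigma, radius) for c in image]
    h, w = len(image[0]), len(image[0][0])
    return [[math.sqrt(sum((c[y][x] - b[y][x]) ** 2
                           for c, b in zip(image, blurred)))
             for x in range(w)] for y in range(h)]

def fuse(stability, saliency):
    """Pixel-wise multiplication of the two maps."""
    return [[s * t for s, t in zip(row_s, row_t)]
            for row_s, row_t in zip(stability, saliency)]

# Toy input: uniform background with one bright pixel at (3, 3).
size = 7
image = [[[0.2] * size for _ in range(size)] for _ in range(3)]
for ch, value in enumerate((1.0, 0.9, 0.8)):
    image[ch][3][3] = value

# Hypothetical stability map: the true target plus one false alarm at (0, 0).
stability = [[0] * size for _ in range(size)]
stability[3][3] = 1
stability[0][0] = 1

saliency = saliency_map(image)
fused = fuse(stability, saliency)
```

In this toy case the false alarm at (0, 0) lies in a uniform area, so its saliency is essentially zero and the multiplication suppresses it, while the response at (3, 3) survives.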

Paper

Jing Lou, Wei Zhu, Huan Wang, Mingwu Ren*, “Small Target Detection Combining Regional Stability and Saliency in a Color Image,” Multimedia Tools and Applications, vol. 76, no. 13, pp. 14781–14798, 2017. doi:10.1007/s11042-016-4025-7
PDF Bib MATLAB Code Slides (in Chinese)

For MSER [18], we extract covariant regions directly with the VL_MSER function of the VLFeat open-source library [21]. The paper above discusses the main parameters of VL_MSER and presents the statistical metrics and results. The MATLAB code can be downloaded from GitHub or Baidu Cloud.

We provide a benchmark database for evaluating small target detection. The database contains three datasets (1,093 color images in total) and provides a pixel-level ground-truth mask for each image. Some statistics and features of these datasets are summarized in the following table. We also provide MATLAB code for evaluating detection results on this database. The code and data can be downloaded from GitHub or Baidu Cloud.

No.	Background	Images	Features
1	Sky	805	Frames #001~#752: Single target; Frames #753~#805: No target
2	Sea-Sky	208	Single target
3^a	Ground	80	Single target
^a Dataset 3 is available at http://people.ee.ethz.ch/~dragonr/943/ [8], in which each image is shrunk to 20% of the original image size for the purpose of demonstration in our experiments
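As an illustration of how a detection result might be scored against a pixel-level ground-truth mask, the sketch below computes precision, recall, and F-measure for one binary mask. It is not the provided MATLAB evaluation code. The function name `evaluate_mask`, the weighting parameter `beta2` and its default, and the choice to return 0 when a denominator is empty (as would happen on the no-target frames of Dataset 1) are assumptions of this sketch.

```python
def evaluate_mask(pred, gt, beta2=1.0):
    """Pixel-level precision, recall, and F-measure for binary masks.

    `pred` and `gt` are 2-D lists of 0/1 values with the same shape.
    `beta2` weights precision against recall; its default is an assumption.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                tp += 1          # detected target pixel
            elif p and not g:
                fp += 1          # false alarm
            elif g and not p:
                fn += 1          # missed target pixel
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta2 * precision + recall
    f_measure = (1 + beta2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure

# Toy masks: a 2x2 target; the prediction hits 3 pixels and adds 1 false alarm.
gt = [[0, 0, 0, 0],
      [0, 1, 1, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 0]]
pred = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 0, 0]]

precision, recall, f_measure = evaluate_mask(pred, gt)
```

Here three of the four predicted pixels are correct and three of the four target pixels are found, so precision and recall are both 0.75.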

Demos

Although each demo is shown as a video, every video file is assembled from sequential frames, and each frame is obtained by applying the proposed RSS model to that image independently.

(upper left) – input; (upper right) – result of the proposed RSS model; (lower left) – stability map; (lower right) – saliency map.
Demo 1 (805 frames) Download
Demo 2 (208 frames) Download
Demo 3 (80 frames) Download

Acknowledgments

The authors thank all of the anonymous reviewers for their insights and suggestions, which were very helpful in improving this manuscript. They thank Haiyang Zhang for useful discussions. They also thank Mei Zhang and Huaiping Zhang for their kind proofreading of the manuscript. This work is supported by the National Natural Science Foundation of China under Grant 61231014.

References

Achanta R, Hemami S, Estrada F, Süsstrunk S (2009) Frequency-tuned salient region detection. In: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 1597–1604

Borji A, Cheng M-M, Jiang H, Li J (2015) Salient object detection: a benchmark. IEEE Trans Image Process 24(12): 5706–5722

8. Dragon R, Ostermann J, Van Gool L (2013) Robust realtime motion-split-and-merge for motion segmentation. In: Proc. Ger. Conf. Pattern Recognit., 425–434

18. Matas J, Chum O, Urban M, Pajdla T (2004) Robust wide-baseline stereo from maximally stable extremal regions. Image Vis Comput 22(10): 761–767

19. Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9(1): 62–66

21. Vedaldi A, Fulkerson B (2008) VLFeat: An open and portable library of computer vision algorithms, version 0.9.19. http://www.vlfeat.org

Latest update: Jun 8, 2017

Jing Lou (楼竞)

School of Information Engineering, Changzhou Vocational Institute of Mechatronic Technology, Changzhou 213164, Jiangsu, China
