Saturday, August 31, 2024

Moving Target Shadow Detection Method Based on Improved ViBe in VideoSAR Images

 


Z. Wu, H. Xie, T. Gao, Y. Zhang and H. Liu, "Moving Target Shadow Detection Method Based on Improved ViBe in VideoSAR Images," in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 14575-14587, 2024, doi: 10.1109/JSTARS.2024.3443080.

Abstract: Video synthetic aperture radar (VideoSAR) offers high frame-rate imaging, which makes ground moving target detection possible. A moving target usually leaves a shadow in the VideoSAR image sequence, so the target can be detected indirectly by detecting its shadow.

To address the high false alarm and missed detection rates of existing algorithms, this article proposes a moving target shadow detection method based on an improved ViBe algorithm for VideoSAR images. First, the shadow information and local contrast information of the VideoSAR image are extracted using super-pixel segmentation to reconstruct the background. Next, the ViBe algorithm is improved by redefining the distance used in image pixel classification, adding a pixel screening step, and applying morphological filtering, making it better suited to moving target shadow detection in VideoSAR images. Finally, regions of interest are extracted jointly from the detection results of multiple frames, suppressing most false alarms and yielding excellent shadow detection performance.

Experiments on VideoSAR data released by Sandia National Laboratories validate the correctness and effectiveness of the proposed algorithm, which achieves a moving target shadow detection rate of 97.66% with only 8 false alarms.

keywords: Image reconstruction; Image segmentation; Filtering; Radar polarimetry; Object detection; Remote sensing; Radar imaging; Background reconstruction; improved ViBe; moving target shadow detection; video synthetic aperture radar (VideoSAR) images

URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10636022&isnumber=10330207

Authors

Based on the information provided in the paper, here are the details about the authors, their institutional associations, and some related prior work:

Authors and Institutional Associations:

1. Zhitao Wu
   - Affiliation: School of Electronics and Communication Engineering, Shenzhen Campus of Sun Yat-Sen University, Shenzhen, China
   - Currently working towards an M.S. degree in information and communication engineering

2. Hongtu Xie (Member, IEEE)
   - Affiliation: School of Electronics and Communication Engineering, Sun Yat-Sen University (SYSU), Guangzhou, China
   - Position: Associate Professor and M.S. Supervisor
   - Education: Ph.D. in information and communication engineering from the National University of Defense Technology, Changsha, China

3. Ting Gao
   - Affiliation: Air Force Early Warning Academy, Wuhan, China
   - Position: Lecturer
   - Education: Ph.D. in image processing from the Air Force Early Warning Academy

4. Yuanjie Zhang
   - Affiliation: School of Electronics and Communication Engineering, Sun Yat-Sen University, Guangzhou, China
   - Currently working towards an M.S. degree in information and communication engineering

5. Haozong Liu
   - Affiliation: Sun Yat-Sen University, Shenzhen, China
   - Currently working towards an M.S. degree in electronic information

Related Prior Work:

1. Hongtu Xie has been involved in several related studies:
   - Work on information extraction and three-dimensional contour reconstruction of vehicle targets using circular synthetic aperture radar data
   - Research on two-level feature-fusion ship recognition strategies combining HOG features with dual-polarized SAR image data
   - Improvements to the NLCS algorithm for oceanic scene imaging using geosynchronous spaceborne-airborne UHF UWB bistatic SAR

2. Ting Gao has experience in intelligent target recognition and has authored or co-authored more than 20 professional papers and 3 monographs.

3. The authors cite several related works in the field of moving target shadow detection in VideoSAR images, including:
   - Research by Raynal et al. from Sandia National Laboratories on shadow characteristics in synthetic aperture radar
   - Work by Tian et al. on simultaneous detection and tracking of moving-target shadows in ViSAR imagery
   - Studies by Ding, Zhang, and others on using deep learning approaches for moving target detection in VideoSAR

The paper builds upon these prior works and introduces improvements to the ViBe algorithm specifically for moving target shadow detection in VideoSAR imagery.

Summary

Here is a summary of the key points from the research article:

1. The paper proposes an improved method for detecting moving target shadows in VideoSAR (video synthetic aperture radar) imagery using an enhanced version of the ViBe (Visual Background Extractor) algorithm.

2. Key steps in the method include:

a. Background reconstruction using super-pixel segmentation to extract shadow and local contrast information

b. Removal of stationary target shadows

c. Filling of moving target shadow pixels

d. Initialization of the ViBe background model using the reconstructed background

e. Improvements to the original ViBe algorithm, including modified distance calculations and additional filtering steps

f. False alarm suppression using multi-frame joint detection results

3. Experiments were conducted on 100 frames of VideoSAR data from Sandia National Laboratories.

4. Results showed the improved method achieved:

a. 97.66% detection rate for moving target shadows

b. Only 8 false alarms across 100 frames

c. Better performance compared to the original ViBe algorithm and background subtraction methods

5. The authors conclude their proposed method is effective for detecting moving target shadows in VideoSAR imagery, with significant reductions in missed detections and false alarms compared to other approaches.

The paper presents a novel technique combining super-pixel segmentation, background modeling, and multi-frame analysis to improve moving target detection in radar imagery. 

Figures and Tables

Here's a list of the figures and tables mentioned in the article, along with explanations of what they are intended to show:

Figures:

1: Flowchart of the background reconstruction - Shows the overall process of background reconstruction

2: Binarization and morphological filtering - Demonstrates the steps of image binarization and morphological filtering on a SAR image

3: Super-pixel segmentation - Illustrates the result of super-pixel segmentation on a SAR image

4: Schematic of the shadow detection window - Explains the structure of the shadow detection window used in the algorithm

5: Flowchart for filtering ROIs based on the shadow information - Outlines the process of filtering Regions of Interest (ROIs) using shadow information

6: Stationary and moving targets and their shadows - Compares the appearance of stationary and moving targets and their shadows in SAR images

7: Traditional local contrast detection window - Shows the structure of a traditional local contrast detection window

8: Local contrast detection window based on the super-pixel - Illustrates the improved local contrast detection window using super-pixel segmentation

9: Flowchart for filtering ROIs using the obtained local contrast information - Outlines the process of filtering ROIs based on local contrast information

10: Schematic of shadow filling - Demonstrates the process of filling in shadow regions

11: Flowchart of the improved ViBe algorithm - Shows the overall process of the improved ViBe algorithm

12: Diagram of the background modeling in the improved ViBe algorithm - Illustrates the background modeling process in the improved ViBe algorithm

13: Schematic of the pixel classification process - Explains the pixel classification process in the ViBe algorithm

14: Schematic of the pixel classification using two distances - Compares the pixel classification process using different distance calculations

15: Changes in the intensity of the imaging scene - Shows how the intensity of the SAR imaging scene can change over time

16: Suppressing false alarms outside the ROIs - Demonstrates the process of suppressing false alarms outside the Regions of Interest

17: Result of the shadow detection - Shows the results of each step in the shadow detection process

18: Result after removing the shadows of stationary targets - Illustrates the process and results of removing stationary target shadows

19: Background reconstruction - Compares the original image with the reconstructed background image

20: Detection results of moving target shadows - Compares the detection results of different methods, including the improved ViBe algorithm

Tables:

I: Table of Statistics for Moving Target Shadow Detection Results - Provides a comparison of detection rates and false alarm numbers for different methods across 100 frames of SAR images

These figures and tables are designed to illustrate the various steps of the proposed algorithm, demonstrate intermediate results, and compare the performance of the improved ViBe algorithm with other methods.

Improved ViBe Algorithm

Figure 11 in the document outlines the flowchart of the improved ViBe (Visual Background Extractor) algorithm for detecting moving target shadows in VideoSAR images. The algorithm consists of six main steps:

1. Background Model Initialization:
   - The algorithm expands the neighborhood from 3x3 to 5x5 pixels.
   - It performs 20 non-repetitive samplings within this 5x5 neighborhood to initialize the background model.
   - This approach improves robustness and reduces the selection of duplicate pixels.

2. Image Pixel Classification:
   - The algorithm classifies each pixel as either foreground or background.
   - It redefines the distance calculation between the current pixel and background model samples to better handle the characteristics of shadow detection.
   - The new distance formula is: d_i(x) = v_i - v(x), where v_i is the sample value and v(x) is the current pixel value.
   - This modification helps to better classify pixels when grayscale values change from low to high; a toy numeric example is given below, after this step list.

3. Pixel Filtering:
   - A background threshold T is introduced to filter out high grayscale value pixels that are unlikely to be shadows.
   - Pixels with values higher than T are reclassified as background, while those lower than T remain as originally classified.

4. Morphological Filtering:
   - This step applies morphological operations and connected component analysis to remove speckle noise, road edges, and other non-shadow areas.

5. Background Update:
   - The algorithm uses a conservative updating strategy.
   - Only pixels classified as background are included in the background model set.
   - This helps the algorithm adapt to gradual changes in the scene over time.

6. False Alarm Suppression:
   - This step utilizes information from multiple frames to reduce false alarms.
   - It sums up detection results from all frames to identify consistent moving targets.
   - Real moving targets typically show regular motion trajectories, while false targets appear more chaotic across frames.
   - Morphological filtering and connected component analysis are applied to extract Regions of Interest (ROIs).
   - False alarms outside these ROIs are suppressed.
   - Additionally, shadow detection methods based on super-pixel segmentation are used to extract shadow information for each frame, which helps suppress false alarms within the ROIs.

This improved ViBe algorithm is specifically tailored for detecting moving target shadows in VideoSAR images, addressing limitations of the original ViBe algorithm and incorporating domain-specific knowledge about shadow characteristics in SAR imagery.
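To make step 2 concrete, here is a toy numeric check in Python (the sample values, R, and Min below are illustrative, not taken from the paper) of how the signed distance changes the decision for a pixel whose grayscale jumps from low to high:

```python
import numpy as np

# Toy illustration of the modified distance in step 2 (values are assumed).
samples = np.array([40, 42, 38, 41, 45])   # dark background samples in M(x)
v_x = 200                                   # current pixel jumped from dark to bright
R, Min = 20, 2

# Original ViBe distance |v_i - v(x)|: every sample looks "far", so the pixel
# would be declared foreground even though a brightening pixel cannot be a shadow.
matches_abs = np.sum(np.abs(samples - v_x) < R)

# Modified signed distance v_i - v(x): all samples satisfy v_i - v(x) <= R,
# so the pixel is pushed towards the background class, as intended for shadows.
matches_signed = np.sum((samples - v_x) <= R)

print(matches_abs >= Min, matches_signed >= Min)   # False, True
```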

Article

Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing ( Volume: 17)

Page(s): 14575 - 14587
Date of Publication: 13 August 2024


Publisher: IEEE


SECTION I.

Introduction

Synthetic aperture radar (SAR) is a kind of radar that can operate day and night and in all weather conditions [1], [2], [3], [4], [5], [6]. Ground moving target indication (GMTI) based on SAR images has unique advantages, and the detection of moving targets is a research hotspot in the field of SAR-GMTI.

Traditional moving target detection is mainly performed in the signal domain, and the commonly used methods can be categorized into single-channel and multichannel methods. Multichannel moving target detection methods mainly include space-time adaptive processing [7], the displaced phase center antenna [8], and along-track interferometry [9], which can detect slow-moving targets buried in stationary clutter; however, multichannel radar systems are more costly and suffer from problems such as channel mismatch. The single-channel approach is mainly Doppler filtering, which suppresses clutter through the filtering process but struggles to detect moving targets submerged in the clutter.

In recent years, indirect GMTI techniques based on ground moving target shadow detection have gradually attracted researchers' attention. Because moving targets block the electromagnetic wave, they leave shadows in SAR images [10]. These shadows reflect the true positions of moving targets, enabling indirect detection of moving targets through shadow detection [11]. Raynal et al. [10] from Sandia National Laboratories have elaborated on the generation principles and characteristics of shadows in video synthetic aperture radar (VideoSAR) images. The energy of shadows in SAR images is relatively weak, and detecting them directly would introduce a large number of false alarms. In VideoSAR image shadow detection, the most commonly used method is background subtraction [12], [13], [14]: the background image is first estimated from multiple frames, and the difference between the background image and each frame is then computed to obtain shadow detection results. Tian et al. [15] used a track-before-detect method for moving target shadow detection in VideoSAR images, effectively improving the detection rate of moving target shadows and achieving good false alarm suppression. Zhong et al. [16] combined moving target shadow information with echo energy information to detect moving targets, obtaining good detection performance. Liu et al. [17] initialized the background model of the ViBe algorithm with a reconstructed background, effectively solving the "ghost" phenomenon that arises when detecting moving target shadows with the ViBe algorithm. Deep learning for moving target shadow detection is also a popular direction [18], [19], [20], [21], [22], [23]: Ding, Zhang, and others realized moving target shadow detection in VideoSAR images using deep convolutional neural networks [18], [19], [20], [21].

Currently, moving target shadow detection methods for VideoSAR images still have some shortcomings. First, when there are many moving target shadows in the scene, the estimated background image is inaccurate; using an inaccurate background image for shadow detection leads to many missed detections and false alarms, and ultimately to suboptimal detection performance. In addition, detection methods based on individual pixels ignore the structural information of the moving target shadow. Although moving target shadow detection methods based on deep convolutional neural networks perform excellently, they require a large amount of VideoSAR sample data for training and suffer from poor generalization. To address these issues, this article proposes a moving target shadow detection algorithm based on an improved ViBe algorithm for VideoSAR images. The method first uses super-pixel segmentation to extract the shadow information and local contrast information of SAR images and reconstruct the background. Then, the reconstructed background image is used to initialize the background model of the improved ViBe algorithm. Finally, moving target shadows in VideoSAR sequences are detected, and false alarms are suppressed using the proposed false alarm suppression method.

The rest of this article is organized as follows. Section II introduces the methods of the background reconstruction, including the shadow detection, removal of stationary target shadows, and shadow pixels filling. Section III presents the processing flow and main ideas of the proposed moving target shadow detection method based on the improved ViBe algorithm in VideoSAR images. Section IV provides the experimental results of the moving target shadow detection in VideoSAR images. Finally, Section V concludes this article.

SECTION II.

Background Reconstruction

The ViBe algorithm can initialize its model from a single frame, offers outstanding moving target detection performance, and uses a rapid background model updating strategy. However, if there are moving targets in the initial frame used for background modeling, the "ghost" phenomenon may appear in the subsequent detection results and persist for several or even dozens of frames. To mitigate the "ghost" phenomenon, it is necessary to reconstruct the background of the initial frame and remove the shadows of moving targets [17]. The flowchart of the background reconstruction is illustrated in Fig. 1, which primarily consists of three steps, i.e., shadow detection, removal of stationary target shadows, and shadow pixel filling.

Fig. 1. Flowchart of the background reconstruction.

A. Shadow Detection

Image binarization is the most convenient method to obtain the region of interest (ROI). In order to obtain the shadow information in SAR images, binarization algorithms are first used to extract the darker regions in the images. Among binarization algorithms, the maximum between-class variance algorithm (OTSU algorithm) is widely used due to its high efficiency and good automation. Its main principle is to divide the pixels in the image into foreground and background using a grayscale threshold, and the selected threshold needs to ensure the maximum variance between the two classes of pixels. After obtaining the binary image, there will be a large number of false alarms in the image. Some of these false alarm regions are clearly not moving target shadows based on features such as the area, aspect ratio, and rectangularity. By using the morphological processing and connected component analysis, these regions can be effectively removed. Binarization and morphological filtering of SAR image are shown in Fig. 2.
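As a concrete illustration of this step, the sketch below (not the authors' code) uses OpenCV's OTSU threshold, a morphological opening, and a connected-component screen on area, aspect ratio, and rectangularity; all threshold values are assumptions chosen only to show the flow.

```python
import cv2
import numpy as np

def candidate_shadow_mask(sar_u8,
                          min_area=30, max_area=2000,
                          max_aspect=6.0, min_rectangularity=0.4):
    """Extract dark candidate regions from an 8-bit SAR frame (illustrative thresholds)."""
    # OTSU selects the grayscale threshold that maximizes the between-class variance;
    # THRESH_BINARY_INV keeps the dark (shadow-like) side as foreground.
    _, binary = cv2.threshold(sar_u8, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Morphological opening removes isolated speckle responses.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    cleaned = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)

    # Connected-component screening on area, aspect ratio, and rectangularity.
    num, labels, stats, _ = cv2.connectedComponentsWithStats(cleaned, connectivity=8)
    mask = np.zeros_like(cleaned)
    for i in range(1, num):                      # label 0 is the image background
        x, y, w, h, area = stats[i]
        aspect = max(w, h) / max(1, min(w, h))
        rectangularity = area / float(w * h)     # fill ratio of the bounding box
        if (min_area <= area <= max_area and aspect <= max_aspect
                and rectangularity >= min_rectangularity):
            mask[labels == i] = 255
    return mask
```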

Fig. 2. Binarization and morphological filtering. (a) Original SAR image. (b) After binarization. (c) After morphological filtering.

After the binarization and morphological filtering, the ROIs with the darker appearance can be obtained. However, these darker regions may not necessarily be shadows. They could also be surfaces with low radar scattering coefficients, such as the road. Genuine shadow regions tend to be darker than their surroundings, whereas false shadow regions (such as the road with low radar scattering coefficients) exhibit the minimal difference in intensity compared to their surroundings. Exploiting this characteristic, a shadow detection method based on super-pixel segmentation can be employed to distinguish them.

Super-pixel segmentation is an image preprocessing technique that aggregates adjacent pixels with similar colors and textures into pixel blocks [24], [25], [26], [27]. Simple linear iterative clustering is a commonly used super-pixel segmentation algorithm, which not only produces excellent super-pixel segmentation results but also has the low computational complexity. If we represent a pixel using the three-dimensional coordinates [x,y,I], composed of pixel positions and pixel intensities, then the distance Dij between any two pixels can be expressed as

$$D_{ij}=\sqrt{d_c^2+\frac{W^2}{S^2}\,d_s^2}=\sqrt{(I_i-I_j)^2+\frac{W^2}{S^2}\left[(x_i-x_j)^2+(y_i-y_j)^2\right]}\tag{1}$$

where $d_c$ represents the intensity space distance, $d_s$ represents the spatial position distance, $S$ represents the expected size of the super-pixel, and $W$ represents the maximum distance of the pixel intensity. When $W$ is small, it indicates that the intensity space distance is more important, and the super-pixel blocks can better preserve the edge structure of the image. When $W$ is large, it suggests that the spatial position distance is more important, and the generated super-pixels are more regular. Super-pixel segmentation is performed on the original image marked with ROIs, as shown in Fig. 3.
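In practice this segmentation can be reproduced with an off-the-shelf SLIC implementation; the sketch below uses scikit-image's `slic`, whose `compactness` parameter plays the same role as the intensity/space weighting in (1). The parameter values are illustrative, not the paper's settings.

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_labels(sar_float, n_segments=600, compactness=0.1):
    """SLIC super-pixel segmentation of a grayscale SAR frame (illustrative settings).

    `compactness` acts like the intensity/space trade-off in (1): a small value
    favours the intensity term and preserves shadow edges, a large value yields
    more regular, grid-like super-pixels.
    """
    img = sar_float.astype(np.float64)
    # channel_axis=None marks the image as single channel (scikit-image >= 0.19).
    return slic(img, n_segments=n_segments, compactness=compactness,
                channel_axis=None, start_label=0)
```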

Fig. 3. Super-pixel segmentation.

After performing the super-pixel segmentation, it is necessary to construct a shadow detection window for each super-pixel block, as illustrated in Fig. 4. The shadow detection window consists of a protection window and a reference window. The purpose of the protection window is to avoid including pixels from shadow areas in the reference window. The reference window is used to calculate the average intensity of background clutter, and then compare the average intensity of the reference window with the average intensity of the target super-pixel s. When the ratio of the average intensity of the target super-pixel s to the average intensity of the reference window is less than a preset threshold, the target super-pixel s is determined to be a shadow region. Conversely, if this ratio is greater than the threshold, the target super-pixel s is determined to be a background region. The constructed detector can be expressed as

$$\begin{cases}H_0:\ \dfrac{m_0}{m_1}>\chi\\[4pt] H_1:\ \dfrac{m_0}{m_1}<\chi\end{cases}\tag{2}$$

where $H_0$ assumes the target super-pixel $s$ is a background region, $H_1$ assumes the target super-pixel $s$ is a shadow region, and $\chi$ represents the preset threshold. $m_0$ and $m_1$ represent the average intensity of the target super-pixel and the reference window, respectively. Fig. 4 illustrates the case where both the protection window and the reference window are two-level neighborhoods.
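A simplified version of the detector in (2) is sketched below. Instead of exact two-level neighborhoods, the protection and reference windows are approximated by successive dilations of the super-pixel mask; the threshold `chi` and the dilation counts are assumed values.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def is_shadow_superpixel(img, labels, sp_id, chi=0.6, guard_iter=3, ref_iter=6):
    """Decide between H0 and H1 of (2) for one super-pixel (simplified geometry)."""
    sp_mask = labels == sp_id
    # Protection window: pixels close to the super-pixel are excluded from the reference.
    guard = binary_dilation(sp_mask, iterations=guard_iter)
    # Reference window: a ring of surrounding clutter beyond the protection window.
    ref = binary_dilation(sp_mask, iterations=ref_iter) & ~guard

    m0 = img[sp_mask].mean()           # average intensity of the target super-pixel
    m1 = img[ref].mean()               # average intensity of the reference window
    return (m0 / (m1 + 1e-6)) < chi    # True -> H1: shadow region
```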

Fig. 4. Schematic of the shadow detection window.

Using the shadow detection method based on the super-pixel segmentation, the shadow information for the entire image can be obtained. Next, this shadow information is used to filter the ROIs. If a region does not contain any shadow pixels, then it is reasonable to assume that this region is likely not a shadow region. Based on this premise, nonshadow regions are excluded from the ROIs. The specific filtering method involves determining whether the ROI contains shadow pixels. If it does, the current region is retained. Otherwise, it is removed. The process of filtering the ROIs using the shadow information is illustrated in Fig. 5. After using the shadow information for filtering, nonshadow dark regions can be effectively removed.

Fig. 5. Flowchart for filtering ROIs based on the shadow information.

B. Removal of Stationary Target Shadows

In the ROIs obtained through binarization, morphological filtering, and shadow information-based filtering, shadows from both moving and stationary targets are included. When filling shadows for the background reconstruction, it is necessary to fill only the shadows of moving targets, not those of stationary targets. Therefore, it is essential to exclude the shadows of stationary targets and retain only the shadows of moving targets. Local contrast is an effective feature for distinguishing between stationary and moving target shadows [17]. As shown in Fig. 6, stationary targets do not deviate from their true positions in SAR images. Thus, the shadow neighborhoods of stationary targets contain imaging results of the stationary target itself, exhibiting the higher local contrast. On the other hand, moving targets deviate from their original positions due to their motion in SAR images, but their shadows remain near the true positions of the moving targets. Therefore, the shadow neighborhoods of moving targets do not contain imaging results of the moving target itself, resulting in the lower local contrast. By leveraging the difference in local contrast features between stationary and moving target shadows, it is possible to effectively distinguish the shadow regions between stationary and moving targets.

Fig. 6. Stationary and moving targets and their shadows. (a) Stationary target and its shadow. (b) Moving target and its shadow.

To calculate the local contrast, it is necessary to construct a local contrast detection window. As shown in Fig. 7, the traditional local contrast detection window uses the minimum bounding rectangle of the ROI as the central region of the detection window. Then, it constructs a complete detection window by building an eight-neighborhood of the same size as the central region. This type of the detection window ignores the edge structures of shadows and targets. The central region not only contains shadow pixels but also background clutter pixels. Similarly, for the neighboring regions containing targets, besides containing target pixels, they also include background clutter pixels. Moreover, situations may arise where a target spans two or even multiple neighborhoods. These issues can significantly impact the calculation of local contrast. To address this, this article proposes a redesigned local contrast detection window based on super-pixel segmentation. Super-pixel segmentation effectively preserves the edge structure information of shadows and targets in the image. It performs the super-pixel segmentation on the original image marked with ROIs. Then, it takes the super-pixel containing the ROI as the central region of the detection window, and directly adjacent regions to the central region as the neighborhood to complete the construction of the local contrast detection window. The local contrast detection window is illustrated in Fig. 8. By leveraging the ability of super-pixel segmentation to preserve the edge structure information in the image, the central region and neighboring regions including targets contain as few background clutter pixels as possible, resulting in the more accurate local contrast results. The formula for calculating local contrast C is given by

$$C=\frac{\max_i(G_{ni})}{G_c},\quad i=1,2,\ldots,k\tag{3}$$

where $G_c$ represents the average pixel intensity of the central region, $G_{ni}$ represents the average pixel intensity of the $i$th neighborhood, and $k$ denotes the total number of neighborhoods. Since targets may appear in various orientations within shadows, when calculating the local contrast, it is advisable to select the neighborhood with the highest average pixel intensity for computation.
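A minimal sketch of evaluating (3) on a super-pixel label map is given below; the neighborhoods are found with a one-pixel dilation of the central super-pixel, which is a simplification of the window construction in Fig. 8 rather than the authors' exact procedure.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def local_contrast(img, labels, sp_id, eps=1e-6):
    """Local contrast C of (3) for the super-pixel containing an ROI."""
    center = labels == sp_id
    # Super-pixels touching the central one are treated as its neighborhoods.
    ring = binary_dilation(center, iterations=1) & ~center
    neighbour_ids = np.unique(labels[ring])

    g_c = img[center].mean()                                    # central region G_c
    g_n = [img[labels == nid].mean() for nid in neighbour_ids]  # neighborhood G_ni
    # Use the brightest neighborhood, since the target may lie in any direction.
    return max(g_n) / (g_c + eps)
```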

Fig. 7. Traditional local contrast detection window.

Fig. 8. Local contrast detection window based on the super-pixel.

Computing the local contrast for all super-pixels in the image and binarizing them based on a preset threshold can obtain the local contrast information of the image. The acquired local contrast information is utilized to filter the ROIs. If a region does not contain pixels with the high local contrast, it is reasonable to assume that this region likely represents the shadow of a moving target. Based on this assumption, it can filter out the shadows of stationary targets from the ROIs, and then retain only the shadows of moving targets. The specific filtering method involves determining whether a region contains pixels with the high local contrast. If a region does not contain pixels with the high local contrast, the current region is retained. Otherwise, it is removed. The process of filtering ROIs based on the local contrast information is illustrated in Fig. 9. Utilizing local contrast information for filtering can effectively remove shadows of stationary targets.

Fig. 9. Flowchart for filtering ROIs using the obtained local contrast information.

C. Shadow Pixels Filling

After the shadow detection and removal of stationary shadow, the shadow regions of moving targets can be obtained. Next, it is necessary to fill in the shadow regions of the moving targets to complete the background reconstruction. According to the Markov random field theory, the background of shadow regions can be reconstructed using neighboring pixels of the shadow. The specific shadow filling method involves the random sampling of the shadow neighborhood, followed by filling the sampled neighborhood pixel values into the shadow region to complete the background reconstruction. The schematic of the shadow filling is shown in Fig. 10.
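A minimal reading of this filling step is sketched below: the shadow neighborhood is approximated by a dilation ring around the shadow mask, and each shadow pixel is replaced by a value drawn at random from that ring. The ring width is an assumption.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def fill_shadow(img, shadow_mask, ring_width=5, rng=None):
    """Fill moving-target shadow pixels with randomly sampled neighbouring clutter."""
    rng = np.random.default_rng() if rng is None else rng
    # Neighbourhood ring of background clutter around the shadow region.
    ring = binary_dilation(shadow_mask, iterations=ring_width) & ~shadow_mask
    neighbourhood = img[ring]

    filled = img.copy()
    # Each shadow pixel receives an independently sampled neighbourhood value.
    filled[shadow_mask] = rng.choice(neighbourhood, size=int(shadow_mask.sum()))
    return filled
```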

Fig. 10. Schematic of shadow filling.

SECTION III.

Improved ViBe Algorithm

The ViBe algorithm is a nonparametric background modeling method distinguished by its rapid response, robust performance, and high detection accuracy. It fits the probability distribution of the background at each point directly based on the frequency of appearance of the background sampling points, without being limited to any specific distribution or parameters. After the background reconstruction, there are no longer any moving target shadows in the scene. Thus, a background model can be constructed based on the reconstructed background image, and the ViBe algorithm can be used to continue the shadow detection of moving targets in VideoSAR images. The original ViBe algorithm was designed for the optical videos for the moving target detection and did not utilize the prior information that the detected targets in VideoSAR images are shadows. Therefore, this article improves the ViBe algorithm to make it more suitable for detecting moving target shadows in VideoSAR images. The improved ViBe algorithm can be summarized into six steps, i.e., the background model initialization, image pixel classification, pixel filtering, morphological filtering, background update, and false alarm suppression, as shown in Fig. 11.

Fig. 11. Flowchart of the improved ViBe algorithm.

A. Background Model Initialization

The original ViBe algorithm repeats a random strategy 20 times within a 3×3 neighborhood, randomly selecting one pixel each time. A total of 20 pixels are chosen as the background sample set for the central pixel, thus completing the initialization of the background model. This characteristic grants the ViBe algorithm strong robustness, enabling it to effectively address noise interference. However, selecting 20 pixels randomly within each eight-neighborhood inevitably results in the selection of duplicate pixels [28]. To address this issue, this article expands the neighborhood and performs 20 nonrepetitive samplings within a 5×5 neighborhood to complete the initialization of the background model, as illustrated in Fig. 12. The background model sample set $M(x)$ for pixel $x$ can be represented as

$$M(x)=\{v_1,v_2,v_3,\ldots,v_{20}\}\tag{4}$$

where $v_i$ represents the pixel value sampled at the $i$th time in the neighborhood of pixel $x$.
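The initialization in (4) can be sketched as follows; the reflective border padding and the plain per-pixel loop are implementation choices made for clarity, not the authors' code.

```python
import numpy as np

def init_background_model(bg_img, n_samples=20, rng=None):
    """Build the sample set M(x) of (4): 20 distinct samples drawn without
    replacement from the 5x5 neighbourhood of every pixel."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = bg_img.shape
    padded = np.pad(bg_img, 2, mode='reflect')   # handle image borders

    model = np.empty((h, w, n_samples), dtype=bg_img.dtype)
    for y in range(h):
        for x in range(w):
            window = padded[y:y + 5, x:x + 5].ravel()   # 25 candidate pixels
            # replace=False gives the non-repetitive sampling of the improved model.
            model[y, x] = rng.choice(window, size=n_samples, replace=False)
    return model
```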

Fig. 12. Diagram of the background modeling in the improved ViBe algorithm.

B. Image Pixel Classification

After completing the initialization of the background model, the next step is to classify the image pixels, i.e., to determine whether each pixel belongs to the foreground or the background. The classification process is shown in Fig. 13. The original ViBe algorithm first calculates the Euclidean distance in color space between each pixel $x$ to be classified and the pixels in the background model sample set $M(x)$. The distance $d_i(x)$ is given by

$$d_i(x)=\sqrt{(v(x)-v_i)^2}=|v(x)-v_i|,\quad i=1,2,3,\ldots,20\tag{5}$$

where $v(x)$ denotes the gray value of the pixel point $x$. Define the distance threshold $R$. If the number of samples in the background model sample set $M(x)$ whose distance from the current pixel point $x$ is less than $R$ is itself less than the threshold $Min$, the pixel point $x$ is adjudicated as the foreground. Conversely, the pixel point is adjudicated as the background. The formula is expressed as

$$x=\begin{cases}\text{foreground}, & \#\{[v(x)-R,\ v(x)+R]\cap M(x)\}<Min\\ \text{background}, & \#\{[v(x)-R,\ v(x)+R]\cap M(x)\}\geq Min.\end{cases}\tag{6}$$

Fig. 13. Schematic of the pixel classification process.

It is known that the target to be detected is the shadow of a moving target, and the grayscale value of the shadow region is lower than that of the background. When the shadow region moves to pixel $x$, the grayscale value of the pixel $x$ should change from high to low. If the distance defined by (5) is used, then when a nonshadow region with a high grayscale value moves to pixel $x$, the grayscale value of the pixel $x$ changes from low to high, resulting in $|v_i-v(x)|>R$, which tends to classify the current pixel $x$ as the foreground. However, when the grayscale value of the pixel $x$ changes from low to high, the classification should be more inclined to treat the pixel $x$ as the background. To address this issue, the distance $d_i(x)$ is redefined as

$$d_i(x)=v_i-v(x),\quad i=1,2,3,\ldots,20.\tag{7}$$

When the trend of the gray value of the pixel $x$ is from low to high, $v_i-v(x)<-R<R$, so the image pixel classification tends to adjudicate the current pixel $x$ as the background, as shown in Fig. 14. The judgment formula for the pixel point $x$ is

$$x=\begin{cases}\text{foreground}, & \#\{(-\infty,\ v(x)+R]\cap M(x)\}<Min\\ \text{background}, & \#\{(-\infty,\ v(x)+R]\cap M(x)\}\geq Min.\end{cases}\tag{8}$$

Fig. 14. Schematic of the pixel classification using two distances.

In addition, the distance threshold R is determined by the degree of dispersion of the background model gray values. The distance threshold R is calculated by the formula [29]

$$R=\frac{m}{0.682}\tag{9}$$

where $m$ denotes the median of $|v_i-v_{i+1}|$.
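Putting (7)-(9) together, a vectorized classification step might look like the sketch below; the value of Min is assumed, and the scaling of R follows the reconstruction of (9) above, so it should be read as indicative rather than definitive.

```python
import numpy as np

def classify_pixels(frame, model, min_matches=2):
    """Classify pixels with the one-sided test of (7)-(8) and the adaptive R of (9)."""
    # Per-pixel radius R from the dispersion of the sample set: median of
    # |v_i - v_{i+1}| over consecutive samples, scaled as in the reconstructed (9).
    samples = model.astype(np.float32)
    diffs = np.abs(np.diff(samples, axis=2))
    R = np.median(diffs, axis=2) / 0.682

    # Signed distance d_i(x) = v_i - v(x) for all samples at once; a sample supports
    # "background" whenever d_i(x) <= R, so upward grayscale jumps favour background.
    d = samples - frame.astype(np.float32)[..., None]
    matches = (d <= R[..., None]).sum(axis=2)

    return matches < min_matches   # True -> foreground (candidate shadow pixel)
```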

C. Pixel Filtering and Morphological Filtering

In the previous section, although modifying the calculation of the distance di(x) can suppress pixel classification errors caused by grayscale value changes from low to high, there are still many high grayscale value pixels classified as the foreground. This is because the grayscale value trend of some pixels changes from high to low, but after the change, the grayscale value remains high. To address this issue, considering the characteristic that the average pixel value of moving target shadow areas in VideoSAR images is lower than the background pixel value, a background threshold T can be preset to filter out pixels classified as the foreground that clearly do not belong to shadows, and reclassify high grayscale value pixels as the background. If the pixel value is higher than the background threshold T, then the pixel is reclassified as the background. If the pixel value is lower than the background threshold T, then the pixel is classified correctly. The formula is represented as follows:

$$x=\begin{cases}\text{background}, & v(x)>T\\ \text{foreground}, & v(x)\leq T.\end{cases}\tag{10}$$

The results obtained after the reclassification using the background threshold $T$ still contain speckle noise, road edges, and other areas that are not shadows of moving targets. Morphological filtering and connected component analysis can be used to eliminate the influence of these factors.
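The pixel filtering of (10) followed by the morphological clean-up can be sketched as follows; the background threshold T and the minimum component area are assumed values for a frame normalized to [0, 1].

```python
import cv2
import numpy as np

def filter_foreground(frame, foreground, T=0.35, min_area=20):
    """Apply the background threshold T of (10), then morphological filtering and
    connected component analysis (threshold values are illustrative)."""
    # Pixels brighter than T cannot be shadows and are sent back to the background.
    shadow_fg = foreground & (frame <= T)

    # Opening suppresses residual speckle; small components are discarded afterwards.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    cleaned = cv2.morphologyEx(shadow_fg.astype(np.uint8) * 255, cv2.MORPH_OPEN, kernel)

    num, labels, stats, _ = cv2.connectedComponentsWithStats(cleaned, connectivity=8)
    out = np.zeros(frame.shape, dtype=bool)
    for i in range(1, num):
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            out[labels == i] = True
    return out
```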

D. Background Model Update

During the motion detection process using video image, the scene depicted in the video can undergo changes due to a variety of factors, including camera jitter, falling leaves, or other dynamic environmental elements. In SAR videos, changes in the scattering coefficient of targets at different angles can cause variations in the final image scene intensity, as shown in Fig. 15. Since the ViBe algorithm compares the current pixel value with the sample values in the background model set, it is necessary to update the background model in real time when using the ViBe algorithm for the motion detection to ensure the more accurate detection results. The background model updating method used by the ViBe algorithm is conservative updating, meaning that a pixel is only included in the background model set when it is classified as the background [30].
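A conservative update in the spirit described here can be sketched as follows. The random time subsampling (one update chance in 16 per frame) is standard ViBe practice assumed here rather than stated in this article.

```python
import numpy as np

def update_background_model(frame, model, foreground, subsample=16, rng=None):
    """Conservative update: only background-classified pixels may overwrite one
    randomly chosen sample of their own model set, with probability 1/subsample."""
    rng = np.random.default_rng() if rng is None else rng
    h, w, n = model.shape

    background = ~foreground
    # Pixels selected for an update this frame.
    chosen = background & (rng.integers(0, subsample, size=(h, w)) == 0)
    ys, xs = np.nonzero(chosen)

    # Each selected pixel replaces one random sample with its new value.
    slots = rng.integers(0, n, size=ys.size)
    model[ys, xs, slots] = frame[ys, xs]
    return model
```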

Fig. 15. Changes in the intensity of the imaging scene.

E. False Alarm Suppression

After the aforementioned steps, preliminary detection results of moving target shadows in each frame of the SAR image have been obtained. However, the previous analysis was based on single-frame images, without utilizing the interframe information, which still leads to many false alarms. To address this issue, the next step involves false alarm suppression using the joint detection results of multiple frames.

Real moving targets typically exhibit regular motion trajectories, while false targets appear more chaotic across multiple frames. Exploiting this characteristic, the detection results of all frames are summed. For a real moving target, a long rectangular connected region appears in the summation result. Applying morphological filtering and connected component analysis to this result, the ROIs can be extracted. This helps suppress false alarms outside the ROIs, as depicted in Fig. 16.
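The multi-frame ROI extraction can be sketched as follows: per-frame detection masks are summed, morphologically filtered, and only large connected regions (trajectory-like strips) are kept as ROIs; detections outside them are then discarded. The structuring element and area threshold are assumptions.

```python
import cv2
import numpy as np

def roi_mask_from_stack(detections, min_area=200):
    """Extract trajectory ROIs from a list of per-frame boolean detection masks."""
    # Sum the detections; real targets trace out long connected strips.
    stack = np.sum([d.astype(np.uint16) for d in detections], axis=0)
    hits = (stack > 0).astype(np.uint8) * 255

    # Close small gaps along the trajectory before the connected component screen.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    hits = cv2.morphologyEx(hits, cv2.MORPH_CLOSE, kernel)

    num, labels, stats, _ = cv2.connectedComponentsWithStats(hits, connectivity=8)
    roi = np.zeros(hits.shape, dtype=bool)
    for i in range(1, num):
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            roi[labels == i] = True
    return roi

# False alarms outside the ROIs are then suppressed, e.g.:
# suppressed = [d & roi for d in detections]
```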

Fig. 16. Suppressing false alarms outside the ROIs.

In addition, another approach involves utilizing shadow detection methods based on the super-pixel segmentation for each frame to extract the shadow information. Then, this shadow information can be used to suppress false alarms within the ROIs. The specific procedure is similar to the one used in the background reconstruction and will not be reiterated here.

SECTION IV.

Experiment Results and Analysis

In this section, the moving target detection was performed on a sequence of VideoSAR images using the proposed improved ViBe algorithm. A comparative analysis was conducted with the results obtained from the original ViBe algorithm [17] and background subtraction algorithm [31]. The high-resolution VideoSAR data used in the experiment was obtained from the Sandia National Laboratories and captured a scene at the entrance of Kirtland Air Force Base, which features moving vehicles. In addition, the scene included various static targets, such as the trees, roads, buildings, and artificial islands. A subset of 100 frames of SAR images (frames from 153 to 252) was selected for the experiment, and preprocessing steps such as the registration, removal of invalid pixels, median filtering to suppress speckle noise, and intensity normalization were applied to the selected SAR images.

A. Background Reconstruction

The background reconstruction algorithm was applied to a single frame SAR image. The algorithm consists of three main steps, the shadow detection, removal of stationary target shadows, and pixel filling. The shadow detection results are shown in Fig. 17. From Fig. 17, it can be observed that the OTSU algorithm effectively extracts the dark regions in the image. After the morphological filtering, regions that are clearly not shadows of moving targets are removed based on features, such as the area, aspect ratio, and rectangularity. Finally, using the shadow information extracted by the shadow detection method based on the super-pixel segmentation, nonshadow regions are effectively eliminated.

Fig. 17. Result of the shadow detection. (a) Original image. (b) Binary image. (c) Morphological filtering results. (d) Shadow information. (e) Result of the shadow information filtering.

After removing the shadows of stationary targets, the result is shown in Fig. 18. Local contrast information is extracted from the image after performing the super-pixel segmentation. Then, utilizing this extracted local contrast information effectively eliminates the shadows of stationary targets, leaving only the shadows of moving targets. Finally, the shadows of moving targets are filled using neighboring pixels to complete the background reconstruction, as shown in Fig. 19. After the reconstruction, there are no longer any shadows of moving targets in the background. The reconstructed background image is applied to both the improved ViBe algorithm and the original ViBe algorithm. Since the background model in the background subtraction algorithm cannot be updated in real time, there is a significant difference between the backgrounds of SAR images taken at different times. If the reconstructed background image is used as the background model, it would lead to many false alarms caused by background differences in the detection results. Therefore, the mean image of 100 frames of SAR images is used as the background model for the background subtraction algorithm in the subsequent comparative verification experiments.

Fig. 18. Result after removing the shadows of stationary targets. (a) ROI. (b) Local contrast information. (c) Result after filtering with the local contrast information.

Fig. 19. Background reconstruction. (a) Original image. (b) Background reconstructed image.

B. Improved ViBe Algorithm

Fig. 20 shows the moving target shadow detection results of six different methods. Comparing the improved ViBe algorithm, the original ViBe algorithm with false alarm suppression, and the background subtraction algorithm with false alarm suppression: the improved ViBe algorithm produces only one missed detection and no false alarms; the original ViBe algorithm with false alarm suppression produces two missed detections and one false alarm; and the background subtraction algorithm with false alarm suppression produces one missed detection and two false alarms. Among these three methods, the detection performance of the improved ViBe algorithm is the best.

Comparing the improved ViBe algorithm without false alarm suppression, the original ViBe algorithm, and the background subtraction algorithm, it is also evident that the improved ViBe algorithm has the fewest missed detections and false alarms and the best detection performance, which validates that the proposed improved ViBe algorithm can effectively detect moving target shadows. Comparing each of the three methods with and without false alarm suppression shows a significant reduction in false alarms after suppression, removing false alarms outside the ROIs as well as false alarms in nonshadow regions within the ROIs, verifying the correctness and effectiveness of the proposed false alarm suppression method.

Table I summarizes the moving target shadow detection results over the 100 frames of SAR images. The improved ViBe algorithm has the fewest missed detections and false alarms, with a detection rate of 97.66% and only 8 false alarms, and the number of false alarms is significantly reduced after false alarm suppression for all three methods. Once again, this validates the excellent moving target shadow detection performance of the proposed improved ViBe algorithm.

TABLE I. Statistics for Moving Target Shadow Detection Results

Fig. 20. Detection results of moving target shadows. The ground truth positions are marked with red boxes in both the original images and the detection results of all methods.

 

Radar can be used for human non-contact monitoring and interaction TMTT CFP Special Issue on Latest Advances on Radar-Based Physiological Se...