Wake2Wake: Feature-Guided Self-Supervised Wave Suppression Method for SAR Ship Wake Detection
C. Xu, Q. Wang, X. Wang, X. Chao and B. Pan, "Wake2Wake: Feature-Guided Self-Supervised Wave Suppression Method for SAR Ship Wake Detection," in IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-14, 2024, Art no. 5108114, doi: 10.1109/TGRS.2024.3422803.
Abstract: Sea clutter and inherent speckle noise in synthetic aperture radar (SAR) images can pose challenges to accurate sea surface target detection, especially for phenomena such as ship wakes reliant on efficient feature extraction. Traditional denoising methods require manual tradeoffs between denoising effects and detail retention. Supervised denoising methods based on deep learning demand a substantial number of real noisy-clean image pairs for training, coupled with specific parameter settings and labeled data amounts.
In response to these challenges, this article introduces Wake2Wake, a self-supervised denoising method aimed at enhancing the performance of existing deep learning-based ship wake detectors. The method incorporates a novel ship wake awareness (SWA) block designated to address the distinctive features of turbulent and Kelvin wakes. Furthermore, to overcome the source imbalance problem in the dataset, simulated wake data are integrated into the training process. This not only mitigates dataset imbalances but also significantly improves both denoising and detection performance.
The experimental results indicate that Wake2Wake improves the accuracy of Rotated RepPoints by 3.6 mAP and S2A-Net by 2.6 mAP on the OpenSARWake dataset, respectively. The proposed approach achieves varied extents of improvement, showcasing its potential in mitigating sea clutter and enhancing feature extraction, especially in detecting SAR ship wakes.
keywords: {Marine vehicles;Noise reduction;Radar polarimetry;Task analysis;Noise;Noise measurement;Detectors;Image denoising;self-supervised learning;ship wake detection},
URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10583959&isnumber=10354519
Summary
This article introduces a new method called Wake2Wake for improving synthetic aperture radar (SAR) ship wake detection. Key points:
1. Wake2Wake is a self-supervised denoising method designed to enhance the performance of existing deep learning-based ship wake detectors.
2. It incorporates a novel "ship wake awareness" (SWA) block to address the unique features of turbulent and Kelvin wakes in SAR images.
3. The method uses a mix of real and simulated SAR images for training to overcome dataset imbalances and improve performance.
4. Wake2Wake improved detection accuracy of existing methods (e.g. Rotated RepPoints, S2A-Net) on the OpenSARWake dataset.
5. The approach uses a two-stage strategy: first denoising SAR images, then using the denoised images for wake detection.
6. Experiments showed Wake2Wake outperformed traditional denoising methods and other deep learning approaches for this task.
7. The authors created a new SAR ship wake dataset called OpenSARWake for training and evaluation.
8. Ablation studies demonstrated the benefits of using simulated data and the novel SWA block design.
9. Future work aims to integrate Wake2Wake with detectors in an end-to-end network.
Overall, Wake2Wake addresses challenges in SAR ship wake detection by using self-supervised learning and specialized neural network components to better handle the unique characteristics of ship wakes in SAR imagery.
Wake2Wake Processing and Performance
Wake2Wake is a self-supervised denoising method designed to enhance SAR ship wake detection. Here's an overview of how it works and its performance:
Processing steps:
- Input: Noisy SAR images containing ship wakes
- Global Masker: Generates four masked image patches from each input image
- Modified U-Net with SWA block: Processes the masked patches
- Global Mask Mapper: Applies sampling to the denoised output
- Loss calculation: Uses a combination of "re-visible" loss and regularization loss
- Output: Denoised SAR image with enhanced wake features
Key components:
1. Ship Wake Awareness (SWA) block:
- Combines Dynamic Snake Convolution (DSConv) to capture turbulent wake features
- Uses Frequency Channel Attention Convolution (FcaConv) for Kelvin wake characteristics
- Designed to suppress background waves while enhancing wake visibility
2. Training strategy:
- Uses a mix of 65 real SAR images with prominent wave textures
- Incorporates 180 simulated SAR wake images to balance the dataset
Performance compared to other approaches:
1. Detection accuracy:
- Improved Rotated RepPoints accuracy by 3.6 mAP
- Improved S2A-Net accuracy by 2.6 mAP on the OpenSARWake dataset
2. Denoising metrics:
- Achieved higher PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index) compared to other deep learning methods
3. Comparison to traditional methods:
- Outperformed conventional denoising algorithms like LMMSE-Wavelet, SAR-BM3D, FANS, POTDF, and PPB
- Traditional methods often decreased detection accuracy, while Wake2Wake improved it
4. Comparison to other deep learning approaches:
- Surpassed performance of methods like SAR-CNN, NBR2NBR, and B2UB
- Better preserved wake features while effectively suppressing waves and noise
5. Visual results:
- Demonstrated superior wave suppression and wake preservation compared to other methods
- Improved bounding box localization in detection tasks
6. Generalization:
- Performed well across different SAR frequency bands (X, C, and L)
- Effective on both real and simulated SAR images
Overall, Wake2Wake showed significant improvements in both denoising quality and subsequent detection accuracy compared to existing approaches, particularly in its ability to preserve important wake features while suppressing background noise and waves.
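As a reminder of what the PSNR figure in the denoising metrics measures, here is a minimal sketch using the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr` and the 8-bit peak value of 255 are illustrative assumptions. The metric needs a clean reference image, so this sketch assumes one is available (for example a simulated scene); it is not a description of the paper's evaluation code.

```python
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between a clean reference and an estimate.

    max_value is the peak intensity; 255 assumes 8-bit imagery (an assumption here).
    """
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no error to measure
    return 10.0 * np.log10(max_value ** 2 / mse)
```

A higher value means the estimate is closer to the reference in mean-squared error; SSIM complements it by comparing local structure rather than raw intensity differences.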
Artifacts
The paper mentions several resources that could potentially allow for independent validation of the results:
1. OpenSARWake Dataset:
The authors created and used a new dataset called OpenSARWake for training and evaluation. They state that this dataset can be accessed at:
https://github.com/libzzluo/OpenSARWake
This dataset includes:
- 653 C-band images
- 299 X-band images
- 2221 L-band images
- 231 images containing Kelvin wakes
- 2841 images containing turbulent wakes
2. Data Sources:
The SAR images in the OpenSARWake dataset were collected from:
- https://search.asf.alaska.edu/
- https://earth.esa.int/eogateway/catalog/
3. AIS Data:
For validating annotations of ship wakes acquired by ALOS-PALSAR and Sentinel-1A near Denmark, they used AIS data from:
https://web.ais.dk/aisdata/
4. Simulated Data:
The authors used 180 simulated SAR ship wake images. The simulator used for this task can be found at:
https://github.com/SYSUSARSimu/KWFullLink_SARSim
5. Satellite Data Products:
- C-Band Sentinel: GRD (Ground Range Detected) products
- L-Band ALOS PALSAR: H2.2-level products
- X-Band TerraSAR-X: Primarily SLC (Single Look Complex) products
6. Data Processing:
The authors mention using SNAP software to process the data and generate geocoded ellipsoid-corrected (GEC) products.
While the paper doesn't explicitly mention code availability for the Wake2Wake method itself, the availability of the dataset and simulators could allow for independent validation of the results and comparison with other methods.
It's worth noting that while these resources are mentioned in the paper, their current availability and accessibility may vary. Researchers interested in reproducing the results or building upon this work would likely need to contact the authors for the most up-to-date information on accessing these resources.
Figures
Here's a list of the figures mentioned in the paper, with their purpose and interpretation:
Figure 1:
- Shows three different methods for SAR ship wake detection
- (a) Direct detection on noisy SAR image
- (b) Simultaneous denoising and detection in a single network
- (c) Two-stage approach with separate denoising and detection (the approach used in this paper)
Interpretation: Illustrates the evolution of approaches, with the authors choosing the two-stage method for its flexibility and potential for improvement.
Figure 2:
- (a) Architecture of the proposed SWA (Ship Wake Awareness) block
1. Input Feature Map: The initial input to the SWA block.
2. Dynamic Snake Convolution (DSConv):
- This is applied to extract features of the turbulence wake.
- DSConv uses a flexible grid structure that can adapt to linear features like turbulent wakes.
3. Frequency Channel Attention Convolution (FcaConv):
- Applied after DSConv to capture Kelvin wake characteristics.
- It uses Discrete Cosine Transform (DCT) to analyze frequency components.
4. Concatenation: The outputs from DSConv and FcaConv are combined.
5. 1x1 Convolution: Applied to reduce the channel dimension of the concatenated features.
6. Output Feature Map: The final output of the SWA block, now enhanced with wake-specific features.
- (b) Feature-guided self-supervised denoising training and inference process
Interpretation: Provides a detailed view of the novel components in the Wake2Wake method, showing how it processes SAR images to enhance wake features.
1. Input: A noisy SAR image y.
2. Global Masker θ(·):
- Divides the input into 2x2 cells.
- Creates four masked versions of the input (θ00I, θ01I, θ10I, θ11I).
3. Stack: The four masked images are stacked into a single tensor θy.
4. Modified U-Net with SWA blocks (F_Ω):
- This is the main denoising network.
- It includes the SWA blocks to focus on wake features.
- Processes the stacked masked images.
5. Global Mask Mapper M(·):
- Applied to the output of the U-Net.
- Maps the denoised result back to the original image structure.
6. Loss Calculation:
- L_rev: A "re-visible" loss comparing the mapped output to the input.
- L_reg: A regularization loss.
- The total loss is a combination of these two components.
7. Output: The final denoised SAR image.
The framework operates in a self-supervised manner, meaning it doesn't require clean ground truth images for training. Instead, it learns to denoise by reconstructing masked portions of the input image.
During inference (testing), the process is simplified:
1. The noisy SAR image is input directly to the trained U-Net with SWA blocks.
2. The network produces the denoised output without the masking and mapping steps.
This framework is designed to effectively suppress sea clutter and noise while preserving and enhancing ship wake features, which is crucial for subsequent wake detection tasks.
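The masking and mapping steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `global_masker` and `global_mask_mapper` are hypothetical, hidden pixels are simply set to a fill value of zero (the actual method may fill them differently), and the image height and width are assumed to be divisible by the cell size s.

```python
import numpy as np

def global_masker(y, s=2, fill=0.0):
    """Return s*s masked copies of y; copy k hides pixel (i, j) of every s x s cell.

    Assumption: hidden pixels are replaced by `fill` (zero by default).
    """
    h, w = y.shape
    assert h % s == 0 and w % s == 0, "image size must be divisible by s"
    masked = []
    for i in range(s):
        for j in range(s):
            m = y.copy()
            m[i::s, j::s] = fill  # hide one position in every cell
            masked.append(m)
    return np.stack(masked)  # shape (s*s, H, W), the masked volume

def global_mask_mapper(outputs, s=2):
    """Assemble one image by taking, from each processed copy, only the pixels hidden in it."""
    _, h, w = outputs.shape
    mapped = np.empty((h, w), dtype=outputs.dtype)
    k = 0
    for i in range(s):
        for j in range(s):
            mapped[i::s, j::s] = outputs[k, i::s, j::s]
            k += 1
    return mapped
```

During training, the masked volume would pass through the denoising network before the mapper is applied, so every pixel of the mapped image is predicted without the network having seen that pixel. At inference the masker and mapper are skipped and the network is applied to the noisy image directly.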
Figure 3:
- Comparison of Lipschitz constants for different attention modules
Interpretation: Shows the theoretical stability of various attention mechanisms, with the size of circles indicating computational complexity (FLOPS). A smaller Lipschitz constant suggests more stable training.
Figure 4:
- Implementation detail of the Global Masker θ(·) and Global Mask Mapper M(·)
Interpretation: Illustrates the masking strategy used in the self-supervised learning process, showing how 2x2 mask cells are applied to the input image.
Figure 5:
- (a) Geographic location and temporal distribution of the OpenSARWake dataset
- (b)-(e) Statistical distribution of instances in the dataset
Interpretation: Provides an overview of the dataset used, showing its global coverage and diversity in terms of image characteristics and wake types.
Figure 6:
- Effects of datasets' samples on manifold distributions
- (a) OpenSARWake dataset
- (b) Proposed multisource dataset
Interpretation: Demonstrates how the addition of simulated data enhances the feature space coverage, potentially improving the model's generalization ability.
Figures 7-9:
- Visual comparisons of denoising results and detection outcomes for different methods
Interpretation: These figures showcase the superior performance of Wake2Wake in preserving wake features while suppressing noise and waves, leading to improved detection accuracy.
Figure 10:
- Comparison of the effects of FcaConv and DSConv used in the SWA block
Interpretation: Illustrates how different components of the SWA block contribute to enhancing different types of wake features.
Figure 11:
- Sea clutter components removed by the SWA block
Interpretation: Shows the effectiveness of the SWA block in isolating and removing sea clutter, which helps in enhancing the visibility of ship wakes.
Overall, these figures provide a comprehensive view of the Wake2Wake method, from its architectural design to its performance on real SAR images, demonstrating its effectiveness in improving ship wake detection.
Authors
Based on the information provided in the article, here are the details about the authors, their associated institutions, and some related work:
Authors and Institutions:
1. Chengji Xu
- Member, IEEE
- Affiliated with the School of Systems Science and Engineering, Sun Yat-sen University, Guangzhou, China
2. Qingsong Wang (Corresponding author)
- Affiliated with the School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen, China
3. Xiaoqing Wang (Corresponding author)
- Affiliated with:
a) School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen, China
b) Peng Cheng Laboratory, Shenzhen, China
4. Xiaopeng Chao
- Affiliated with the School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen, China
5. Bo Pan
- Affiliated with the School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen, China
Related Work:
The paper cites several related works that provide context for their research:
1. Previous SAR ship wake datasets:
- Ding et al. [8] collected 261 SAR images with ship wakes
- Del Prete et al. [22] collected 485 SAR images with ship wakes
2. Self-supervised denoising methods:
- Yuan et al. [12] used a semantic-aware self-supervised denoising method for SAR image segmentation
- Li et al. [23] integrated dehazing and detection tasks into a unified network
3. SAR image denoising algorithms:
- SAR-BM3D [24]
- LMMSE-Wavelet [25]
- POTDF [26]
4. Ship wake detection methods:
- Xue et al. [32] used frequency channel attention convolution (FcaConv) for optical Kelvin wake detection
5. SAR image simulation:
- The authors reference their own previous work on SAR image simulation [41], [42]
6. Object detection frameworks adapted for SAR images:
- Rotated RepPoints [51]
- S2A-Net [52]
- Oriented R-CNN [53]
The authors have also published related work, including:
- A paper on the OpenSARWake dataset [34]
- Research on ship turbulent wake simulation [41]
- Work on wave spectrum retrieval methods [42]
This research was supported in part by the National Natural Science Foundation of China, the Xiaomi Young Talents Program, and another project grant.
SECTION I. Introduction
Synthetic aperture radar (SAR) is highly valuable for observing oceanic phenomena [1], detecting ships [2], [3], [4], [5], [6], [7], and identifying wakes [8], [9], [10], [11] due to its capabilities of all-weather, all-day observation and high-resolution imaging. Owing to the coherence principle underlying the SAR imaging system, the generation of speckle noise is inevitable [12]. Furthermore, ocean incidents such as huge wind waves and surges on the sea surface significantly impact the image quality, affecting the subsequent process of extracting features and the analysis performance of downstream tasks. SAR ship wake detection faces challenges in achieving optimal model accuracy due to complex and unpredictable environmental disruptions, such as background sea clutter and adverse weather conditions such as strong winds and high waves. To improve the accuracy of SAR ship wake detection, it is crucial to explore techniques for removing noise from SAR images or reducing interference from sea waves.
As a derivative research branch of SAR ship detection, the study of wake detection algorithms helps estimate the ship’s speed [13], [14], route [15], or hull information [16]. Traditional algorithms for detecting ship wakes in SAR images rely on techniques such as Radon transform [17], constant false alarm rate (CFAR) detection [18], and their variations. Other algorithms utilize sparse regularization [19] or decomposition [20], as well as morphological analysis, to distinguish the wake structure from the waves individually. In [21], dictionary modeling is employed in morphological analysis to separate the wake structure from the waves. Nevertheless, these techniques, primarily reliant on linear structural characteristics, are prone to misinterpretation due to their resemblance to other comparable formations in SAR images, such as small-scale internal waves and fronts.
The availability of several labeled SAR wake datasets has led to advancements in end-to-end convolutional neural network (CNN) techniques. These advancements have resulted in enhanced efficiency and accuracy in wake detection. For instance, Ding et al. [8] and Del Prete et al. [22] collected 261 and 485 SAR images with ship wakes that were carefully labeled. They compared the effectiveness of the respective detection algorithms using these images. However, the algorithms mentioned above need to extract information regarding the characteristics of the wake directly from SAR images, which inherently contain speckle noise and waves. In addition, their aim is to restore the wake area as accurately as possible to establish precise bounding boxes. The task possesses a certain level of complexity without necessitating additional interventions.
In a previous study, Yuan et al. [12] employed a jointly optimized semantic-aware self-supervised denoising method to enhance the performance of the segmentation task. Similarly, Li et al. [23] integrated dehazing and detection tasks into a unified network, introducing a union architecture and a novel robust loss function. Nevertheless, limited research has been conducted to evaluate the effectiveness of enhanced SAR images following restoration processes (such as denoising or wave suppression) on downstream tasks. Building on [23], we hypothesize that the denoising process may impact the detection results of ship wakes.
To address the disparity between high-level SAR ship wake detection and low-level image denoising, we will examine the framework’s overall design and the choice of denoising methods. Fig. 1(a) illustrates the process of delivering the unprocessed noisy image to the detector for training and testing. This approach is often used in prior SAR detection algorithms, and there is significant potential for improvement. Fig. 1(b) illustrates a type of jointly optimized framework, represented by Yuan et al. [12] and Li et al. [23]. This approach enhances the performance of image restoration and downstream tasks simultaneously. However, it necessitates a distinct design of the loss function or adding a new module to establish a connection between the two tasks. The third approach, as depicted in Fig. 1(c), employs a two-stage procedure where the detection task and image denoising are performed independently. The benefit of this approach lies in the fact that the refined restoration algorithm can enhance the performance of the detection task to a certain degree. In addition, there are few denoising algorithms available for SAR ship wake. Given different detection algorithms, we will employ a two-stage strategy to design the framework, facilitating subsequent experimental validation. When choosing denoising algorithms, gathering enough clean-noisy image pairs for supervised algorithms that meet the criteria for real SAR wake images is challenging. Therefore, we will explore using self-supervised denoising algorithms that rely on deep learning. These algorithms offer superior generalization capabilities and end-to-end integration compared to conventional methods, such as SAR-BM3D [24], LMMSE-Wavelet [25], and POTDF [26].
To this end, we propose a self-supervised denoising framework named Wake2Wake. This framework can improve the effectiveness of current mainstream-oriented SAR ship wake detectors. To effectively exploit the Kelvin wake and turbulence wake characteristics in SAR images, we have developed a ship wake awareness (SWA) block that concentrates on these two distinct wakes. The module aims to eliminate background waves and noise while enhancing the visibility of the wakes as foreground elements. As a result, we have implemented a more efficient data integration strategy to train Wake2Wake. This strategy involves combining real SAR wake data with simulated data, resulting in improved training efficiency and denoising effectiveness to some extent. These simulated SAR images effectively enhance the singularity and repetitiveness of ship wakes in certain real scenarios. They are well-balanced in terms of radar parameters, sea state parameters, and ship parameters compared to real images. Multiple extensive comparisons and ablation experiments provide evidence that our suggested technology can enhance the performance of current SAR wake detectors to a certain degree. The SAR ship wake dataset that we collected, including rotation detection labels, can be accessed at (https://github.com/libzzluo/OpenSARWake).
Proposed Method
This section introduces a feature-guided self-supervised wave suppression module named Wake2Wake. The method takes SAR images containing waves or speckle noise as input and aims to produce output images with minimal waves or noise. Wake2Wake can be integrated with various oriented detectors to create a comprehensive framework, resulting in enhanced performance for wake detection using deep learning. This section will provide explicit details.
A. Overall Network Overview
Given that our primary objective is to enhance the performance of current SAR ship wake detectors, we do not allocate effort to create noisy-clean image pairs for supervised denoising learning. Instead, we employ unsupervised denoising methods. Current mainstream unsupervised denoising algorithms primarily utilize pixel masking strategies on input images. This approach requires the network to reconstruct the masked pixels using information from neighboring pixels. Representative examples of such algorithms are NBR2NBR [27] and N2V [28], effectively alleviating the reliance on identity mapping. Nevertheless, these blindspot schemes also suffer from the disadvantage of information loss, as the network cannot perceive the information from the masked pixels. For these reasons, we decided to employ the B2UB [29] network. This network utilizes the original noisy image more efficiently compared to the NBR2NBR and N2V approaches. In addition, it can utilize information from all input pixel points, theoretically preventing any loss of information.
Nevertheless, the level of noise pollution in SAR images surpasses that in optical images. When B2UB is trained exclusively on SAR ship wake images without modification, the denoising results are strikingly inadequate, as demonstrated by our research. The complexity of the sea clutter and noise distribution in SAR ship wake images makes it challenging to perform direct self-supervised denoising of the entire dataset. However, we drew insights from the advancements in the SAR image task by Yuan et al. [12]. They argue that the application of max pooling in the U-Net architecture of the original self-supervised algorithm is suboptimal, as it tends to generate speckle noise residues in the low-frequency area of the SAR image. This aspect is crucial to consider while addressing ship wake characteristics. Thus, we first modify the training strategy of the self-supervised denoising network and enhance all convolution operations in the original U-Net structure. We also provide an SWA block concentrating on Kelvin wake and turbulent wake characteristics. We refer to this enhanced framework as Wake2Wake.
After the denoised SAR ship wake images are produced by the Wake2Wake network, they are utilized as input for training and testing the ship wake detector. In Section III-B, we analyze the performance variations of Wake2Wake when it is employed on the target detection dataset in the training/validation set, the testing set, and the entire dataset, respectively. In Sections II-B and II-C, we provide the underlying construction principle of the SWA block and a more comprehensive explanation of the precise algorithmic flow of Wake2Wake. To enhance comprehension, we present the pseudo-code for the entire framework flow in Algorithm 1. In Section II-B, we will introduce methods to enhance the original U-Net model, addressing SAR ship wake feature extraction requirements more effectively.
B. SWA Block
In the broader realm of computer vision, in-context learning [30] has shown notable success in tasks such as image denoising, image enhancement, and image deduplication. Notably, these achievements have been realized without the need for parameter tuning. It seems that refining the structure of CNN does not result in significant advancements in these general domains. However, given the distinct features of the ship wake, a comprehensive solution has yet to emerge in the realm of SAR ship wake recognition. Therefore, we enhance the U-Net architecture utilized in the original B2UB network by considering the unique characteristics of SAR ship wakes. We introduce an SWA block to selectively replace the original convolution operation. Subsequently, we evaluate its Lipschitz constant [31] in comparison to other commonly used attention blocks to assess its generalization capability.
Ship wakes that appear in SAR images consist mainly of turbulent wakes and Kelvin wakes, and our focus is on describing these two types. Xue et al. [32] employed the frequency channel attention convolution (FcaConv) [33] to investigate the characteristics of Kelvin wakes in optical ship wake images, examining both the image and frequency domains. Their experimental results provided evidence that the module effectively captures optical Kelvin wakes. However, when using FcaConv with our proposed OpenSARWake [34], we observe that selecting only the top 32 discrete cosine transform (DCT) bases leads to a minor improvement in detection accuracy, whereas selecting other frequency components decreases the original detection accuracy. Kelvin wakes are commonly absent in SAR images, with only a turbulent wake, referred to as a black streak, being typically evident. Furthermore, sea clutter in optical images is usually represented by a Gaussian distribution, whereas sea clutter in SAR images is considerably more complicated, posing challenges in differentiating low-frequency ship wakes from the sea clutter. Hence, we opted for an
The convolution grid along the y-direction can be represented as
Typically,
Subsequently, the feature map after DSConv, denoted as
Finally,
In our practical experiments, we replaced all the plain convolutions in the original U-Net with SWA_block, and the final improved U-Net structure is shown in Table I.
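To make the frequency-attention half of the SWA block more concrete, the following sketch shows DCT-based channel pooling in the style of frequency channel attention [33]: each channel group is projected onto one 2-D DCT basis, and the resulting per-channel scalars drive a sigmoid gate. It is an illustration only. The function names, the even split of channels across frequencies, and the two-layer gate with weights `w1` and `w2` are assumptions, and the DSConv branch and the block's actual wiring are not reproduced.

```python
import numpy as np

def dct_basis(u, v, h, w):
    """2-D DCT-II basis of frequency (u, v) on an h x w grid (unnormalized)."""
    ys = np.cos(np.pi * u * (np.arange(h) + 0.5) / h)
    xs = np.cos(np.pi * v * (np.arange(w) + 0.5) / w)
    return np.outer(ys, xs)  # shape (h, w)

def multispectral_pool(x, freqs):
    """Project each channel group of x (C, H, W) onto its assigned DCT basis.

    Channels are split evenly across the frequency pairs in `freqs` (an assumption).
    """
    c, h, w = x.shape
    assert c % len(freqs) == 0, "channels must divide evenly across frequencies"
    group = c // len(freqs)
    pooled = np.empty(c)
    for g, (u, v) in enumerate(freqs):
        basis = dct_basis(u, v, h, w)
        sl = slice(g * group, (g + 1) * group)
        pooled[sl] = (x[sl] * basis).sum(axis=(1, 2))
    return pooled

def channel_attention(x, freqs, w1, w2):
    """Rescale channels with a sigmoid gate computed from the pooled DCT responses."""
    z = multispectral_pool(x, freqs)
    hidden = np.maximum(w1 @ z, 0.0)              # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # one weight in (0, 1) per channel
    return x * gate[:, None, None]
```

With the single frequency (0, 0) the basis is constant, so the pooling reduces to ordinary global sum pooling; the other frequency pairs are what let the gate respond to periodic patterns such as Kelvin wake arms.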
To quantitatively analyze the robustness of our suggested SWA_block, we computed its Lipschitz constant and compared it with other common attention blocks. While accurately determining the Lipschitz constant of a multilayer network is a nondeterministic polynomial-time hard (NP-hard) problem, we can make an approximation by calculating an upper bound on the Lipschitz constant of the network. For instance, a convolutional block in a typical ResNet can be represented as
For all x,
Squaring both sides of the inequality in (9) gives
According to the slope-restricted nonlinearity definition in [38], the function
To understand the connection between incremental quadratic constraints and slope-restricted nonlinearity, (11) can equivalently be expressed as the following single inequality:
By multiplying by
The definition of
Eventually, we employed the LipSDP [38] algorithm to approximate the Lipschitz constants of various common attention blocks and determine the computational floating point operations per second (FLOPS), as depicted in Fig. 3.
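The idea of bounding a Lipschitz constant from above, rather than computing it exactly, can be illustrated with the simplest such bound: for layers joined by 1-Lipschitz activations such as ReLU, the product of the layers' spectral norms is a valid upper bound. This is not LipSDP, which solves a semidefinite program to obtain a tighter bound; the sketch below, with hypothetical function names and a toy fully connected network, only shows what an upper bound of this kind means.

```python
import numpy as np

def spectral_norm_bound(weights):
    """Product of largest singular values of the weight matrices.

    Valid as an upper bound on the Lipschitz constant when the layers are
    joined by 1-Lipschitz activations (ReLU here). Generally looser than LipSDP.
    """
    return float(np.prod([np.linalg.norm(w, 2) for w in weights]))

def relu_mlp(x, weights):
    """Toy network: linear layers with ReLU between them, no activation after the last."""
    for w in weights[:-1]:
        x = np.maximum(w @ x, 0.0)
    return weights[-1] @ x

def empirical_ratio(a, b, weights):
    """Observed output change divided by input change for one pair of inputs."""
    num = np.linalg.norm(relu_mlp(a, weights) - relu_mlp(b, weights))
    return num / np.linalg.norm(a - b)
```

For any pair of distinct inputs, `empirical_ratio` never exceeds `spectral_norm_bound`. A smaller bound indicates that the block amplifies input perturbations less, which is the sense in which it is used as a robustness indicator.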
C. Self-Supervised Denoising Framework Wake2Wake
We selected B2UB as the baseline method for several reasons. First, it is a self-supervised denoising technique that only requires a noisy SAR image. Second, it effectively utilizes all the information in the original noisy image. Finally, it applies a more stable Re-visible loss function throughout the training phase. However, most self-supervised denoising methods were initially developed for optical images [39], [40]. These methods do not consider the distinctive features of SAR sea images, particularly ship wakes such as Kelvin and dark streak-like turbulent wakes. Waves and sea clutter often obscure these wakes. In Sections II-A and II-B, we outline the enhancements that we have implemented to achieve this objective.
Our primary objective is to integrate the trained Wake2Wake model into the OBB detector. Typically, we use the entire detection dataset as inputs for the denoising network. However, challenges arise in simultaneously learning all noise distributions in the detection dataset without increasing the network’s capacity, often necessitating refined parameterization. A comprehensive explanation of the collection and preprocessing of the OpenSARWake dataset is provided in Section III-A. To address this, we propose utilizing a smaller yet more feature-enriched set of training samples. This subset comprises 65 carefully selected SAR images featuring prominent wind waves and wakes from the original OpenSARWake dataset. In addition, we include 180 simulated SAR images generated using a total of 60 sea conditions with the full-link ocean surface SAR imaging simulator [41], [42]. This simulation covers five distinct wind speeds ranging from 5.5 to 13.5 m/s and 12 wind directions at intervals of 30°. To ensure a thorough evaluation, three different bands of SAR parameters were employed, resulting in diverse and extensive characterizations. For further details on the acquisition and generation principles of this specific data subset, please refer to Section III-A.
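The simulated-data counts can be checked with a short enumeration. The sketch assumes the five wind speeds are evenly spaced between 5.5 and 13.5 m/s (only the range is given) and takes twelve wind directions, a full circle at 30° spacing, which is the count that matches the stated 60 sea conditions. The band labels and variable names are illustrative.

```python
from itertools import product

wind_speeds = [5.5, 7.5, 9.5, 11.5, 13.5]  # m/s; even spacing is an assumption
wind_dirs = list(range(0, 360, 30))         # degrees; 12 directions at 30-degree steps
bands = ["X", "C", "L"]                     # three radar bands

# One sea condition per (speed, direction) pair.
sea_conditions = list(product(wind_speeds, wind_dirs))

# One simulated scene per band and sea condition.
scenes = [(band, speed, direction)
          for band in bands
          for speed, direction in sea_conditions]

print(len(sea_conditions), len(scenes))
```

Five speeds times twelve directions gives the 60 sea conditions, and three bands bring the total to the 180 simulated images.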
Hence, the input of our newly designed Wake2Wake network is represented by
The input image y_n is partitioned into ⌊W/s⌋ × ⌊H/s⌋ separate cells, each with a size of s × s. For our experiment, we fixed the W and H values to 1024. When s is greater than 2, the computational load of the masking operation increases geometrically; therefore, s is set to 2 in this article, as presented in Fig. 4.
Before being submitted to the Global Mask Mapper M(·) for sampling, the masked images (θ00I, θ01I, θ10I, θ11I) must be organized into a masked volume θy and processed by our modified U-Net model F_Ω(·); the Global Mask Mapper M(·) is then applied to obtain M(F_Ω(y)).
In another sequence, the initial noisy image y_n is fed directly into F_Ω(·) to obtain the denoised output F_Ω(y).
The loss function is defined as follows during the training process of Wake2Wake:
The hyperparameter
Following the completion of Wake2Wake training, the denoised OpenSARWake dataset is employed to train various types of OBB detectors. For further details on this process, please refer to the pseudocode in Algorithm 1.
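For orientation on the training loss mentioned above, the sketch below follows the form used by the B2UB baseline [29]: a re-visible term that mixes the masked-branch output with the unmasked-branch output, plus a weighted regularization term. That Wake2Wake keeps exactly this form is an assumption. The function name, the use of a mean instead of a summed squared norm, and the default values of `lam` and `eta` are placeholders, not values from the paper.

```python
import numpy as np

def b2ub_style_loss(mapped, direct, noisy, lam=1.0, eta=1.0):
    """Re-visible loss plus eta times a regularization loss (B2UB-style form, assumed).

    mapped : output assembled by the mask mapper from the masked branch
    direct : network output on the unmasked noisy image (treated as a constant,
             i.e. no gradient would flow through it in a real training loop)
    noisy  : the noisy input image
    """
    mapped = np.asarray(mapped, dtype=np.float64)
    direct = np.asarray(direct, dtype=np.float64)
    noisy = np.asarray(noisy, dtype=np.float64)
    l_rev = np.mean((mapped + lam * direct - (lam + 1.0) * noisy) ** 2)
    l_reg = np.mean((mapped - noisy) ** 2)
    return l_rev + eta * l_reg
```

With `lam` set to zero the two terms coincide and the loss reduces to a purely blind-spot objective; a larger `lam` gives more weight to the branch that sees every pixel.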
Experimental Results and Discussion
A. Details of Datasets and Implementation
To validate the effectiveness of our proposed Wake2Wake method, it is imperative to train and evaluate it using an extensive dataset of SAR ship wakes. However, there is a scarcity of publicly available benchmarks in this domain. Table II demonstrates that Del Prete et al. [22] and Ding et al. [8] have gathered two SAR ship wake datasets of a certain scale in the C-band. However, their datasets are limited since they only cover the C-band and are inaccessible to the public. Consequently, the available SAR ship wake images are insufficient to meet the requirements of data-driven deep learning networks. In order to facilitate Wake2Wake training and subsequent deep learning-based SAR ship wake detection, we constructed an OpenSARWake dataset. Table II demonstrates that we considered the diversity of ship wake characteristics in various frequency bands, including X-, C-, and L-bands. To accomplish this, we gathered SAR images with ship wakes from these bands from offshore regions, as depicted in Fig. 5(a). This facilitated the collection of the related automatic identification system (AIS) data, which can be utilized for potential subsequent investigations. Fig. 5(b)–(e) illustrates the statistical distribution of our proposed OpenSARWake dataset. These images were obtained from https://search.asf.alaska.edu/ and https://earth.esa.int/eogateway/catalog/.
We collected ground-range detected (GRD) products for the C-band Sentinel-1 and H2.2-level products for the L-band ALOS PALSAR images. For X-band TerraSAR-X images, we primarily used single-look complex (SLC) products. To maintain consistent quality, we utilized the SNAP software to process the data and generate geocoded ellipsoid-corrected (GEC) products. In addition, essential preprocessing techniques, such as the adaptive contrast algorithm, were applied, and each image was ultimately resized to dimensions of
Not all sea surface SAR images exhibit significant wave activity in real life, and images without it typically show distinct turbulence wakes with greater clarity. Wake2Wake is therefore expected to concentrate on denoising and wave suppression in order to improve the detection of SAR ship wakes that are affected by significant wave interference. While constructing the feature space, namely, the data distribution, for training Wake2Wake, we deliberately selected 65 SAR images with clearly visible waves and ship wakes from the OpenSARWake dataset. The OpenSARWake collection comprises 408 scenes from ALOS PALSAR, 120 scenes from Sentinel-1, and only 55 scenes from TerraSAR-X. Ship wakes in different bands have their own unique characteristics, so too few or too many images in a particular band may cause the denoiser to learn biased wake features. This issue can be mitigated to some extent by adding simulated SAR wake images, and we therefore integrated 180 simulated SAR ship wake images. Commonly observed ship wake types in SAR images include Kelvin wake, turbulence wake, internal wave wake, and narrow-V wake [45]. Owing to their infrequency, we have excluded the internal wave wake and narrow-V wake from our simulation. The Kelvin wake is simulated with a modeling method based on the potential flow theory proposed in [46]. Building on this method, we also consider the sea surface motion, the time-varying characteristics of the scattering units, and the decoherence effect of the sea surface in the echo simulation. The simulator used for this task can be found at https://github.com/SYSUSARSimu/KWFull-Link_SARSim. To model the turbulent wake, we employ our simulation strategy [41] based on the energy spectrum balance equation. This method has the additional advantage of accurately simulating the turbulent wake across a distance that could exceed kilometers.
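To make the band imbalance described above concrete, the following minimal Python sketch computes the share of each band from the scene counts quoted in the text. The variable names are illustrative and not taken from the authors' code. With these counts, the L-band scenes account for roughly 70% of the collection, the C-band scenes for about 21%, and the X-band scenes for about 9%.

```python
# Scene counts per sensor as quoted in the text above.
scene_counts = {
    "ALOS PALSAR (L-band)": 408,
    "Sentinel-1 (C-band)": 120,
    "TerraSAR-X (X-band)": 55,
}

total_scenes = sum(scene_counts.values())

# Fraction of the whole collection contributed by each band.
band_share = {name: count / total_scenes for name, count in scene_counts.items()}

for name, share in band_share.items():
    print(f"{name}: {scene_counts[name]} scenes, {share:.1%} of {total_scenes}")

# Ratio between the most and the least represented band.
imbalance_ratio = max(scene_counts.values()) / min(scene_counts.values())
print(f"largest/smallest band ratio: {imbalance_ratio:.1f}")
```

A ratio of this size between the L-band and X-band scene counts is the kind of skew that the simulated images are meant to offset.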
We simulate SAR images in three frequency bands: X-band, C-band, and L-band. The choice of SAR parameters follows [47]. To enhance the variety of the simulated ship wakes, we utilize five different wind speeds ranging from 5.5 to 13.5 m/s and 12 wind directions at intervals of 30° for the sea surface simulation. The sea surface simulation is based on a full-link ocean surface SAR imaging simulator [42]. Fig. 6 shows the application of the uniform manifold approximation and projection (UMAP) method to the OpenSARWake dataset and our proposed mixed dataset. All the SAR images were projected into a 3-D space, revealing a wide distribution in the manifold space. In Fig. 6(a), the cluster of feature points on the right-hand side indicates that most wake images share similar characteristics. In Fig. 6(b), the red and blue points represent the features of the selected subset of 65 OpenSARWake wake images and the 180 simulated wake images, respectively. Although these images are far fewer than the complete OpenSARWake dataset, they still occupy a similar space in the manifold distribution. This highlights the richness of the characteristics in the simulated wake images. To summarize, we employ a small-scale mixed-source SAR ship wake dataset to train Wake2Wake, whereas the training and testing datasets utilized in our comprehensive framework comprise the whole OpenSARWake dataset.
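The sea-state grid described above can be enumerated with a short Python sketch. Two points are assumptions, since the text gives only the ranges: the five wind speeds are taken as equally spaced between 5.5 and 13.5 m/s, and the 12 directions are taken to start at 0°. The grid has 3 × 5 × 12 = 180 combinations, which happens to match the number of simulated images, although the text does not state that exactly one image was generated per combination.

```python
from itertools import product

bands = ["X", "C", "L"]

# Five wind speeds between 5.5 and 13.5 m/s; equal spacing is an assumption.
num_speeds = 5
speed_min, speed_max = 5.5, 13.5
step = (speed_max - speed_min) / (num_speeds - 1)
wind_speeds = [speed_min + i * step for i in range(num_speeds)]

# Twelve wind directions at 30-degree intervals; the 0-degree start is an assumption.
wind_directions = [30 * i for i in range(12)]

# Every (band, wind speed, wind direction) combination of the grid.
sea_states = list(product(bands, wind_speeds, wind_directions))

print(wind_speeds)
print(wind_directions)
print(len(sea_states))
```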
B. Comparison With State-of-the-Art Methods
To evaluate the performance disparity between our proposed Wake2Wake method and other denoising algorithms on various detectors, we specifically selected two categories of SAR image denoising algorithms for comparison. These include traditional methods such as LMMSE-Wavelet [25], SAR-BM3D [24], FANS [48], POTDF [26], and PPB [49], as well as deep learning-based methods such as SAR-CNN [50], NBR2NBR [27], and B2UB [29]. For the rotated detectors, we selected anchor-free methods such as Rotated RepPoints [51], single-stage methods such as S2A-Net [52], and two-stage methods such as Oriented R-CNN [53] for cross-comparison. We selected the equivalent number of looks (ENL), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM) for their recognized efficacy in similar SAR despeckling tasks; they evaluate homogeneity, reconstruction quality, and structural integrity, respectively. These metrics are complemented by recall and mean average precision (mAP) to assess the overall impact on detection performance. While other metrics such as TCR [55] and EPD-ROA [56] were considered, we prioritized those with a direct, established correlation to despeckling effectiveness and downstream detection capabilities. The experiments were conducted using NVIDIA A6000 GPUs. For the denoising algorithms other than Wake2Wake, we used the same hyperparameters as specified in their respective original papers. For the training process of the Wake2Wake algorithm, the initial learning rate was set to
The comparative analysis of several denoising algorithms on OpenSARWake is presented in Table III. The ENL is frequently employed to quantify the speckle suppression capabilities of various SAR image filters; a higher ENL value indicates more effective smoothing. The traditional methods POTDF and PPB exhibit significantly higher ENL values than the deep learning methods. However, we argue that this does not necessarily enhance the accuracy of ship wake detection. Table IV shows that, despite their higher ENL values, POTDF and PPB decrease the mAP of the detection algorithms, especially for Rotated RepPoints, which is an anchor-free object detector. Nevertheless, both methods improve detection recall under the Type III training strategy. Furthermore, our examination of the ENL values of the deep learning algorithms did not reveal any significant regularity.
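For reference, ENL is conventionally computed over a homogeneous image region as the squared mean of the intensity divided by its variance. The Python sketch below uses this standard definition on toy patches; the function name and the sample values are hypothetical, and the text does not specify which regions were used for Table III.

```python
from statistics import fmean, pvariance


def enl(region):
    """Equivalent number of looks of a homogeneous intensity patch.

    `region` is a 2-D list of intensity values. ENL = mean^2 / variance,
    so a smoother patch (lower variance) yields a higher value.
    """
    pixels = [float(value) for row in region for value in row]
    mean = fmean(pixels)
    variance = pvariance(pixels, mean)
    if variance == 0.0:
        return float("inf")  # perfectly flat patch
    return mean * mean / variance


# Toy patches (hypothetical values): same mean, different speckle strength.
speckled_patch = [[1.0, 3.0], [1.0, 3.0]]
smoothed_patch = [[1.9, 2.1], [1.9, 2.1]]

print(enl(speckled_patch))
print(enl(smoothed_patch))
```

Because the variance sits in the denominator, an oversmoothing filter can drive ENL very high while erasing wake structure, which is consistent with the observation above that high ENL does not guarantee better detection.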
PSNR and SSIM quantify image quality by comparing the peak signal power to the noise power and by evaluating image similarity before and after denoising, respectively. As with ENL, for the conventional methods higher PSNR and SSIM values do not necessarily indicate higher mAP values. In the case of B2UB and Wake2Wake, slightly higher PSNR and SSIM values result in higher recall and mAP values compared to other deep learning methods. However, these methods are constrained by the limitations of their training strategy, as mentioned in Section III-A, which restricts their potential. Our proposed Wake2Wake method surpasses all other denoising algorithms on the detection accuracy of Rotated RepPoints and S2A-Net. This is achieved through an enhanced training strategy and a customized SWA block designed specifically for ship wake features. Notably, our method significantly outperforms the results obtained without the application of a denoising algorithm. Overall, Table IV indicates that the application of traditional denoising algorithms tends to decrease the accuracy of all the listed detection methods. This is primarily because traditional denoising algorithms rely on a fixed approach to image processing, which lacks the capacity to adapt and generalize well. In addition, these algorithms are incapable of acquiring and retaining information about the characteristics present in the data. Furthermore, unless adapted to the task, deep learning-based denoising techniques also reduce the accuracy of the aforementioned three categories of detection algorithms. Overfitting and incomplete feature learning can negatively impact the model’s performance on the detector. In addition, disregarding ship wake features may reduce the effectiveness of the method. Our proposed Wake2Wake algorithm, built upon the research of previous scholars, addresses these issues.
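As a reminder of what PSNR measures, the sketch below implements the standard definition, 10·log10(peak² / MSE), in plain Python. The function name, the 8-bit peak value of 255, and the toy images are assumptions for illustration; the text does not state which image serves as the reference when computing the values in Table III.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    ref = [float(value) for row in reference for value in row]
    est = [float(value) for row in estimate for value in row]
    if len(ref) != len(est):
        raise ValueError("images must contain the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy 2x2 images (hypothetical values) that differ by one gray level in two pixels.
image_a = [[0, 255], [255, 0]]
image_b = [[0, 254], [255, 1]]

print(f"{psnr(image_a, image_b):.2f} dB")
```

PSNR depends only on pixel-wise error, so it cannot distinguish a filter that removes waves from one that blurs wake edges by the same amount. This is one reason the detection metrics are reported alongside it.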
Furthermore, we provide a visual comparison of the denoising and detection results from a test sample image, as depicted in Figs. 7–9. It is evident that the conventional methods for suppressing waves and noise lack stability, as both the POTDF and PPB algorithms exhibit some degree of oversmoothing in Fig. 7. Conversely, the deep learning algorithms demonstrate a certain level of overfitting. In contrast, our Wake2Wake algorithm effectively suppresses waves while preserving wake features, leading to improved accuracy in bounding-box localization, as demonstrated in Fig. 9.
C. Ablation Studies
In this section, we conducted ablation experiments to investigate how the final detection results are influenced by the dataset used to train Wake2Wake and by the convolution operations that the SWA block applies to focus on the two distinct ship wake features. The experiments followed the Type III strategy outlined in Table IV, where the training, validation, and test sets for training the OBB detector were all denoised.
Table V(a) indicates that the inclusion of 180 simulated wake images has a certain impact on the accuracy of detection. The real wake images, in turn, enable Wake2Wake to concentrate on learning and suppressing wave features. We hypothesize that superior results could have been achieved with real wake images exhibiting more diverse sea conditions. Table V(b) outlines the impact of various convolution operations on wave suppression and detection results in our SWA block. Correspondingly, Fig. 10 provides a visual comparison of the results listed in Table V(b). The texture of the sea waves remains distinctly visible in the absence of either FcaConv or DSConv, as depicted in Fig. 10(b) and (g). When FcaConv is employed, the spectral characteristics in Fig. 10(h) indicate a slight increase in wave scale. This can be attributed to the absence of a significant Kelvin transverse wave, with only a divergent wave present on the left side. Consequently, FcaConv makes only a limited contribution for this particular test sample. However, when both FcaConv and DSConv are employed, as depicted in Fig. 10(j), the texture of the waves is effectively suppressed. This indicates that applying the two convolutions together is beneficial, which is consistent with the fact that both types of ship wakes are present in this figure. Furthermore, to visually illustrate the wave components removed in Fig. 10(e), we present in Fig. 11 the wave images suppressed by our proposed SWA block and the corresponding image spectrum. It can be observed that although a small portion of the wakes is suppressed along with the waves, the majority of the removed components are sea clutter.
Conclusion
This study introduces Wake2Wake, a feature-guided self-supervised method for suppressing waves in SAR ship wake detection. Our proposed Wake2Wake utilizes a novel self-supervised training strategy. Instead of training directly on the full-scale dataset, we selected 65 real SAR images with the most prominent wave textures from the dataset and combined them with 180 simulated wake images. The simulated SAR images generated by our previous work exhibit a balanced and ergodic selection of radar parameters, sea conditions, and ship parameters, effectively compensating for the feature singularity of real SAR images in some scenarios. The results indicate that this training strategy produces increased detection accuracy compared to the original method. In addition, Wake2Wake employs a novel SWA block designed to capture the characteristics of turbulent and Kelvin wakes in SAR images. The experimental results demonstrate the effectiveness of this block. Nevertheless, additional studies are required to achieve a tradeoff between wave suppression and wake detection. The next step involves merging the Wake2Wake framework with the OBB detector to create a real end-to-end network, following the approach illustrated in Fig. 1(b).
ACKNOWLEDGMENT
The authors would like to thank the editors who processed this article and the anonymous reviewers for their constructive comments toward improving it. They also express their gratitude to the Alaska Satellite Facility and the European Space Agency for the Sentinel, ALOS PALSAR, and TerraSAR-X data, available at https://search.asf.alaska.edu/#/ and https://earth.esa.int/eogateway/missions/terrasar-x-and-tandem-x.