Friday, July 19, 2024

Wake2Wake: Feature-Guided Self-Supervised Wave Suppression Method for SAR Ship Wake Detection | IEEE Journals & Magazine | IEEE Xplore

C. Xu, Q. Wang, X. Wang, X. Chao and B. Pan, "Wake2Wake: Feature-Guided Self-Supervised Wave Suppression Method for SAR Ship Wake Detection," in IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-14, 2024, Art no. 5108114, doi: 10.1109/TGRS.2024.3422803.

Abstract: Sea clutter and inherent speckle noise in synthetic aperture radar (SAR) images can pose challenges to accurate sea surface target detection, especially for phenomena such as ship wakes reliant on efficient feature extraction. Traditional denoising methods require manual tradeoffs between denoising effects and detail retention. Supervised denoising methods based on deep learning demand a substantial number of real noisy-clean image pairs for training, coupled with specific parameter settings and labeled data amounts. 

In response to these challenges, this article introduces Wake2Wake, a self-supervised denoising method aimed at enhancing the performance of existing deep learning-based ship wake detectors. The method incorporates a novel ship wake awareness (SWA) block designated to address the distinctive features of turbulent and Kelvin wakes. Furthermore, to overcome the source imbalance problem in the dataset, simulated wake data are integrated into the training process. This not only mitigates dataset imbalances but also significantly improves both denoising and detection performance. 

The experimental results indicate that Wake2Wake improves the accuracy of Rotated RepPoints by 3.6 mAP and S2A-Net by 2.6 mAP on the OpenSARWake dataset, respectively. The proposed approach achieves varied extents of improvement, showcasing its potential in mitigating sea clutter and enhancing feature extraction, especially in detecting SAR ship wakes.

keywords: {Marine vehicles;Noise reduction;Radar polarimetry;Task analysis;Noise;Noise measurement;Detectors;Image denoising;self-supervised learning;ship wake detection},
URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10583959&isnumber=10354519

Summary

This article introduces a new method called Wake2Wake for improving synthetic aperture radar (SAR) ship wake detection. Key points:

  1. Wake2Wake is a self-supervised denoising method designed to enhance the performance of existing deep learning-based ship wake detectors.
  2. It incorporates a novel "ship wake awareness" (SWA) block to address the unique features of turbulent and Kelvin wakes in SAR images.
  3. The method uses a mix of real and simulated SAR images for training to overcome dataset imbalances and improve performance.
  4. Wake2Wake improved detection accuracy of existing methods (e.g. Rotated RepPoints, S2A-Net) on the OpenSARWake dataset.
  5. The approach uses a two-stage strategy: first denoising SAR images, then using the denoised images for wake detection.
  6. Experiments showed Wake2Wake outperformed traditional denoising methods and other deep learning approaches for this task.
  7. The authors created a new SAR ship wake dataset called OpenSARWake for training and evaluation.
  8. Ablation studies demonstrated the benefits of using simulated data and the novel SWA block design.
  9. Future work aims to integrate Wake2Wake with detectors in an end-to-end network.

Overall, Wake2Wake addresses challenges in SAR ship wake detection by using self-supervised learning and specialized neural network components to better handle the unique characteristics of ship wakes in SAR imagery. 

Wake2Wake Processing and Performance

Wake2Wake is a self-supervised denoising method designed to enhance SAR ship wake detection. Here's an overview of how it works and its performance:

Processing steps:

  1. Input: Noisy SAR images containing ship wakes
  2. Global Masker: Generates four masked image patches from each input image
  3. Modified U-Net with SWA block: Processes the masked patches
  4. Global Mask Mapper: Applies sampling to the denoised output
  5. Loss calculation: Uses a combination of "re-visible" loss and regularization loss
  6. Output: Denoised SAR image with enhanced wake features

Key components:
1. Ship Wake Awareness (SWA) block:
   - Combines Dynamic Snake Convolution (DSConv) to capture turbulent wake features
   - Uses Frequency Channel Attention Convolution (FcaConv) for Kelvin wake characteristics
   - Designed to suppress background waves while enhancing wake visibility

2. Training strategy:
   - Uses a mix of 65 real SAR images with prominent wave textures
   - Incorporates 180 simulated SAR wake images to balance the dataset

Performance compared to other approaches:

  1. Detection accuracy:
     - Improved Rotated RepPoints accuracy by 3.6 mAP
     - Improved S2A-Net accuracy by 2.6 mAP on the OpenSARWake dataset
  2. Denoising metrics:
     - Achieved higher PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index) compared to other deep learning methods
  3. Comparison to traditional methods:
     - Outperformed conventional denoising algorithms like LMMSE-Wavelet, SAR-BM3D, FANS, POTDF, and PPB
     - Traditional methods often decreased detection accuracy, while Wake2Wake improved it
  4. Comparison to other deep learning approaches:
     - Surpassed performance of methods like SAR-CNN, NBR2NBR, and B2UB
     - Better preserved wake features while effectively suppressing waves and noise
  5. Visual results:
     - Demonstrated superior wave suppression and wake preservation compared to other methods
     - Improved bounding box localization in detection tasks
  6. Generalization:
     - Performed well across different SAR frequency bands (X, C, and L)
     - Effective on both real and simulated SAR images


Overall, Wake2Wake showed significant improvements in both denoising quality and subsequent detection accuracy compared to existing approaches, particularly in its ability to preserve important wake features while suppressing background noise and waves.

Artifacts

The paper mentions several resources that could potentially allow for independent validation of the results:

1. OpenSARWake Dataset:
   The authors created and used a new dataset called OpenSARWake for training and evaluation. They state that this dataset can be accessed at:
   https://github.com/libzzluo/OpenSARWake

   This dataset includes:
   - 653 C-band images
   - 299 X-band images
   - 2221 L-band images
   - 231 images containing Kelvin wakes
   - 2841 images containing turbulent wakes

2. Data Sources:
   The SAR images in the OpenSARWake dataset were collected from:
   - https://search.asf.alaska.edu/
   - https://earth.esa.int/eogateway/catalog/

3. AIS Data:
   For validating annotations of ship wakes acquired by ALOS-PALSAR and Sentinel-1A near Denmark, they used AIS data from:
   https://web.ais.dk/aisdata/

4. Simulated Data:
   The authors used 180 simulated SAR ship wake images. The simulator used for this task can be found at:
   https://github.com/SYSUSARSimu/KWFullLink_SARSim

5. Satellite Data Products:
   - C-Band Sentinel: GRD (Ground Range Detected) products
   - L-Band ALOS PALSAR: H2.2-level products
   - X-Band TerraSAR-X: Primarily SLC (Single Look Complex) products

6. Data Processing:
   The authors mention using SNAP software to process the data and generate geocoded ellipsoid-corrected (GEC) products.

While the paper doesn't explicitly mention code availability for the Wake2Wake method itself, the availability of the dataset and simulators could allow for independent validation of the results and comparison with other methods.

It's worth noting that while these resources are mentioned in the paper, their current availability and accessibility may vary. Researchers interested in reproducing the results or building upon this work would likely need to contact the authors for the most up-to-date information on accessing these resources.

Figures

Here's a list of the figures mentioned in the paper, with their purpose and interpretation:

Figure 1:
- Shows three different methods for SAR ship wake detection
- (a) Direct detection on noisy SAR image
- (b) Simultaneous denoising and detection in a single network
- (c) Two-stage approach with separate denoising and detection (the approach used in this paper)
Interpretation: Illustrates the evolution of approaches, with the authors choosing the two-stage method for its flexibility and potential for improvement.

Figure 2:
- (a) Architecture of the proposed SWA (Ship Wake Awareness) block

1. Input Feature Map: The initial input to the SWA block.

2. Dynamic Snake Convolution (DSConv):
   - This is applied to extract features of the turbulence wake.
   - DSConv uses a flexible grid structure that can adapt to linear features like turbulent wakes.

3. Frequency Channel Attention Convolution (FcaConv):
   - Applied after DSConv to capture Kelvin wake characteristics.
   - It uses Discrete Cosine Transform (DCT) to analyze frequency components.

4. Concatenation: The outputs from DSConv and FcaConv are combined.

5. 1x1 Convolution: Applied to reduce the channel dimension of the concatenated features.

6. Output Feature Map: The final output of the SWA block, now enhanced with wake-specific features.


- (b) Feature-guided self-supervised denoising training and inference process
Interpretation: Provides a detailed view of the novel components in the Wake2Wake method, showing how it processes SAR images to enhance wake features.

  1. Input: A noisy SAR image y.
  2. Global Masker θ(·):
     - Divides the input into 2x2 cells.
     - Creates four masked versions of the input (θ00I, θ01I, θ10I, θ11I).
  3. Stack: The four masked images are stacked into a single tensor θy.
  4. Modified U-Net with SWA blocks (F_θ):
     - This is the main denoising network.
     - It includes the SWA blocks to focus on wake features.
     - Processes the stacked masked images.
  5. Global Mask Mapper M(·):
     - Applied to the output of the U-Net.
     - Maps the denoised result back to the original image structure.
  6. Loss Calculation:
     - L_rev: A "re-visible" loss comparing the mapped output to the input.
     - L_reg: A regularization loss.
     - The total loss is a combination of these two components.
  7. Output: The final denoised SAR image.

The framework operates in a self-supervised manner, meaning it doesn't require clean ground truth images for training. Instead, it learns to denoise by reconstructing masked portions of the input image.
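The masking and mapping steps listed above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names `global_masker` and `global_mask_mapper`, the zero fill value for masked pixels, and the single-channel 2-D input are all assumptions made here for clarity.

```python
import numpy as np


def global_masker(y, cell=2, fill=0.0):
    """Return cell*cell masked copies of y, stacked along a new first axis.

    Copy k hides one fixed position inside every cell x cell block.
    The fill value is an assumption; other fill schemes are possible.
    """
    masked = []
    for di in range(cell):
        for dj in range(cell):
            m = y.copy()
            m[di::cell, dj::cell] = fill  # hide position (di, dj) of every cell
            masked.append(m)
    return np.stack(masked)  # shape: (cell*cell, H, W)


def global_mask_mapper(outputs, cell=2):
    """Assemble one image by reading, from each slice, only the pixels
    that were hidden in the corresponding masked input."""
    _, h, w = outputs.shape
    mapped = np.empty((h, w), dtype=outputs.dtype)
    k = 0
    for di in range(cell):
        for dj in range(cell):
            mapped[di::cell, dj::cell] = outputs[k, di::cell, dj::cell]
            k += 1
    return mapped


y = np.arange(1.0, 17.0).reshape(4, 4)  # toy "noisy image"
stack = global_masker(y)                # four masked versions
# Stand-in for the network: pretend every slice was restored perfectly.
restored = global_mask_mapper(np.stack([y] * 4))
```

With the perfect-restoration stand-in, the mapper returns the input unchanged, which shows that every output pixel is taken from the slice in which that pixel was hidden from the network.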

During inference (testing), the process is simplified:
1. The noisy SAR image is input directly to the trained U-Net with SWA blocks.
2. The network produces the denoised output without the masking and mapping steps.

This framework is designed to effectively suppress sea clutter and noise while preserving and enhancing ship wake features, which is crucial for subsequent wake detection tasks.

Figure 3:
- Comparison of Lipschitz constants for different attention modules
Interpretation: Shows the theoretical stability of various attention mechanisms, with the size of circles indicating computational complexity (FLOPS). A smaller Lipschitz constant suggests more stable training.
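
To make the idea of a Lipschitz upper bound concrete, the sketch below bounds a small fully connected ReLU network by the product of its layers' spectral norms. This is a standard bound used here purely for illustration. The function names and toy weights are assumptions, and it is not necessarily the exact procedure behind Figure 3.

```python
import numpy as np


def lipschitz_upper_bound(weights):
    """Product of layer spectral norms: an upper bound on the Lipschitz
    constant of a network whose activations are 1-Lipschitz (e.g. ReLU)."""
    return float(np.prod([np.linalg.norm(w, ord=2) for w in weights]))


def relu_mlp(x, weights):
    """Toy network: ReLU after every layer except the last."""
    for w in weights[:-1]:
        x = np.maximum(w @ x, 0.0)
    return weights[-1] @ x


# Hypothetical toy weights; their spectral norms are 2.0 and 3.0.
weights = [np.diag([2.0, 1.0]), np.diag([0.5, 3.0])]
bound = lipschitz_upper_bound(weights)

# Empirical check on one pair of inputs: the output gap never exceeds
# bound * input gap.
a = np.array([1.0, 2.0])
b = np.array([0.5, -1.0])
gap_out = np.linalg.norm(relu_mlp(a, weights) - relu_mlp(b, weights))
gap_in = np.linalg.norm(a - b)
```

A smaller bound means a given input perturbation can change the output by less, which is the sense in which a block with a lower Lipschitz constant is considered more stable.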

Figure 4:
- Implementation detail of the Global Mask θ(·) and Global Mask Mapper M(·)
Interpretation: Illustrates the masking strategy used in the self-supervised learning process, showing how 2x2 mask cells are applied to the input image.

Figure 5:
- (a) Geographic location and temporal distribution of the OpenSARWake dataset
- (b)-(e) Statistical distribution of instances in the dataset
Interpretation: Provides an overview of the dataset used, showing its global coverage and diversity in terms of image characteristics and wake types.

Figure 6:
- Effects of datasets' samples on manifold distributions
- (a) OpenSARWake dataset
- (b) Proposed multisource dataset
Interpretation: Demonstrates how the addition of simulated data enhances the feature space coverage, potentially improving the model's generalization ability.

Figures 7-9:
- Visual comparisons of denoising results and detection outcomes for different methods
Interpretation: These figures showcase the superior performance of Wake2Wake in preserving wake features while suppressing noise and waves, leading to improved detection accuracy.

Figure 10:
- Comparison of the effects of FcaConv and DSConv used in the SWA block
Interpretation: Illustrates how different components of the SWA block contribute to enhancing different types of wake features.

Figure 11:
- Sea clutter components removed by the SWA block
Interpretation: Shows the effectiveness of the SWA block in isolating and removing sea clutter, which helps in enhancing the visibility of ship wakes.

Overall, these figures provide a comprehensive view of the Wake2Wake method, from its architectural design to its performance on real SAR images, demonstrating its effectiveness in improving ship wake detection.

Authors

Based on the information provided in the article, here are the details about the authors, their associated institutions, and some related work:

Authors and Institutions:

1. Chengji Xu
   - Member, IEEE
   - Affiliated with the School of Systems Science and Engineering, Sun Yat-sen University, Guangzhou, China

2. Qingsong Wang (Corresponding author)
   - Affiliated with the School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen, China

3. Xiaoqing Wang (Corresponding author)
   - Affiliated with:
     a) School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen, China
     b) Peng Cheng Laboratory, Shenzhen, China

4. Xiaopeng Chao
   - Affiliated with the School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen, China

5. Bo Pan
   - Affiliated with the School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen, China

Related Work:

The paper cites several related works that provide context for their research:

1. Previous SAR ship wake datasets:
   - Ding et al. [8] collected 261 SAR images with ship wakes
   - Del Prete et al. [22] collected 485 SAR images with ship wakes

2. Self-supervised denoising methods:
   - Yuan et al. [12] used a semantic-aware self-supervised denoising method for SAR image segmentation
   - Li et al. [23] integrated dehazing and detection tasks into a unified network

3. SAR image denoising algorithms:
   - SAR-BM3D [24]
   - LMMSE-Wavelet [25]
   - POTDF [26]

4. Ship wake detection methods:
   - Xue et al. [32] used frequency channel attention convolution (FcaConv) for optical Kelvin wake detection

5. SAR image simulation:
   - The authors reference their own previous work on SAR image simulation [41], [42]

6. Object detection frameworks adapted for SAR images:
   - Rotated RepPoints [51]
   - S2A-Net [52]
   - Oriented R-CNN [53]

The authors have also published related work, including:
- A paper on the OpenSARWake dataset [34]
- Research on ship turbulent wake simulation [41]
- Work on wave spectrum retrieval methods [42]

This research was supported in part by the National Natural Science Foundation of China, the Xiaomi Young Talents Program, and another project grant.

SECTION I. Introduction

Synthetic aperture radar (SAR) is highly valuable for observing oceanic phenomena [1], detecting ships [2], [3], [4], [5], [6], [7], and identifying wakes [8], [9], [10], [11] due to its capabilities of all-weather, all-day observation and high-resolution imaging. Owing to the coherence principle underlying the SAR imaging system, the generation of speckle noise is inevitable [12]. Furthermore, ocean incidents such as huge wind waves and surges on the sea surface significantly impact the image quality, affecting the subsequent process of extracting features and the analysis performance of downstream tasks. SAR ship wake detection faces challenges in achieving optimal model accuracy due to complex and unpredictable environmental disruptions, such as background sea clutter and adverse weather conditions such as strong winds and high waves. To improve the accuracy of SAR ship wake detection, it is crucial to explore techniques for removing noise from SAR images or reducing interference from sea waves.

As a derivative research branch of SAR ship detection, the study of wake detection algorithms helps estimate the ship’s speed [13], [14], route [15], or hull information [16]. Traditional algorithms for detecting ship wakes in SAR images rely on techniques such as Radon transform [17], constant false alarm rate (CFAR) detection [18], and their variations. Other algorithms utilize sparse regularization [19] or decomposition [20], as well as morphological analysis, to distinguish the wake structure from the waves individually. In [21], dictionary modeling is employed in morphological analysis to separate the wake structure from the waves. Nevertheless, these techniques, primarily reliant on linear structural characteristics, are prone to misinterpretation due to their resemblance to other comparable formations in SAR images, such as small-scale internal waves and fronts.

The availability of several labeled SAR wake datasets has led to advancements in end-to-end convolutional neural network (CNN) techniques. These advancements have resulted in enhanced efficiency and accuracy in wake detection. For instance, Ding et al. [8] and Del Prete et al. [22] collected 261 and 485 SAR images with ship wakes that were carefully labeled. They compared the effectiveness of the respective detection algorithms using these images. However, the algorithms mentioned above need to extract information regarding the characteristics of the wake directly from SAR images, which inherently contain speckle noise and waves. In addition, their aim is to restore the wake area as accurately as possible to establish precise bounding boxes. The task possesses a certain level of complexity without necessitating additional interventions.

In a previous study, Yuan et al. [12] employed a jointly optimized semantic-aware self-supervised denoising method to enhance the performance of the segmentation task. Similarly, Li et al. [23] integrated dehazing and detection tasks into a unified network, introducing a union architecture and a novel robust loss function. Nevertheless, limited research has been conducted to evaluate the effectiveness of enhanced SAR images following restoration processes (such as denoising or wave suppression) on downstream tasks. Building on [23], we hypothesize that the denoising process may impact the detection results of ship wakes.

To address the disparity between high-level SAR ship wake detection and low-level image denoising, we will examine the framework’s overall design and the choice of denoising methods. Fig. 1(a) illustrates the process of delivering the unprocessed noisy image to the detector for training and testing. This approach is often used in prior SAR detection algorithms, and there is significant potential for improvement. Fig. 1(b) illustrates a type of jointly optimized framework, represented by Yuan et al. [12] and Li et al. [23]. This approach enhances the performance of image restoration and downstream tasks simultaneously. However, it necessitates a distinct design of the loss function or adding a new module to establish a connection between the two tasks. The third approach, as depicted in Fig. 1(c), employs a two-stage procedure where the detection task and image denoising are performed independently. The benefit of this approach lies in the fact that the refined restoration algorithm can enhance the performance of the detection task to a certain degree. In addition, there are few denoising algorithms available for SAR ship wake. Given different detection algorithms, we will employ a two-stage strategy to design the framework, facilitating subsequent experimental validation. When choosing denoising algorithms, gathering enough clean-noisy image pairs for supervised algorithms that meet the criteria for real SAR wake images is challenging. Therefore, we will explore using self-supervised denoising algorithms that rely on deep learning. These algorithms offer superior generalization capabilities and end-to-end integration compared to conventional methods, such as SAR-BM3D [24], LMMSE-Wavelet [25], and POTDF [26].

Fig. 1. - Three different methods for SAR ship wake detection, where OBB denotes the oriented bounding box, a type of bounding box that is aligned with the object’s orientation. (a) Detection model is trained directly on the noisy SAR image, and the results are outputted. (b) SAR image denoising and wake detection tasks are performed simultaneously in a single network, and the detection results with the denoising effect are output directly. (c) Denoising model processes the noisy SAR image and then inputs the detector to train and output the results.

To this end, we propose a self-supervised denoising framework named Wake2Wake. This framework can improve the effectiveness of current mainstream-oriented SAR ship wake detectors. To effectively exploit the Kelvin wake and turbulence wake characteristics in SAR images, we have developed a ship wake awareness (SWA) block that concentrates on these two distinct wakes. The module aims to eliminate background waves and noise while enhancing the visibility of the wakes as foreground elements. As a result, we have implemented a more efficient data integration strategy to train Wake2Wake. This strategy involves combining real SAR wake data with simulated data, resulting in improved training efficiency and denoising effectiveness to some extent. These simulated SAR images effectively enhance the singularity and repetitiveness of ship wakes in certain real scenarios. They are well-balanced in terms of radar parameters, sea state parameters, and ship parameters compared to real images. Multiple extensive comparisons and ablation experiments provide evidence that our suggested technology can enhance the performance of current SAR wake detectors to a certain degree. The SAR ship wake dataset that we collected, including rotation detection labels, can be accessed at (https://github.com/libzzluo/OpenSARWake).

SECTION II. Proposed Method

This section introduces a feature-guided self-supervised wave suppression module named Wake2Wake. The method takes SAR images containing waves or speckle noise as input and aims to produce output images with minimal waves or noise. Wake2Wake can be integrated with various oriented detectors to create a comprehensive framework, resulting in enhanced performance for wake detection using deep learning. This section will provide explicit details.

A. Overall Network Overview

Given that our primary objective is to enhance the performance of current SAR ship wake detectors, we do not allocate effort to create noisy-clean image pairs for supervised denoising learning. Instead, we employ unsupervised denoising methods. Current mainstream unsupervised denoising algorithms primarily utilize pixel masking strategies on input images. This approach requires the network to reconstruct the masked pixels using information from neighboring pixels. Representative examples of such algorithms are NBR2NBR [27] and N2V [28], effectively alleviating the reliance on identity mapping. Nevertheless, these blindspot schemes also suffer from the disadvantage of information loss, as the network cannot perceive the information from the masked pixels. For these reasons, we decided to employ the B2UB [29] network. This network utilizes the original noisy image more efficiently compared to the NBR2NBR and N2V approaches. In addition, it can utilize information from all input pixel points, theoretically preventing any loss of information.

Nevertheless, the level of noise pollution in SAR images surpasses that in optical images. When B2UB is trained exclusively on SAR ship wake images without modification, the denoising results are strikingly inadequate, as demonstrated by our research. The complexity of the sea clutter and noise distribution in SAR ship wake images makes it challenging to perform direct self-supervised denoising of the entire dataset. However, we drew insights from the advancements in the SAR image task by Yuan et al. [12]. They argue that the application of max pooling in the U-Net architecture of the original self-supervised algorithm is suboptimal, as it tends to generate speckle noise residues in the low-frequency area of the SAR image. This aspect is crucial to consider while addressing ship wake characteristics. Thus, we first modify the training strategy of the self-supervised denoising network and enhance all convolution operations in the original U-Net structure. We also provide an SWA block concentrating on Kelvin wake and turbulent wake characteristics. We refer to this enhanced framework as Wake2Wake.
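
A small numerical experiment illustrates why the choice of pooling matters under multiplicative speckle. The sketch below is an illustration under stated assumptions, not something taken from the paper: it uses unit-mean gamma speckle with 4 looks on a constant background, and `pool2x2` is a hypothetical helper.

```python
import numpy as np


def pool2x2(img, reduce_fn):
    """Non-overlapping 2x2 pooling with a NumPy reduction (np.mean, np.max)."""
    h, w = img.shape
    blocks = img[: h - h % 2, : w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return reduce_fn(blocks, axis=(1, 3))


rng = np.random.default_rng(0)
clean = np.full((64, 64), 10.0)  # flat sea patch (assumed)
# Multiplicative speckle with mean 1 (gamma model with 4 looks, assumed).
speckle = rng.gamma(shape=4.0, scale=0.25, size=clean.shape)
noisy = clean * speckle

avg_pooled = pool2x2(noisy, np.mean)  # keeps the local mean level
max_pooled = pool2x2(noisy, np.max)   # keeps the brightest speckle sample

print("input mean:     ", noisy.mean())
print("avg-pooled mean:", avg_pooled.mean())
print("max-pooled mean:", max_pooled.mean())
```

Average pooling reproduces the mean brightness of the patch, whereas max pooling always selects the strongest speckle sample in each block and so shifts a homogeneous region upward. This is consistent with the motivation for replacing max pooling when denoising low-frequency areas.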

After the denoised SAR ship wake images are produced by the Wake2Wake network, they are utilized as input for training and testing the ship wake detector. In Section III-B, we analyze the performance variations of Wake2Wake when it is employed on the target detection dataset in the training/validation set, the testing set, and the entire dataset, respectively. In Sections II-B and II-C, we provide the underlying construction principle of the SWA block and a more comprehensive explanation of the precise algorithmic flow of Wake2Wake. To enhance comprehension, we present the pseudo-code for the entire framework flow in Algorithm 1. In Section II-B, we will introduce methods to enhance the original U-Net model, addressing SAR ship wake feature extraction requirements more effectively.

Algorithm 1 - Training and Inference Strategy of the Proposed Wake2Wake Denoising Framework and SAR Ship Wake Detector

B. SWA Block

In the broader realm of computer vision, in-context learning [30] has shown notable success in tasks such as image denoising, image enhancement, and image deduplication. Notably, these achievements have been realized without the need for parameter tuning. It seems that refining the structure of CNN does not result in significant advancements in these general domains. However, given the distinct features of the ship wake, a comprehensive solution has yet to emerge in the realm of SAR ship wake recognition. Therefore, we enhance the U-Net architecture utilized in the original B2UB network by considering the unique characteristics of SAR ship wakes. We introduce an SWA block to selectively replace the original convolution operation. Subsequently, we evaluate its Lipschitz constant [31] in comparison to other commonly used attention blocks to assess its generalization capability.

Typically, ship wakes frequently appearing in SAR images consist mainly of turbulent wakes and Kelvin wakes. Our focus is on describing these two types of wakes. Xue et al. [32] employed the frequency channel attention convolution (FcaConv) [33] to investigate the characteristics of Kelvin wakes in optical ship wakes, examining both their image and frequency domains. The experimental results provided evidence that the module effectively captures the optical Kelvin wakes. However, when using the FcaConv with our proposed OpenSARWake [34], we observe that selecting only the discrete cosine transformation (DCT) bases of the top 32 leads to a minor improvement in detection accuracy. Conversely, selecting other frequency components results in a decrease in the original detection accuracy. The absence of Kelvin wakes in SAR images is a common occurrence, with only a turbulent wake, referred to as a black streak, being typically evident. Furthermore, the presence of sea clutter in optical images is usually represented by a Gaussian distribution. However, sea clutter in SAR images is considerably more complicated, posing challenges in differentiating low-frequency ship wakes from the sea clutter. Hence, we opted for an 8× DCT basis to rescale the features of the Kelvin wake.

In addition, recognizing the imperative need for a mechanism that focuses on the black streak-like turbulent wake, we draw inspiration from the successful outcomes achieved by applying elongated tubular convolution structures in the analysis of the digital retinal images for vessel extraction (DRIVE) retina dataset [35] and the Massachusetts Roads dataset [36]. These datasets exemplify the efficacy of such structures in identifying and analyzing linear features across diverse applications, from retinal vessel segmentation to road detection in urban and rural areas. Qi et al. [37] propose employing the dynamic snake convolution (DSConv) method to enhance the U-Net structure’s ability to specifically identify the structural characteristics of the turbulent wake with a similar elongated streak.

The ultimate architecture of our proposed SWA block is depicted in Fig. 2(a). The modified U-Net structure benefits from the combined application of DSConv and the FcaConv, enabling it to effectively capture the characteristics of both the turbulent wake and the Kelvin wake, unlike the conventional conv block. Initially, the input image undergoes a DSConv operation to extract the features of the turbulence wake. Unlike standard convolution, the position of each grid in DSConv is not a regular square. Taking the x-direction as an example, we represent the position of each grid in DSConv as $K_{i\pm c} = (x_{i\pm c}, y_{i\pm c})$, where $c = \{0, 1, 2, 3, 4\}$ denotes the horizontal position distance of each grid from the center grid. The location selection of the following grid from the center grid is a cumulative operation. Therefore, the entire convolutional grid along the x-direction can be represented as

$$K_{i\pm c} = \begin{cases} (x_{i+c}, y_{i+c}) = \left(x_i + c,\; y_i + \sum_{i}^{i+c} \Delta y\right) \\ (x_{i-c}, y_{i-c}) = \left(x_i - c,\; y_i + \sum_{i-c}^{i} \Delta y\right). \end{cases} \tag{1}$$

Fig. 2. - Illustration of our proposed Wake2Wake framework. (a) Architecture of our proposed SWA block. (b) Feature-guided self-supervised denoising training and inference process. The details regarding the structure mentioned in this figure will be provided in Sections II-B and II-C, correspondingly.

The convolution grid along the y-direction can be represented as

$$K_{j\pm c} = \begin{cases} (x_{j+c}, y_{j+c}) = \left(x_j + \sum_{j}^{j+c} \Delta x,\; y_j + c\right) \\ (x_{j-c}, y_{j-c}) = \left(x_j + \sum_{j-c}^{j} \Delta x,\; y_j - c\right). \end{cases} \tag{2}$$
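
The cumulative construction of the x-direction grid can be illustrated with a short sketch. It is a hedged reading of the description above, not the authors' or the DSConv authors' code: the helper name `snake_grid_x` is hypothetical, the offsets are hand-picked rather than learned, and the center grid is assumed to carry no offset of its own.

```python
import numpy as np


def snake_grid_x(center, dy_right, dy_left):
    """Sampling positions K_{i-c} ... K_i ... K_{i+c} along the x-direction.

    center   : (x_i, y_i) of the center grid
    dy_right : vertical offsets for steps c = 1..C to the right
    dy_left  : vertical offsets for steps c = 1..C to the left
    Each position accumulates every offset between it and the center.
    """
    xi, yi = center
    right = [(xi + c, yi + float(s))
             for c, s in enumerate(np.cumsum(dy_right), start=1)]
    left = [(xi - c, yi + float(s))
            for c, s in enumerate(np.cumsum(dy_left), start=1)]
    return left[::-1] + [(xi, yi)] + right


# Hypothetical offsets (in a real layer they would be predicted per pixel).
grid = snake_grid_x((10, 5), dy_right=[0.5, 0.5, -0.25, 0.0],
                    dy_left=[0.25, 0.25, 0.0, 0.0])
for point in grid:
    print(point)
```

Because each offset is added to the running sum of the previous ones, neighboring sampling points can never jump far apart, so the kernel bends smoothly along an elongated streak. The y-direction grid follows the same pattern with the roles of x and y exchanged.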

Typically, $\Delta x$ and $\Delta y$ are fractional, which necessitates bilinear interpolation

$$K = \sum_{K'} I(K', K) \cdot K' \tag{3}$$

where $K$ denotes the fractional location in (1) and (2), $I$ is the bilinear interpolation kernel, and $K'$ enumerates all the integer spatial locations. $I$ is separated into two 1-D kernels, namely,

$$I(K, K') = I(K_x, K'_x) \cdot I(K_y, K'_y). \tag{4}$$
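
As a rough illustration of (1)–(4), the following NumPy sketch builds the x-direction sampling locations by accumulating vertical offsets outward from the center grid, and then reads a feature map at a fractional location with a separable bilinear kernel. The function names, the `reach` parameter, and the convention that the offset at the center grid is ignored are assumptions made for this example; they are not taken from the DSConv implementation in [37].

```python
import numpy as np

def snake_coords_x(xi, yi, dy, reach=4):
    """Sampling locations of an x-direction snake kernel centered at (xi, yi).

    dy holds one learned vertical offset per grid, ordered c = -reach..+reach.
    Assumption: the entry at the center (c = 0) is ignored.
    """
    dy = np.asarray(dy, dtype=float)
    assert dy.shape == (2 * reach + 1,)
    xs = xi + np.arange(-reach, reach + 1, dtype=float)
    ys = np.full(2 * reach + 1, float(yi))
    # Right half: accumulate offsets moving away from the center.
    ys[reach + 1:] += np.cumsum(dy[reach + 1:])
    # Left half: accumulate in the opposite direction, again from the center.
    ys[:reach] += np.cumsum(dy[:reach][::-1])[::-1]
    return xs, ys

def bilinear_sample(feat, x, y):
    """Value of feat (indexed [row, col]) at fractional (x, y) using the
    separable kernel g(a, b) = max(0, 1 - |a - b|) in each direction."""
    H, W = feat.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    value = 0.0
    for qy in (y0, y0 + 1):
        for qx in (x0, x0 + 1):
            if 0 <= qy < H and 0 <= qx < W:
                w = max(0.0, 1 - abs(x - qx)) * max(0.0, 1 - abs(y - qy))
                value += w * feat[qy, qx]
    return value

xs, ys = snake_coords_x(xi=10, yi=20, dy=[0.5] * 9)
feat = np.tile(np.arange(8.0)[:, None], (1, 8))  # feat[y, x] = y
sample = bilinear_sample(feat, x=3.0, y=2.5)
```

With a constant offset of 0.5, the kernel drifts by half a pixel per step on both sides of the center, which is the cumulative behavior that lets the grid follow a thin, curved streak.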

Subsequently, the feature map after DSConv, denoted as $X \in \mathbb{R}^{H \times W \times C}$, is spectrally filtered by the discrete cosine transform (DCT). It is split into $n$ groups along the channel dimension, with $C/n$ channels in each group (where $C$ is divisible by $n$). The output of the DCT for the different frequencies is

$$\mathrm{freq}^{k} = \mathrm{DCT}(X^{k}) = \sum_{i=0}^{H-1} \sum_{j=0}^{W-1} B^{k}_{i,j} X^{k}_{i,j,:} \tag{5}$$

where $H$ and $W$ represent the size of the feature map, $X^{k}$ and $X^{k}_{i,j,:}$ represent the $k$th group of input feature maps and the element of $X^{k}$ at position $(i, j)$, respectively, and $B^{k}_{i,j}$ is the selected DCT basis for the $k$th group. After obtaining the 2-D DCT outputs, the feature map $X$ can be rescaled with

$$\mathrm{weight} = \mathrm{sigmoid}\left(\mathrm{fc}\left(\mathrm{concat}\left([\mathrm{freq}^{0}, \mathrm{freq}^{1}, \ldots, \mathrm{freq}^{n-1}]\right)\right)\right). \tag{6}$$

Finally, $X$ is multiplied with the weight through the Hadamard product to get the final guided feature map

$$\tilde{X} = \mathrm{weight} \odot X. \tag{7}$$
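
The frequency-guided rescaling in (5)–(7) can be sketched in NumPy as follows. This is only an illustrative sketch: the single matrix `fc_weight` stands in for the fully connected layer, and the frequency pairs passed in `freqs` are hypothetical choices, since neither is specified in this section.

```python
import numpy as np

def dct_basis(H, W, u, v):
    """2-D DCT basis function for the frequency pair (u, v) on an H x W grid."""
    i = np.arange(H)[:, None]
    j = np.arange(W)[None, :]
    return np.cos(np.pi * u * (i + 0.5) / H) * np.cos(np.pi * v * (j + 0.5) / W)

def fca_rescale(X, freqs, fc_weight):
    """Rescale X (H x W x C) with channel weights derived from DCT responses.

    freqs: one (u, v) pair per channel group (hypothetical choices).
    fc_weight: C x C matrix standing in for the fc layer (an assumption).
    """
    H, W, C = X.shape
    n = len(freqs)
    assert C % n == 0, "C must be divisible by the number of groups"
    groups = np.split(X, n, axis=2)  # n groups of C/n channels each
    freq = np.concatenate([
        np.einsum("ij,ijc->c", dct_basis(H, W, u, v), Xk)  # Eq. (5)
        for (u, v), Xk in zip(freqs, groups)
    ])
    weight = 1.0 / (1.0 + np.exp(-(fc_weight @ freq)))  # Eq. (6)
    return weight * X, weight  # Eq. (7), broadcast over H and W

rng = np.random.default_rng(0)
X = rng.random((4, 4, 8))
X_tilde, weight = fca_rescale(X, freqs=[(0, 0), (0, 1)], fc_weight=np.eye(8))
```

For the pair (0, 0) the basis is constant, so that group reduces to a global sum over the feature map; other pairs respond to periodic structure.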

In our practical experiments, we replaced all the plain convolutions in the original U-Net with SWA_block, and the final improved U-Net structure is shown in Table I.

TABLE I Well-Designed U-Net Network Architecture in Our Experiments. In Order to Address the Speckle Noise Residuals in the Low-Frequency Area of the SAR Image During Denoising, We Substituted the Original Maxpool With Avgpool. The Structure of the Swa_Block Is Further Explained in Section II-B

To quantitatively analyze the robustness of our suggested SWA_block, we computed its Lipschitz constant and compared it with other common attention blocks. While accurately determining the Lipschitz constant of a multilayer network is a nondeterministic polynomial-time hard (NP-hard) problem, we can make an approximation by calculating an upper bound on the Lipschitz constant of the network. For instance, a convolutional block in a typical ResNet can be represented as

$$f(x) = W^{1}\phi(W^{0}x + b^{0}) + b^{1} \tag{8}$$

where $x$ is the input to this block, $W^{0}$, $b^{0}$ and $W^{1}$, $b^{1}$ are the weight matrices and bias vectors of the two neighboring convolutional layers, respectively, and $\phi$ represents the concatenation of activation functions, $\phi(x) = [\varphi(x_1), \ldots, \varphi(x_n)]^{\top}$.

For all $x, y \in \mathbb{R}^{n}$, letting $L$ be the Lipschitz constant of the block, its Lipschitz condition is expressed as

$$\|f(x) - f(y)\|_2 \le L \|x - y\|_2. \tag{9}$$

Squaring both sides of the inequality in (9) gives

$$(f(x) - f(y))^{\top}(f(x) - f(y)) \le L^{2}(x - y)^{\top}(x - y). \tag{10}$$

According to the slope-restricted nonlinearity definition in [38], the function $\varphi$ is slope-restricted on $[\alpha, \beta]$, where $0 \le \alpha < \beta < \infty$, if

$$\alpha \le \frac{\varphi(y) - \varphi(x)}{y - x} \le \beta \quad \forall x, y \in \mathbb{R}. \tag{11}$$

To understand the connection between incremental quadratic constraints and slope-restricted nonlinearity, (11) can equivalently be expressed as the following single inequality:

$$\left(\frac{\varphi(y) - \varphi(x)}{y - x} - \alpha\right)\left(\frac{\varphi(y) - \varphi(x)}{y - x} - \beta\right) \le 0. \tag{12}$$

By multiplying by $-2(y - x)^{2}$ and rearranging, we can express (12) in the following form:

$$\begin{bmatrix} x - y \\ \varphi(x) - \varphi(y) \end{bmatrix}^{\top} \begin{bmatrix} -2\alpha\beta & \alpha + \beta \\ \alpha + \beta & -2 \end{bmatrix} \begin{bmatrix} x - y \\ \varphi(x) - \varphi(y) \end{bmatrix} \ge 0. \tag{13}$$

Suppose there exist $L > 0$ and $T \in \mathcal{T}_n$, where the definition of $\mathcal{T}_n$ can be found in [38], such that the following symmetric matrix inequality holds:

$$M(L, T) := \begin{bmatrix} -2\alpha\beta W^{0\top} T W^{0} - L^{2} I_n & (\alpha + \beta) W^{0\top} T \\ (\alpha + \beta) T W^{0} & -2T + W^{1\top} W^{1} \end{bmatrix} \preceq 0 \tag{14}$$

where $I_n$ is the $n$-dimensional identity matrix. Since (13) is always satisfied for a slope-restricted $\varphi$, $M(L, T) \preceq 0$ implies that the ResBottleneck satisfies the Lipschitz condition (9) with $L$ as the Lipschitz constant. The NP-hard problem of computing the exact constant can therefore be relaxed into the following semidefinite programming (SDP) problem, which is linear in $L^{2}$ and $T$:

$$\min_{L^{2},\, T} \; L^{2} \quad \text{s.t.} \quad M(L, T) \preceq 0 \;\text{ and }\; T \in \mathcal{T}_n. \tag{15}$$
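
To make the matrix inequality concrete, the sketch below assembles the symmetric block matrix for a small two-layer ReLU block (slope-restricted on $[0, 1]$) and checks negative semidefiniteness through its largest eigenvalue. It does not solve the SDP: it only verifies that one hand-picked pair, the product of the spectral norms for $L$ and a scaled identity for $T$, is feasible, which means the SDP optimum can be no larger than that naive bound. The random weights, the choice of $T$, and the function names are assumptions made for this example and are unrelated to the LipSDP code of [38].

```python
import numpy as np

def lmi_matrix(W0, W1, L, T, alpha=0.0, beta=1.0):
    """Block matrix M(L, T) for f(x) = W1 phi(W0 x + b0) + b1."""
    n_in = W0.shape[1]
    top_left = -2 * alpha * beta * W0.T @ T @ W0 - L**2 * np.eye(n_in)
    top_right = (alpha + beta) * W0.T @ T
    bottom_right = -2 * T + W1.T @ W1
    return np.block([[top_left, top_right], [top_right.T, bottom_right]])

def is_feasible(W0, W1, L, T, tol=1e-9):
    """True if M(L, T) is negative semidefinite up to a numerical tolerance."""
    M = lmi_matrix(W0, W1, L, T)
    return np.linalg.eigvalsh(M).max() <= tol * max(1.0, np.abs(M).max())

rng = np.random.default_rng(0)
W0 = rng.standard_normal((6, 4))   # first layer: 4 inputs -> 6 hidden units
W1 = rng.standard_normal((3, 6))   # second layer: 6 hidden units -> 3 outputs
a, b = np.linalg.norm(W0, 2), np.linalg.norm(W1, 2)
naive_L = a * b                    # product of spectral norms
T = b**2 * np.eye(6)               # hand-picked diagonal multiplier (assumption)

feasible_at_naive = is_feasible(W0, W1, naive_L, T)
feasible_far_below = is_feasible(W0, W1, 0.01 * naive_L, T)

# Empirical sanity check: no sampled pair may exceed the certified constant.
relu = lambda z: np.maximum(z, 0.0)
f = lambda x: W1 @ relu(W0 @ x)
pairs = rng.standard_normal((200, 2, 4))
worst_ratio = max(np.linalg.norm(f(x) - f(y)) / np.linalg.norm(x - y) for x, y in pairs)
```

A full solver would search over all admissible $T$ to push $L$ below `naive_L`; here `T` is fixed, so the check only confirms that the constraint is well formed and that the naive bound satisfies it.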

Finally, we employed the LipSDP [38] algorithm to approximate the Lipschitz constants of various common attention blocks and computed their floating-point operations (FLOPs), as depicted in Fig. 3.

Fig. 3. - Comparison of Lipschitz constants in different attention modules. The radius of each circular symbol is proportional to the computed FLOPs of the module. It is important to note that a larger Lipschitz constant does not necessarily imply more robustness; on the contrary, a larger Lipschitz constant will make the training of the module more unstable. Therefore, Lipschitz constants are used here solely for theoretical analysis of their role in the module.

C. Self-Supervised Denoising Framework Wake2Wake

We selected B2UB as the baseline method for several reasons. First, it is a self-supervised denoising technique that only requires a noisy SAR image. Second, it effectively utilizes all the information in the original noisy image. Finally, it applies a more stable Re-visible loss function throughout the training phase. However, most self-supervised denoising methods were initially developed for optical images [39], [40]. These methods do not consider the distinctive features of SAR sea images, particularly ship wakes such as Kelvin wakes and dark streak-like turbulent wakes, which are often obscured by waves and sea clutter. In Sections II-A and II-B, we outline the enhancements that we have implemented to address this gap.

Our primary objective is to integrate the trained Wake2Wake model into the OBB detector. Typically, the entire detection dataset would be used as input to the denoising network. However, it is challenging to learn all the noise distributions in the detection dataset simultaneously without increasing the network’s capacity, which often necessitates refined parameterization. To address this, we propose utilizing a smaller yet more feature-enriched set of training samples. This subset comprises 65 carefully selected SAR images featuring prominent wind waves and wakes from the original OpenSARWake dataset. In addition, we include 180 simulated SAR images generated under a total of 60 sea conditions with the full-link ocean surface SAR imaging simulator [41], [42]. The simulation covers five distinct wind speeds ranging from 5.5 to 13.5 m/s and 12 wind directions at intervals of 30°. Three different bands of SAR parameters were employed for each sea condition, resulting in diverse and extensive characterizations. The collection and preprocessing of the OpenSARWake dataset, as well as the acquisition and generation principles of this data subset, are detailed in Section III-A.
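
The composition of the training subset can be checked with a few lines of Python. The sketch assumes 12 wind directions (360° divided into 30° steps), which is the count that yields 60 sea conditions with five wind speeds, and it assumes that the five wind speeds are evenly spaced between 5.5 and 13.5 m/s; the individual speed values are not listed in the text.

```python
from itertools import product

# Assumption: five evenly spaced wind speeds; the text only gives the range.
WIND_SPEEDS_MS = [5.5, 7.5, 9.5, 11.5, 13.5]
WIND_DIRECTIONS_DEG = list(range(0, 360, 30))  # 12 directions, 30 degrees apart
BANDS = ["X", "C", "L"]
N_REAL_IMAGES = 65

sea_conditions = list(product(WIND_SPEEDS_MS, WIND_DIRECTIONS_DEG))
simulation_cases = [
    {"band": band, "wind_speed_ms": speed, "wind_direction_deg": direction}
    for band in BANDS
    for speed, direction in sea_conditions
]
n_training_images = N_REAL_IMAGES + len(simulation_cases)
```

Under these assumptions, one simulated image per band and sea condition gives 180 simulated images, and adding the 65 real images gives 245 training samples.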

Hence, the input of our newly designed Wake2Wake network is represented by $\{y_n, n = 1, 2, \ldots, 245\}$, and the dimensions of the images are uniformly set to $1024 \times 1024$ to maintain a wide range of intricate features. As depicted in Figs. 2(b) and 4, each $y_n$ is initially transformed via the Global Masker $\Theta(\cdot)$ to produce four masked image patches $(\theta_{00}I, \theta_{01}I, \theta_{10}I, \theta_{11}I)$. The procedure is given as follows.

  1. The input image $y_n$ is partitioned into $W/s \times H/s$ separate cells, each with a size of $s \times s$. For our experiment, we fixed the $W$ and $H$ values to 1024. When $s$ is greater than 2, the computational load of the masking operation increases geometrically; therefore, $s$ is set to 2 in this article, as presented in Fig. 4.

  2. The masked patches $(\theta_{00}I, \theta_{01}I, \theta_{10}I, \theta_{11}I)$ are organized into a masked volume $\theta_y$, which is processed by our modified U-Net model $F_{\Omega}(\cdot)$ and then sampled by the Global Mask Mapper $\mathcal{M}(\cdot)$ to obtain $\mathcal{M}(F_{\Omega}(y))$.

  3. In a parallel branch, the initial noisy image $y_n$ is fed directly into $F_{\Omega}(\cdot)$ to obtain the denoised output $F_{\Omega}(y)$.
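
The three steps above can be sketched in NumPy for $s = 2$. This is a minimal sketch under stated assumptions: masked pixels are filled with zeros, the denoiser is a placeholder identity function, and the function names are invented for this example; the actual fill strategy and network are those of the Wake2Wake implementation, which this section does not spell out.

```python
import numpy as np

def global_masker(y, s=2, fill=0.0):
    """Return s*s copies of y; copy (i, j) hides pixel (i, j) of every s x s cell.

    Assumption: hidden pixels are replaced by `fill` (zero here).
    """
    H, W = y.shape
    assert H % s == 0 and W % s == 0
    volume = np.repeat(y[None].astype(float), s * s, axis=0)
    for i in range(s):
        for j in range(s):
            volume[i * s + j, i::s, j::s] = fill
    return volume

def global_mask_mapper(denoised_volume, s=2):
    """Assemble one image from the pixels that were hidden in each copy."""
    _, H, W = denoised_volume.shape
    out = np.empty((H, W), dtype=denoised_volume.dtype)
    for i in range(s):
        for j in range(s):
            out[i::s, j::s] = denoised_volume[i * s + j, i::s, j::s]
    return out

def placeholder_denoiser(images):
    """Stand-in for the modified U-Net; it simply returns its input."""
    return images

y = np.arange(1.0, 17.0).reshape(4, 4)
masked_volume = global_masker(y)                                     # step 1
blind = global_mask_mapper(placeholder_denoiser(masked_volume))      # step 2
visible = placeholder_denoiser(y)                                    # step 3
```

Because every output pixel of the blind branch is read from the copy in which that pixel was hidden, the network never sees the value it is asked to predict, while the visible branch uses the full image.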

Fig. 4. - Implementation detail of the Global Mask $\mathrm{\theta}(\cdot)$ and Global Mask Mapper $\mathcal{M}(\cdot)$. In the experiments of this article, $2 \times 2$ mask cells are used throughout.

The loss function is defined as follows during the training process of Wake2Wake:

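$$\begin{aligned} \mathcal{L}_{\mathrm{rev}} &= \left\| \mathcal{M}(F_{\Omega}(y)) + \lambda \hat{F}_{\Omega}(y) - (\lambda + 1) y \right\|_2^2 \\ \mathcal{L}_{\mathrm{reg}} &= \left\| \mathcal{M}(F_{\Omega}(y)) - y \right\|_2^2 \\ \mathcal{L} &= \mathcal{L}_{\mathrm{rev}} + \eta \mathcal{L}_{\mathrm{reg}}. \end{aligned}$$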

The hyperparameter $\eta$ is a constant governing the initial impact of the blind term and the stability of the training process. In contrast, the hyperparameter $\lambda$ is a variable regulating the strength of the visible components during the conversion from blind to unblind. The starting and ending values of $\lambda$ are denoted as $\lambda_1$ and $\lambda_2$, respectively. We determined the values of $\eta$, $\lambda_1$, and $\lambda_2$ through empirical observation, setting $\eta$ to 1, $\lambda_1$ to 2, and $\lambda_2$ to 20.
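
A small NumPy sketch of this loss is given below, using the reported values $\eta = 1$, $\lambda_1 = 2$, and $\lambda_2 = 20$. The linear growth of $\lambda$ over the epochs is an assumption, since the text only gives its starting and ending values, and the argument names are invented for the example. In a real training loop the visible-branch output would be excluded from back-propagation; plain NumPy has no gradients, so that detail appears only as a comment.

```python
import numpy as np

def lambda_schedule(epoch, n_epochs, lam1=2.0, lam2=20.0):
    """Assumed linear ramp from lam1 (first epoch) to lam2 (last epoch)."""
    return lam1 + (lam2 - lam1) * epoch / max(n_epochs - 1, 1)

def wake2wake_loss(blind, visible, y, lam, eta=1.0):
    """Re-visible loss plus regularizer.

    blind:   output of the masked branch after the mask mapper
    visible: output of the unmasked branch (treated as a constant, i.e. it
             would be detached from the gradient graph in a DL framework)
    y:       noisy input image
    """
    l_rev = np.sum((blind + lam * visible - (lam + 1.0) * y) ** 2)
    l_reg = np.sum((blind - y) ** 2)
    return l_rev + eta * l_reg

y = np.array([1.0, 2.0])
loss_perfect = wake2wake_loss(blind=y, visible=y, y=y, lam=2.0)
loss_offset = wake2wake_loss(blind=y + 1.0, visible=y, y=y, lam=2.0)
lam_first, lam_last = lambda_schedule(0, 200), lambda_schedule(199, 200)
```

As $\lambda$ grows, the visible branch carries more weight in the re-visible term, which corresponds to the transition from blind to unblind described above.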

Following the completion of Wake2Wake training, the denoised OpenSARWake dataset is employed to train various types of OBB detectors. For further details on this process, please refer to the pseudocode in Algorithm 1.

SECTION III.

Experimental Results and Discussion

A. Details of Datasets and Implementation

To validate the effectiveness of our proposed Wake2Wake method, it is imperative to train and evaluate it on an extensive dataset of SAR ship wakes. However, publicly available benchmarks in this domain are scarce. As Table II shows, Del Prete et al. [22] and Ding et al. [8] have gathered two SAR ship wake datasets of a certain scale, but these datasets cover only the C-band and are inaccessible to the public. Consequently, the available SAR ship wake images are insufficient to meet the requirements of data-driven deep learning networks. To facilitate Wake2Wake training and subsequent deep learning-based SAR ship wake detection, we constructed the OpenSARWake dataset. As listed in Table II, we considered the diversity of ship wake characteristics in various frequency bands, including the X-, C-, and L-bands. To accomplish this, we gathered SAR images with ship wakes in these bands from offshore regions, as depicted in Fig. 5(a). Choosing offshore regions facilitated the collection of the related automatic identification system (AIS) data, which can be utilized for potential subsequent investigations. Fig. 5(b)–(e) illustrates the statistical distribution of our proposed OpenSARWake dataset. The SAR images were obtained from https://search.asf.alaska.edu/ and https://earth.esa.int/eogateway/catalog/.

TABLE II Published Datasets in SAR Ship Wake Detection Community. HBB: Horizontal Bounding Box; OBB: Oriented Bounding Box; and PMA: Polygon Mask Annotation

Fig. 5. - (a) Geographic location and temporal distribution of the collected OpenSARWake dataset applied in our study. (b)–(e) Statistical distribution of the instances in the OpenSARWake dataset.

We collected ground-range detected (GRD) products for the C-band Sentinel images and H2.2-level products for the L-band ALOS PALSAR images. For the X-band TerraSAR-X images, we primarily used single-look complex (SLC) products. To maintain consistent quality, we utilized the SNAP software to process the data and generate geocoded ellipsoid-corrected (GEC) products. In addition, essential preprocessing techniques, such as the adaptive contrast algorithm, were applied, and each image was ultimately resized to $1024 \times 1024$. All rotated bounding boxes were annotated by experts to ensure reliability. To validate the annotations for the ship wakes acquired by ALOS PALSAR and Sentinel-1A near Denmark, we used AIS data obtained from https://web.ais.dk/aisdata/; visual inspection was used for the remaining images. The OpenSARWake dataset consists of 653 C-band images, 299 X-band images, and 2221 L-band images. In addition, 231 images contain Kelvin wakes and 2841 images contain turbulent wakes. The images were split at random, with 60% allocated to the training set and 20% each to the validation and test sets. During the training of the Wake2Wake component, we observed that using the complete OpenSARWake training set resulted in unsatisfactory denoising, rendering the model almost ineffective on the original SAR images. The presence of sea wave interference and other oceanic phenomena, along with speckle noise, poses challenges in denoising and wave suppression that surpass those encountered in optical benchmarks such as BSD300 [43] and Kodak [44]. Therefore, establishing a more efficient and representative feature space became necessary to enhance the denoising effect of Wake2Wake.
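
A random 60/20/20 split of this kind can be sketched with the standard library as follows. The seed, the rounding rule, and the use of integer image identifiers are assumptions made for this example, so the resulting subset sizes are illustrative and are not the counts used in the paper.

```python
import random

def split_dataset(image_ids, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the identifiers and cut them into train, validation, and test."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed only for reproducibility
    n_train = round(train_frac * len(ids))
    n_val = round(val_frac * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

n_images = 653 + 299 + 2221  # C-band + X-band + L-band images
train_ids, val_ids, test_ids = split_dataset(range(n_images))
```

Whatever rounding is chosen, the three subsets should be disjoint and together cover every image exactly once.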

Not all sea surface SAR images exhibit significant wave activity in real life, and images without it typically show distinct turbulent wakes with greater clarity. Thus, Wake2Wake is expected to emphasize denoising and effectively suppress waves so as to enhance the detection performance for SAR ship wakes that are affected by significant wave interference. While constructing the feature space, namely, the data distribution, for training Wake2Wake, we deliberately selected 65 SAR images with clearly visible waves and ship wakes from the OpenSARWake dataset. The OpenSARWake collection comprises 408 scenes of SAR images from ALOS PALSAR, 120 scenes from Sentinel-1, and only 55 scenes from TerraSAR-X. Ship wakes in different bands have their own unique characteristics; therefore, if a particular band contains too few or too many images, the denoiser may learn biased wake features. This issue can be mitigated to some extent by adding simulated SAR wake images, so we integrated 180 simulated SAR ship wake images. Commonly observed ship wake types in SAR images include the Kelvin wake, turbulent wake, internal wave wake, and narrow-V wake [45]. Owing to their infrequency, we excluded the internal wave wake and narrow-V wake from our simulation. The Kelvin wake is modeled with a method based on the potential flow theory proposed in [46]. Building on this method, we also consider the sea surface motion, the time-varying characteristics of the scattering units, and the decoherence effect of the sea surface in the echo simulation. The simulator used for this task can be found at https://github.com/SYSUSARSimu/KWFull-Link_SARSim. To model the turbulent wake, we employ our simulation strategy [41] based on the energy spectrum balance equation. This method has the additional advantage of accurately simulating the turbulent wake over distances that can exceed kilometers.
We simulate SAR images in three frequency bands: X-band, C-band, and L-band. The choice of SAR parameters follows that used in [47]. To enhance the variety of the simulated ship wakes, we utilize five different wind speeds ranging from 5.5 to 13.5 m/s and 12 wind directions at intervals of 30° for the sea surface simulation. The sea surface simulation is based on a full-link ocean surface SAR imaging simulator [42]. Fig. 6 demonstrates the application of the uniform manifold approximation and projection (UMAP) method to analyze the OpenSARWake dataset and our proposed mixed dataset. All the SAR images were projected into a 3-D space, revealing a wide distribution in the manifold space. In Fig. 6(a), the cluster of feature points on the right-hand side indicates that most wake images share similar characteristics. In Fig. 6(b), the red and blue points represent the features of the selected subset of 65 OpenSARWake wake images and the 180 simulated wake images, respectively. Although the total number of these images is smaller than that of the complete OpenSARWake dataset, they still occupy a similar space in the manifold distribution. This highlights the richness of the characteristics in the simulated wake images. To summarize, we employ a small-scale mixed-source SAR ship wake dataset to train Wake2Wake, whereas the training and testing datasets utilized in our comprehensive framework comprise the whole OpenSARWake dataset.

Fig. 6. - Effects of the datasets’ samples on manifold distributions. (a) OpenSARWake dataset. (b) Proposed multisource dataset.

B. Comparison With State-of-the-Art Methods

To evaluate the performance disparity between our proposed Wake2Wake method and other denoising algorithms on various detectors, we selected two categories of SAR image denoising algorithms for comparison. These include traditional methods such as LMMSE-Wavelet [25], SAR-BM3D [24], FANS [48], POTDF [26], and PPB [49], as well as deep learning-based methods such as SAR-CNN [50], NBR2NBR [27], and B2UB [29]. For the rotated detectors, we selected anchor-free methods such as Rotated RepPoints [51], single-stage methods such as S2A-Net [52], and two-stage methods such as Oriented R-CNN [53] for cross-comparison. We selected the equivalent number of looks (ENL), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM) for their recognized efficacy in similar SAR despeckling tasks; they evaluate homogeneity, reconstruction quality, and structural integrity, respectively. These metrics are complemented by recall and mean average precision (mAP) to assess the overall impact on detection performance. While other metrics such as TCR [55] and EPD-ROA [56] were considered, we prioritized those with a direct, established correlation to despeckling effectiveness and downstream detection capabilities. The experiments were conducted using NVIDIA A6000 GPUs. For the denoising algorithms other than Wake2Wake, we used the same hyperparameters as specified in their respective original papers. For the training of the Wake2Wake algorithm, the initial learning rate was set to $3 \times 10^{-4}$, with a weight decay of $1 \times 10^{-8}$. We trained the model for 200 epochs with a batch size of 2, using the Adam optimizer. The training hyperparameters of the detection algorithms Rotated RepPoints [51], S2A-Net [52], and Oriented R-CNN [53] can be found in [34]. Notably, except for our Wake2Wake approach, all other methods were trained and evaluated using the complete OpenSARWake dataset.
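
For reference, two of the image-quality metrics named above can be written in a few lines of NumPy using their common textbook definitions: ENL as the squared mean divided by the variance over a homogeneous intensity region, and PSNR from the mean squared error against a reference image. How the homogeneous region and the reference image were chosen in the experiments is not stated in this section, so both are plain function arguments here, and the 8-bit peak value is an assumption.

```python
import numpy as np

def enl(region):
    """Equivalent number of looks of a homogeneous intensity region.

    Higher values indicate stronger smoothing. The region must not be
    perfectly constant, otherwise the variance is zero.
    """
    region = np.asarray(region, dtype=float)
    return region.mean() ** 2 / region.var()

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB (peak=255 assumes 8-bit images)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

enl_value = enl([1.0, 3.0])
psnr_value = psnr(np.zeros((2, 2)), np.ones((2, 2)))
```

Because ENL only rewards flatness, a filter can score well on it while erasing thin wake structures, which is why it is reported alongside detection recall and mAP.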

The comparative analysis of several denoising algorithms on OpenSARWake is presented in Table III. The ENL is frequently employed to quantify the speckle suppression capabilities of SAR image filters, and a higher ENL value indicates more effective smoothing. The traditional methods POTDF and PPB exhibit significantly higher ENL values than the deep learning methods. However, we argue that this does not necessarily enhance the accuracy of ship wake detection. Table IV shows that, despite their higher ENL values, POTDF and PPB decrease the mAP of the detection algorithms, especially for Rotated RepPoints, which is an anchor-free object detector. Nevertheless, both methods improve detection recall under the Type III training strategy. Furthermore, the ENL values of the deep learning algorithms did not reveal any significant regularity.

TABLE III Denoising Effects and Quantitative Comparison of the Different Denoising Methods on the OpenSARWake Dataset. The Performance for the Training, Validation, and Test Sets Is Shown, Respectively. Bold and Underline Indicate the Best Results
TABLE IV Comparisons of Detection Performance on the OpenSARWake Test Set. Type I Indicates Denoising Processing Applied Only to the Train/Validation Set, While Type II Signifies Application Solely to the Test Set. Type III Denotes Denoising of the Entire Dataset. The Results in This Table Represent Detection Recall and mAP, Respectively. Bold and Underline Indicate the Best Results

PSNR and SSIM quantify the image quality by comparing the maximum signal to the background noise and by evaluating image similarity before and after denoising, respectively. As with ENL, for the conventional methods, higher PSNR and SSIM values do not necessarily indicate higher mAP values. In the case of B2UB and Wake2Wake, slightly higher PSNR and SSIM values are accompanied by higher recall and mAP values compared to other deep learning methods. However, the compared deep learning methods are constrained by the limitations of their training strategy, as mentioned in Section III-A, which restricts their potential. Our proposed Wake2Wake method surpasses all other denoising algorithms in the detection accuracy of Rotated RepPoints and S2A-Net. This is achieved through an enhanced training strategy and the utilization of a customized SWA block designed specifically for ship wake features. Notably, our method significantly outperforms the results obtained without the application of a denoising algorithm. Overall, the results consistently show that applying traditional denoising algorithms tends to decrease the accuracy of all the detection methods listed in Table IV. This is primarily because traditional denoising algorithms rely on a fixed approach to image processing, which lacks the capacity to adapt and generalize well. In addition, these algorithms are incapable of retaining and acquiring information regarding the characteristics present in the data. Furthermore, without enhancement, deep learning-based denoising techniques also reduce the effectiveness of the aforementioned three categories of detection algorithms. Overfitting and incomplete feature learning can negatively impact the model’s performance on the detector, and disregarding ship wake features may further reduce the effectiveness of the method. Our proposed Wake2Wake algorithm, built upon the research of previous scholars, addresses these issues.

Furthermore, we provide a visual comparison of the denoising and detection results for test sample images, as depicted in Figs. 7–9. It is evident that the conventional methods for suppressing waves and noise lack stability, as both the POTDF and PPB algorithms exhibit some degree of oversmoothing in Fig. 7. Conversely, the deep learning algorithms demonstrate a certain level of overfitting. In contrast, our Wake2Wake algorithm effectively suppresses waves while preserving wake features, leading to improved accuracy in bounding-box localization, as demonstrated in Fig. 9.

Fig. 7. - Visual comparison of SAR image denoising results for the OpenSARWake test sample image using different methods. The image used was acquired through the C-band Sentinel-1A satellite. It is of the English Channel and is a ground-range detected (GRD) product acquired in StripMap mode with a pixel spacing of 10 m. The wind velocity is 6.7 m/s and the wind direction is 74.4°, counted clockwise from the azimuth direction. The sea state information was derived from the ERA5 hourly data on single levels [54]. (a) Noisy SAR image. (b) LMMSE-Wavelet. (c) SAR-BM3D. (d) FANS. (e) POTDF. (f) PPB. (g) SAR-CNN. (h) NBR2NBR. (i) B2UB. (j) Wake2Wake.

Fig. 8. - Visual comparison of SAR image denoising results for the OpenSARWake test sample image using different methods. The image used was acquired through the C-band Sentinel-1A satellite. It is of the English Channel and is a ground-range detected (GRD) product acquired in StripMap mode with a pixel spacing of 10 m. The wind velocity is 8.4 m/s and the wind direction is 255.5°, counted clockwise from the azimuth direction. The sea state information was derived from the ERA5 hourly data on single levels [54]. (a) Noisy SAR image. (b) LMMSE-Wavelet. (c) SAR-BM3D. (d) FANS. (e) POTDF. (f) PPB. (g) SAR-CNN. (h) NBR2NBR. (i) B2UB. (j) Wake2Wake.

Fig. 9. - Detection results on the OpenSARWake test sample image by different denoising and detection methods with (a) corresponding to Fig. 7 and (b) corresponding to Fig. 8. Here, the GT labels represent the ground-truth labels.

C. Ablation Studies

In this section, we conduct ablation experiments to investigate how the final detection results are influenced by two factors: the dataset used to train Wake2Wake and the convolution operations applied in the SWA block to capture the two distinct ship wake features. The experiments followed the Type III strategy outlined in Table IV, where the training, validation, and test sets for training the OBB detector were all denoised.
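
The three strategies defined in the caption of Table IV differ only in which splits pass through the denoiser before the detector sees them. A minimal sketch of that bookkeeping is given below; the dictionary layout, the function name, and the string-based stand-in for the denoiser are all hypothetical.

```python
# Which splits are denoised under each strategy (from the Table IV caption).
STRATEGIES = {
    "Type I": {"train": True, "val": True, "test": False},
    "Type II": {"train": False, "val": False, "test": True},
    "Type III": {"train": True, "val": True, "test": True},
}

def prepare_splits(splits, strategy, denoise):
    """Apply `denoise` to every image of the splits selected by `strategy`."""
    flags = STRATEGIES[strategy]
    return {
        name: [denoise(image) for image in images] if flags[name] else list(images)
        for name, images in splits.items()
    }

# Toy usage: images are represented by file names, and the stand-in
# denoiser only tags the name.
splits = {"train": ["a.png"], "val": ["b.png"], "test": ["c.png"]}
tag = lambda name: "denoised_" + name
type_i = prepare_splits(splits, "Type I", tag)
type_iii = prepare_splits(splits, "Type III", tag)
```

Under Type III the detector is trained and evaluated on images with the same statistics, which is the setting used for the ablations in this section.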

Table V(a) indicates that the inclusion of the 180 simulated wake images has a certain impact on the accuracy of detection. The real wake images, in turn, enable Wake2Wake to concentrate on learning and suppressing wave features. We hypothesize that superior results could have been achieved with real wake images exhibiting more diverse sea conditions. Table V(b) outlines the impact of various convolution operations on wave suppression and detection results in our SWA block. Correspondingly, Fig. 10 provides a visual comparison of the results mentioned in Table V(b). The texture of the sea waves remains distinctly visible when neither FcaConv nor DSConv is used, as depicted in Fig. 10(b) and (g). When only FcaConv is employed, the spectral characteristics in Fig. 10(h) indicate a slight increase in wave scale. This can be attributed to the absence of a significant Kelvin transverse wave, with only a divergent wave present on the left side. Consequently, FcaConv makes a limited contribution for this particular test sample. However, when both FcaConv and DSConv are employed, as depicted in Fig. 10(j), the texture of the waves is effectively alleviated. This indicates that the simultaneous application of these two convolutions yields beneficial effects, in line with the observation that both types of ship wakes are present in this figure. Furthermore, to visually illustrate the wave components removed in Fig. 10(e), we present in Fig. 11 the wave image suppressed by our proposed SWA block and the corresponding image spectrum. Although a small portion of the wakes is suppressed along with the waves, the majority of the removed components are sea clutter.

TABLE V Effect of Different Datasets Used to Train Wake2Wake and Various Convolution Operations Applied in SWA Block on Detection Performance. The Results in This Table Represent Detection Recall and mAP, Respectively. (a) Comparison of the Effect of Different Sources of Dataset to Train Wake2Wake. Here, OpenSARWake* Represents the 65 Images Selected From the Full OpenSARWake Dataset. (b) Comparison of the Effect of Different Convolution Operation Utilized in Our Proposed SWA Block to Train Wake2Wake
Fig. 10. - Comparison of the effect of FcaConv and DSConv, which are utilized in our proposed SWA block. The first row shows the results after denoising with different convolution operations, and the second row shows the corresponding SAR image spectrum. (a) and (f) Noisy SAR image. (b) and (g) Effect of plain convolution. (c) and (h) Effect of FcaConv. (d) and (i) Effect of DSConv. (e) and (j) Effect of our proposed SWA block.

Fig. 11. - Sea clutter components removed by our proposed SWA block in Fig. 10, where (a) is the sea clutter image and (b) is the corresponding image spectrum.

SECTION IV.

Conclusion

This study introduces Wake2Wake, a feature-guided self-supervised method for suppressing waves in SAR ship wake detection. Wake2Wake utilizes a novel self-supervised training strategy. Instead of training directly on the full-scale dataset, we selected 65 real SAR images with the most prominent wave textures from the dataset and combined them with 180 simulated wake images. The simulated SAR images generated by our previous work exhibit a balanced and ergodic selection of radar parameters, sea conditions, and ship parameters, effectively compensating for the feature singularity of real SAR images in some scenarios. The results indicate that this training strategy produces higher detection accuracy than the original method. In addition, Wake2Wake employs a novel SWA block designed to capture the characteristics of turbulent and Kelvin wakes in SAR images, and the experimental results support the effectiveness of this block. Nevertheless, additional studies are required to achieve a tradeoff between wave suppression and wake detection. The next step involves merging the Wake2Wake framework with the OBB detector to create a truly end-to-end network, following the approach illustrated in Fig. 1(b).

ACKNOWLEDGMENT

The authors would like to thank the editors who processed this article and the anonymous reviewers for their constructive comments toward improving it. They also express their gratitude to the Alaska Satellite Facility and the European Space Agency for the Sentinel, ALOS PALSAR, and TerraSAR-X data, available at https://search.asf.alaska.edu/#/ and https://earth.esa.int/eogateway/missions/terrasar-x-and-tandem-x.






 

 
