Thursday, September 25, 2025

Directional-Aware Dual-Branch Fusion Network Advances SAR Change Detection Performance


This figure illustrates the architecture of the Directional-Aware Dual-Branch Fusion Network (DAFNet) for SAR image change detection. The system processes two input SAR images (I₁ and I₂) through the following components:

Data Processing Stage

  • Input: Two co-registered SAR images (I₁, I₂)
  • DIE (Difference Image Enhancement): Generates enhanced difference images using log-ratio operations and A-law companding
  • HFCM (Hierarchical Fuzzy C-Means): Performs preclassification clustering to create training and test datasets

Dual-Branch Backbone Architecture

Contextual-Aware Branch (Top)

  • Patch Embedding: Converts image patches into token sequences
  • ViT Block ×N: N Vision Transformer blocks for global context modeling
  • Self-Attention: Captures long-range spatial dependencies
  • FFN (Feed-Forward Network): Processes attention outputs
  • Reshape: Converts tokens back to spatial feature maps

Directional-Aware Branch (Bottom)

  • MDCM (Multidirectional Convolution Module): Extracts high-frequency features using eight-directional Sobel operators
  • GAP & FC: Global Average Pooling and Fully Connected layers
  • Softmax: Generates attention weights for directional features
  • Multiply: Applies attention weighting to enhance relevant directional information

Feature Fusion and Output

  • GCFM (Gated Cross-Fusion Module): Combines features from both branches using cross-attention mechanisms and GeLU activation
  • Classifier: Final classification layers producing the binary change detection map
  • Change Map: Output showing detected changes (white regions indicate changes)

The architecture demonstrates a parallel processing approach where the contextual branch captures global semantic relationships while the directional branch focuses on local edge and texture information, with both streams integrated through the cross-fusion module for optimal change detection performance.
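
The parallel layout described above can be summarized in a short PyTorch sketch. This is a minimal structural illustration under assumed module sizes (64 channels, four transformer blocks, and a plain concatenation standing in for the gated cross-fusion module); it is not the authors' implementation.

import torch
import torch.nn as nn

class ContextBranch(nn.Module):
    """ViT-style branch: patch embedding -> stacked transformer blocks -> reshape to a feature map."""
    def __init__(self, in_ch=3, dim=64, depth=4, patch=3):
        super().__init__()
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)   # patch embedding
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, dim_feedforward=2 * dim,
                                           batch_first=True)                  # self-attention + FFN
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):                          # x: (B, 3, H, W)
        tok = self.embed(x)                        # (B, dim, H', W')
        b, c, h, w = tok.shape
        seq = self.blocks(tok.flatten(2).transpose(1, 2))   # global context over all tokens
        return seq.transpose(1, 2).reshape(b, c, h, w)      # reshape tokens back to a feature map

class DirectionalBranch(nn.Module):
    """Stand-in for the MDCM: a convolution followed by GAP-based soft channel weighting."""
    def __init__(self, in_ch=3, dim=64):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, dim, 3, padding=1)
        self.attn = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(dim, dim, 1), nn.Softmax(dim=1))

    def forward(self, x):
        f = self.conv(x)
        return f * self.attn(f)                    # multiply: soft attention over directional channels

class DAFNetSketch(nn.Module):
    """Parallel branches on the enhanced difference image, fused and classified per patch."""
    def __init__(self, dim=64, depth=4):
        super().__init__()
        self.context = ContextBranch(dim=dim, depth=depth)
        self.directional = DirectionalBranch(dim=dim)
        self.classifier = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(2 * dim, 2))

    def forward(self, d_e):                        # d_e: (B, 3, H, W) enhanced difference image patch
        c = self.context(d_e)
        d = self.directional(d_e)
        c = nn.functional.interpolate(c, size=d.shape[-2:], mode="bilinear", align_corners=False)
        fused = torch.cat([c, d], dim=1)           # simple concatenation in place of the GCFM
        return self.classifier(fused)              # change / no-change logits for the patch centre

logits = DAFNetSketch()(torch.randn(8, 3, 9, 9))   # a batch of 9x9 patches
print(logits.shape)                                # torch.Size([8, 2])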

Abstract

Researchers have developed a directional-aware dual-branch fusion network (DAFNet) that addresses limitations in transformer-based synthetic aperture radar (SAR) change detection methods. The architecture combines a Vision Transformer (ViT) branch for global context modeling with a multidirectional convolution module (MDCM) utilizing eight-directional Sobel operators for high-frequency feature extraction. A gated cross-fusion module (GCFM) integrates features across branches using GeLU activation and cross-attention mechanisms. Experimental validation on three SAR datasets demonstrates superior performance, with percent correct classification (PCC) values of 96.87% (Yellow River I), 98.69% (Yellow River II), and 98.85% (Shunyi) and kappa coefficients of 89.26%, 80.41%, and 86.79%, matching or exceeding the compared methods in PCC, kappa coefficient, and F1 score on all three datasets.


Methodological Advances in SAR Image Change Detection

The detection of temporal changes in synthetic aperture radar imagery is computationally challenging: speckle noise and geometric distortions corrupt the data, while high-frequency spatial information must be preserved. Recent transformer-based approaches have demonstrated improved context modeling capabilities but exhibit performance degradation when processing high-resolution data containing fine-grained structural details.

Zhong et al. address these limitations through a dual-branch architecture that processes bitemporal SAR images I₁ and I₂ through parallel pathways optimized for different spatial frequency characteristics. The system generates an enhanced difference image using log-ratio operations combined with A-law companding:

Enhanced Difference Image Generation:

D_LR = |log(I₁) - log(I₂)|                                    (1)

EI_k = { AI_k,           if I_k < 1/A
       { 1 + ln(AI_k),   if 1/A ≤ I_k ≤ 1                    (2)

where A = 87.6 represents the compression parameter, and EI₁, EI₂, and D_LR are concatenated to form a three-channel enhanced difference image D_E.
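
A minimal NumPy sketch of this preprocessing step follows, assuming intensities are first scaled to [0, 1] and that the companded output is normalized by 1 + ln A (the standard A-law scaling, which Eq. (2) leaves implicit); the paper's exact preprocessing may differ.

import numpy as np

A = 87.6  # A-law compression parameter

def a_law(x, a=A):
    """Piecewise A-law companding of intensities assumed to lie in [0, 1] (Eq. (2))."""
    x = np.clip(x, 1e-6, 1.0)
    out = np.where(x < 1.0 / a, a * x, 1.0 + np.log(a * x))
    return out / (1.0 + np.log(a))  # normalize back to [0, 1]; this scaling is an assumption

def enhanced_difference_image(i1, i2):
    """Build the three-channel enhanced difference image D_E = [EI1, EI2, D_LR]."""
    i1 = i1.astype(np.float64) / i1.max()
    i2 = i2.astype(np.float64) / i2.max()
    d_lr = np.abs(np.log(np.clip(i1, 1e-6, None)) - np.log(np.clip(i2, 1e-6, None)))  # Eq. (1)
    d_lr = d_lr / (d_lr.max() + 1e-12)          # scale the log-ratio channel to [0, 1]
    return np.stack([a_law(i1), a_law(i2), d_lr], axis=-1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    i1, i2 = rng.gamma(2.0, 1.0, (64, 64)), rng.gamma(2.0, 1.0, (64, 64))  # toy speckled intensities
    print(enhanced_difference_image(i1, i2).shape)   # (64, 64, 3)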

Multidirectional Feature Extraction

The multidirectional convolution module implements eight-directional gradient computation using modified Sobel operators. Standard Sobel edge detection employs two 3×3 kernels for horizontal and vertical gradient estimation:

Standard Sobel Operators:

G_x = [-1  0  +1]     G_y = [-1 -2 -1]
      [-2  0  +2]           [ 0  0  0]
      [-1  0  +1]           [+1 +2 +1]

The MDCM extends this approach to eight directions (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°) by implementing directionally-specific convolution masks. Local detail features {F₁, F₂, F₃, F₄} are computed through:

F_i = |G_i * I| + |G_{i+180°} * I|                          (3)

where G_i represents the Sobel mask oriented at angle i, and * denotes convolution. These directional responses are aggregated through elementwise summation and processed via soft attention mechanisms for adaptive feature weighting.
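
The following NumPy/SciPy sketch illustrates the eight-directional filtering and soft weighting. The kernel set (45° rotations of the standard Sobel mask, with opposite directions as sign-flipped copies) and the parameter-free softmax weighting are assumptions standing in for the module's learned GAP-FC attention, not the published MDCM.

import numpy as np
from scipy.signal import convolve2d

# Sobel-type masks at 0, 45, 90 and 135 degrees; the 180-315 degree masks are their negatives.
SOBEL_0 = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_45 = np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]], dtype=float)
SOBEL_90 = SOBEL_0.T
SOBEL_135 = np.array([[2, 1, 0], [1, 0, -1], [0, -1, -2]], dtype=float)
BASE_MASKS = [SOBEL_0, SOBEL_45, SOBEL_90, SOBEL_135]

def directional_features(img):
    """F_i = |G_i * I| + |G_(i+180) * I| for i in {0, 45, 90, 135} degrees (Eq. (3))."""
    feats = []
    for g in BASE_MASKS:
        resp = np.abs(convolve2d(img, g, mode="same", boundary="symm"))
        resp_opp = np.abs(convolve2d(img, -g, mode="same", boundary="symm"))  # opposite direction
        feats.append(resp + resp_opp)
    return feats  # [F1, F2, F3, F4]

def mdcm_like(img):
    """Aggregate directional responses with a soft (softmax) weighting per direction."""
    feats = directional_features(img)
    energies = np.array([f.mean() for f in feats])        # GAP over each directional map
    weights = np.exp(energies) / np.exp(energies).sum()   # softmax in place of the learned FC layer
    return sum(w * f for w, f in zip(weights, feats))     # attention-weighted sum (multiply step)

if __name__ == "__main__":
    img = np.zeros((32, 32))
    img[:, 16:] = 1.0                                     # a vertical step edge
    print(mdcm_like(img).max())                           # strongest response along the edge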

Cross-Branch Information Fusion

The gated cross-fusion module models the semantic correlation between the features of the two branches through cross-attention mechanisms. For input features from branches i and j, the fusion process follows:

Cross-Attention Fusion:

Q'_j = Concat[φ(Q_i), Q_j]                                  (4)

F'_i = CrossAtt(Q'_j, K_i, V_i) = Softmax(Q'_j · K_i^T) · V_i  (5)

where φ represents GeLU activation; Q, K, and V are query, key, and value matrices with dimensions N×C/4, N×C/2, and N×C, respectively (N = H×W); and Concat denotes concatenation along the channel dimension, so that Q'_j has dimensions N×C/2 and the product Q'_j · K_i^T yields an N×N attention map.
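
A compact PyTorch sketch of Eqs. (4)-(5), with linear projections chosen to reproduce the stated N×C/4, N×C/2, and N×C shapes. Layer choices and the surrounding gating details are assumptions rather than the released GCFM code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossFusionSketch(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        c = channels
        # Per-branch projections to query (C/4), key (C/2) and value (C) channels.
        self.q_i, self.q_j = nn.Linear(c, c // 4), nn.Linear(c, c // 4)
        self.k_i, self.v_i = nn.Linear(c, c // 2), nn.Linear(c, c)

    def forward(self, feat_i, feat_j):
        """feat_i, feat_j: (B, N, C) token features from the two branches (N = H*W)."""
        q_i, q_j = self.q_i(feat_i), self.q_j(feat_j)
        q_gate = torch.cat([F.gelu(q_i), q_j], dim=-1)                              # Eq. (4): Q'_j, (B, N, C/2)
        attn = torch.softmax(q_gate @ self.k_i(feat_i).transpose(-2, -1), dim=-1)   # (B, N, N)
        return attn @ self.v_i(feat_i)                                              # Eq. (5): F'_i, (B, N, C)

if __name__ == "__main__":
    fuse = CrossFusionSketch(channels=64)
    a, b = torch.randn(2, 81, 64), torch.randn(2, 81, 64)   # e.g. 9x9 patches -> N = 81 tokens
    print(fuse(a, b).shape)                                  # torch.Size([2, 81, 64])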

Performance Evaluation and Quantitative Results

Experimental validation employed three datasets: Yellow River I (257×289 pixels), Yellow River II (291×444 pixels), and Shunyi (256×256 pixels). Performance metrics included false positive (FP), false negative (FN), percent correct classification (PCC), kappa coefficient (KC), and F1 score.
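
These metrics follow directly from the confusion counts of a binary change map; the sketch below uses the standard definitions of PCC, the kappa coefficient, and F1, and is not code from the paper.

import numpy as np

def change_detection_metrics(pred, ref):
    """pred, ref: binary arrays with 1 = changed, 0 = unchanged."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    tp = np.sum(pred & ref)
    tn = np.sum(~pred & ~ref)
    fp = np.sum(pred & ~ref)          # unchanged pixels flagged as change (false alarms)
    fn = np.sum(~pred & ref)          # missed changes
    n = tp + tn + fp + fn
    pcc = (tp + tn) / n
    # Expected agreement by chance, used by the kappa coefficient.
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kc = (pcc - pe) / (1 - pe)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {"FP": int(fp), "FN": int(fn), "PCC": 100 * pcc, "KC": 100 * kc, "F1": 100 * f1}

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ref = rng.random((256, 256)) < 0.1                  # toy ground truth with ~10% change
    pred = ref ^ (rng.random((256, 256)) < 0.02)        # flip ~2% of labels as errors
    print(change_detection_metrics(pred, ref))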

Quantitative Performance Results:

Dataset           Method    FP     FN     PCC (%)   KC (%)   F1 (%)
Yellow River I    DDNet     1178   2164   95.50     84.37    87.09
                  CAMixer   645    1938   95.52     87.80    89.90
                  DAFNet    895    1431   96.87     89.26    91.16
Yellow River II   TSPLR     1350   613    98.48     79.99    78.77
                  DBFNet    948    742    98.69     79.93    80.61
                  DAFNet    1055   634    98.69     80.41    81.09
Shunyi            DDNet     861    445    98.01     78.42    79.46
                  WBANet    994    278    98.06     79.94    80.95
                  DAFNet    392    361    98.85     86.79    87.39

The DAFNet architecture demonstrates consistent improvements across all evaluation metrics. Notably, the Shunyi dataset results show KC improvements of 8.37% over DDNet and 6.85% over WBANet, while maintaining balanced FP/FN counts.

Ablation Study Results

Component-wise analysis quantifies individual module contributions:

Ablation Study Performance (PCC/KC values):

Configuration      Yellow River I   Yellow River II   Shunyi
Basic Network      96.39/87.32      98.29/75.08       98.07/80.17
w/o MDCM           95.91/86.19      98.63/80.08       98.77/86.58
w/o GCFM           96.20/87.11      98.88/79.62       98.27/80.90
w/o DIE            96.80/88.92      98.52/77.98       98.79/85.91
Complete DAFNet    96.87/89.26      98.69/80.41       98.85/86.79

Results indicate that MDCM contributes a 0.96% PCC improvement on Yellow River I, while GCFM provides a 0.67% enhancement. The enhanced difference image preprocessing contributes 0.06-0.17% in PCC across the three datasets.

Architectural Complexity and Computational Analysis

The dual-branch architecture introduces computational overhead compared to single-pathway methods. Parameter analysis reveals:

  • ViT Branch: 4-6 transformer blocks (dataset-dependent)
  • MDCM Branch: Eight 3×3 convolution kernels plus one learned kernel
  • GCFM: Cross-attention with C/4, C/2, and C channel dimensions

Optimal patch size analysis demonstrates peak performance at 9×9 pixels, balancing neighborhood information with computational efficiency. ViT block optimization shows dataset-specific optima: N=4 (Shunyi), N=5 (Yellow River II), N=6 (Yellow River I).
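
The 9×9 result is consistent with the usual patch-wise evaluation setup, in which each pixel is classified from a small neighborhood centered on it. The extraction sketch below is an assumption about the sampling procedure, not the authors' code.

import numpy as np

def extract_patches(d_e, patch=9):
    """d_e: (H, W, 3) enhanced difference image -> (H*W, patch, patch, 3) centre-pixel patches."""
    r = patch // 2
    padded = np.pad(d_e, ((r, r), (r, r), (0, 0)), mode="reflect")
    h, w = d_e.shape[:2]
    patches = np.empty((h * w, patch, patch, d_e.shape[2]), dtype=d_e.dtype)
    for y in range(h):
        for x in range(w):
            patches[y * w + x] = padded[y:y + patch, x:x + patch]   # window centred on pixel (y, x)
    return patches

if __name__ == "__main__":
    d_e = np.random.rand(64, 64, 3).astype(np.float32)
    print(extract_patches(d_e).shape)   # (4096, 9, 9, 3)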

Technical Limitations and Future Directions

Current limitations include:

  1. Computational Complexity: O(N²) attention mechanisms limit real-time applications
  2. Training Data Requirements: Deep architecture necessitates substantial labeled datasets
  3. Speckle Sensitivity: Performance degradation in high-noise conditions

The modular architecture enables component-specific optimizations. Future developments may incorporate:

  • Physics-based scattering models for improved feature interpretation
  • Multi-temporal analysis for change trajectory modeling
  • Adaptive attention mechanisms for variable resolution processing

Comparative Context in SAR Deep Learning

This work contributes to the expanding corpus of transformer-SAR integration research. Recent developments include vision-language models for SAR interpretation and generative approaches for data augmentation. The hybrid CNN-transformer paradigm demonstrates consistent advantages across remote sensing applications, with DAFNet representing a specialized implementation optimized for change detection tasks.

The eight-directional feature extraction approach addresses a fundamental limitation in conventional edge detection methods, which typically examine only orthogonal directions. This enhancement proves particularly valuable for detecting linear infrastructure changes and geological features that exhibit arbitrary orientations in SAR imagery.


Sources

  1. Zhong, W., Song, H., Deng, X., Tang, J., Chen, D., Gu, Y., & Jin, G. (2025). Directional-Aware Dual-Branch Fusion Network for SAR Image Change Detection. IEEE Geoscience and Remote Sensing Letters, 22, 4012805. DOI: 10.1109/LGRS.2025.3609626
  2. Gong, M., Zhao, J., Liu, J., Miao, Q., & Jia, L. (2016). Change detection in synthetic aperture radar images based on deep neural networks. IEEE Transactions on Neural Networks and Learning Systems, 27(1), 125-138. https://ieeexplore.ieee.org/document/7120131
  3. Dehghani-Dehcheshmeh, S., Akhoondzadeh, M., & Homayouni, S. (2024). Review of synthetic aperture radar with deep learning in agricultural applications. ISPRS Journal of Photogrammetry and Remote Sensing, 218(A). https://doi.org/10.1016/j.isprsjprs.2024.08.017
  4. Zhang, X., et al. (2025). Enhanced hybrid CNN and transformer network for remote sensing image change detection. Scientific Reports, 15, 94544. https://doi.org/10.1038/s41598-025-94544-7
  5. Wang, J., et al. (2024). Generative Artificial Intelligence Meets Synthetic Aperture Radar: A Survey. arXiv preprint arXiv:2411.05027. https://arxiv.org/abs/2411.05027
  6. Liu, M., Chai, Z., Deng, H., & Liu, R. (2022). A CNN-transformer network with multiscale context aggregation for fine-grained cropland change detection. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15, 4297-4306.
  7. Al-Sumaidaee, S. A. M., et al. (2017). Multi-gradient features and elongated quinary pattern encoding for image-based facial expression recognition. Pattern Recognition, 71, 249-263.
  8. Fang, S., et al. (2024). Unsupervised SAR change detection using two-stage pseudo labels refining framework. IEEE Geoscience and Remote Sensing Letters, 21, 1-5.
  9. Zhang, W., et al. (2022). Sparse feature clustering network for unsupervised SAR image change detection. IEEE Transactions on Geoscience and Remote Sensing, 60, Art. no. 5226713.
  10. Dosovitskiy, A., et al. (2021). An image is worth 16×16 words: Transformers for image recognition at scale. International Conference on Learning Representations.
  11. Li, H.-C., Celik, T., Longbotham, N., & Emery, W. J. (2015). Gabor feature based unsupervised change detection of multitemporal SAR images based on two-level clustering. IEEE Geoscience and Remote Sensing Letters, 12(12), 2458-2462.
  12. Marr, D., & Hildreth, E. (1980). Theory of edge detection. Proceedings of the Royal Society of London, 207(1167), 187-217.
  13. Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6), 679-698.
  14. Multi-directional Sobel operator kernel on GPUs. (2023). Journal of Parallel and Distributed Computing, 173, 33-47. https://doi.org/10.1016/j.jpdc.2022.11.002

 
