The AI That Sees Through Roads
A new generation of AI-powered ground-penetrating radar systems is learning to spot the deadly voids forming beneath city streets before they swallow buses, cars, and pedestrians whole.
⬛ Bottom Line Up Front
A research team at Xinjiang University has published a breakthrough GPR target-detection framework — Fuse-DETR — that fuses two complementary radar image modalities (conventional B-Scan and Hilbert-transform–derived instantaneous amplitude images) through a novel dual-branch deep neural network built on the RT-DETR transformer architecture. In head-to-head testing against eight state-of-the-art detectors, Fuse-DETR achieved a recall of 89.8%, an F1-Score of 85.6%, and a mean average precision (mAP) of 81.6% at 43 frames per second — significantly outperforming all single-modality competitors. The work, published in IEEE Transactions on Geoscience and Remote Sensing (Vol. 64, 2026), also delivers the field's first paired, drilling-verified multimodal GPR dataset for urban roads. As cities worldwide face a growing wave of subsurface collapses driven by aging infrastructure, this class of technology represents the most promising path toward automated, real-time underground hazard detection.
The road looked perfectly normal to drivers passing through Urumqi's Midong District on a warm summer night. Beneath it, however, something was quietly disappearing. Leaking sewer pipes had been silently dissolving the compacted soil for months, leaving a cave-like void the size of a shipping container just centimeters below the asphalt. Left undetected, it would eventually become a statistic: one more collapsed road, one more emergency closure, one more bill in the hundreds of thousands of dollars. But the city had sent out a different kind of patrol that night — a slow-rolling van bristling with antennas and connected to banks of computing hardware, scanning the pavement with ground-penetrating radar at a crawl of less than 10 kilometers per hour. A researcher at Xinjiang University, reviewing the van's imagery back at the lab, spotted the anomaly using a new AI system called Fuse-DETR. A drill crew confirmed the void the next morning.
This is the promise — and increasingly the reality — of artificial intelligence applied to ground-penetrating radar: a technology originally developed for Cold War military applications, now racing to keep pace with an urban infrastructure crisis that has become a genuine public safety emergency.
The Problem Under Our Feet
Road collapses are not rare. They are, in fact, accelerating. A 2025 susceptibility-mapping study of Shanghai's central district found that underground pipeline structural problems are the single most influential predictor of urban road collapse, and researchers found statistically significant spatial clustering of high-risk zones in the city's older, northwestern quadrant. Seoul's metropolitan government has maintained a dedicated GPR-cavity exploration team for years, an investment the city credits with reducing both cavity occurrences and actual subsidence events. Japan's Ministry of Land, Infrastructure, Transport and Tourism has stood up a national committee specifically targeting road collapses associated with aging sewer systems. In the United Kingdom, sinkholes have been appearing "at an alarming rate" following heavy rainfall events, with chalk and limestone geology making entire counties vulnerable to sudden ground failure.
The common thread in all these cases is the same: the voids that kill people and destroy infrastructure form invisibly, growing in silence for months or years before the surface gives way — often without warning, often under active traffic. The 2016 Fukuoka sinkhole that swallowed a six-lane intersection overnight was an extreme example of what happens when detection fails. Dozens of similar but less spectacular events occur globally every year.
- 89.8% Recall Rate, Fuse-DETR
- 43 Frames Per Second (Real-Time)
- 100km Urban Roads Surveyed for Dataset
Radar That Reads the Earth
Ground-penetrating radar works by transmitting high-frequency broadband electromagnetic pulses — typically between 1 MHz and 1 GHz — into the ground through a transmitting antenna, then capturing the echoes that bounce back from subsurface dielectric interfaces using a receiving antenna. By analyzing the spatial and temporal characteristics of those reflections, a skilled analyst can infer the location, dimensions, and physical properties of buried targets: pipes, voids, rebar grids, utility conduits, and the subtle loosening of soil that precedes a collapse.
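The depth inference rests on simple two-way travel-time arithmetic: the radar pulse travels down and back at a speed set by the soil's relative permittivity. A minimal sketch (the function name and the permittivity value are illustrative assumptions, not figures from the paper):

```python
# Convert a GPR echo's two-way travel time to an estimated target depth,
# assuming a homogeneous medium with known relative permittivity eps_r.
C = 299_792_458.0  # speed of light in vacuum, m/s

def echo_depth_m(two_way_time_ns: float, eps_r: float) -> float:
    """Depth = (wave speed in the medium * two-way travel time) / 2."""
    v = C / eps_r ** 0.5        # propagation speed in the medium, m/s
    t = two_way_time_ns * 1e-9  # nanoseconds -> seconds
    return v * t / 2.0          # halve: the pulse travels down AND back

# Dry sand (eps_r ~ 4): a 20 ns echo sits roughly 1.5 m down.
print(round(echo_depth_m(20.0, 4.0), 2))  # → 1.5
```

The same relation explains why permittivity estimation matters so much in practice: misjudging eps_r by a factor of two shifts every reported depth by about 40 percent.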
The standard visualization is called a B-Scan: a two-dimensional cross-section assembled from hundreds of individual A-Scan waveforms (one vertical radar trace per position), displayed as a grayscale image where bright hyperbolic arcs reveal buried cylindrical objects like pipes, and diffuse anomalies suggest voids or loosened soil. Experienced analysts read these images the way a radiologist reads an X-ray — with practiced intuition built over years of field work and the perpetual risk of missing something subtle.
That human bottleneck is, increasingly, the problem. A vehicle-mounted GPR system surveying city streets at night can generate hundreds of thousands of image pairs in a single operational run. The Xinjiang University team's 100-kilometer survey of Urumqi roads alone produced 344,360 pairs of B-Scan and instantaneous amplitude (IA) images, each at 480×480-pixel resolution. No human analyst can keep pace with that volume at scale, and the consequences of fatigue-driven misses can be catastrophic.
"During large-scale urban road inspections, the massive volume of GPR data renders manual interpretation inefficient and impractical."
— Sun et al., IEEE Transactions on Geoscience and Remote Sensing, 2026

The Two-Eyed Machine
The central insight of the Fuse-DETR paper — published in IEEE Transactions on Geoscience and Remote Sensing, Volume 64, in March 2026 — is deceptively straightforward: B-Scan images and instantaneous amplitude (IA) images are not redundant. They see different things.
A B-Scan preserves waveform morphology and spatial continuity. It is excellent for characterizing target geometry and boundary structures, revealing the shapes of pipes and the extent of hollow spaces. But for deeply buried or weak-reflection targets — the kind of subtle subsurface loosening that precedes a collapse — B-Scan responses can be faint to the point of invisibility, lost in the noise of surrounding soil reflections and signal attenuation.
An instantaneous amplitude image, derived from each A-Scan trace via the Hilbert transform, takes an entirely different view of the same data. The IA image displays the envelope — the energy — of the reflected signal at each point, making weak-reflection anomalies far more visually salient. Where a B-Scan might show a faint smear in a noisy background, the IA image reveals a concentrated bright zone of anomalous energy. The tradeoff is that IA images sacrifice precise geometric information: they tell you something is there, not exactly what shape it is.
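Computing an IA trace from an A-Scan is a short exercise with SciPy: the magnitude of the Hilbert-transform analytic signal is exactly the envelope. A minimal sketch on a synthetic decaying wavelet (the carrier frequency, decay constant, and sampling rate here are illustrative, not the paper's acquisition parameters):

```python
import numpy as np
from scipy.signal import hilbert

def instantaneous_amplitude(trace: np.ndarray) -> np.ndarray:
    """Envelope (instantaneous amplitude) of one A-Scan trace."""
    analytic = hilbert(trace)  # analytic signal: trace + i * Hilbert(trace)
    return np.abs(analytic)    # its magnitude is the signal envelope

# A 100 MHz wavelet decaying with a 50 ns time constant, sampled at 1 GHz.
# The envelope recovers the exp(-t/50ns) decay regardless of where the
# carrier happens to be in its oscillation cycle.
t = np.arange(512) * 1e-9
trace = np.exp(-t / 50e-9) * np.cos(2 * np.pi * 100e6 * t)
ia = instantaneous_amplitude(trace)
```

Stacking one such envelope per A-Scan position, side by side, yields the IA image that forms the second input channel.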
The Fuse-DETR framework exploits this complementarity through a dual-branch backbone architecture. Two independent ResNet-18 convolutional networks process the B-Scan and IA images in parallel, learning modality-specific features without interference. Their outputs then flow into the paper's most innovative component: the Modality-Collaborative Fusion Module, or MCFM.
- Dual-Channel Acquisition. A vehicle-mounted GPR system collects raw A-Scan traces simultaneously. Every 480 traces are assembled into a B-Scan image. The Hilbert transform is applied to each trace to compute the analytic signal envelope, generating a paired instantaneous amplitude (IA) image.
- Parallel Feature Extraction. Two independent ResNet-18 backbone networks extract multiscale feature maps from the B-Scan and IA images separately, preserving each modality's unique characteristics without cross-contamination.
- Multiscale Edge Information Enhancement (MSEIE). Before fusion, each branch's feature maps pass through a multiscale edge enhancement module that extracts and amplifies high-frequency boundary information at four spatial scales, improving detection of small and weak targets.
- Modality Difference Modeling. The MCFM computes the element-wise difference between the edge-enhanced B-Scan and IA features, applies global average pooling and a Sigmoid activation to generate channel-wise attention weights, and uses those weights to modulate how much complementary information from each modality corrects and enriches the other.
- Small Object Enhance Pyramid (SOEP). To address the detection of small subsurface targets (most bounding boxes in the dataset are under 80 pixels wide), Fuse-DETR adds a shallow S2 feature branch using SPDConv, a space-to-depth convolution that rearranges fine spatial detail into channel depth rather than discarding it through strided downsampling, and fuses it with deeper feature layers through a CSP-OmniKernel module.
- RT-DETR Detection Head. The fused multiscale features feed into a Real-Time Detection Transformer head that performs end-to-end target localization and classification, outputting bounding boxes and confidence scores for seven target categories: regular hollow, irregular hollow, generally loose, severely loose, pipeline, pipe gallery, and rebar.
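The fusion step (Modality Difference Modeling, above) can be caricatured in a few lines of NumPy. This is an illustrative sketch only — the actual MCFM operates on learned convolutional feature maps and trains its weights end-to-end — but it shows the shape of the computation: difference, global pool, sigmoid, channel-wise cross-modal correction:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def mcfm_fuse(feat_bscan: np.ndarray, feat_ia: np.ndarray) -> np.ndarray:
    """Toy modality-difference fusion over feature maps of shape (C, H, W).

    1. Element-wise difference isolates what one modality sees
       that the other misses.
    2. Global average pooling + sigmoid turns that difference into
       per-channel attention weights.
    3. Each branch is enriched by the attention-weighted other branch.
    """
    diff = feat_bscan - feat_ia                       # modality difference
    pooled = diff.mean(axis=(1, 2))                   # global avg pool -> (C,)
    w = sigmoid(pooled)[:, None, None]                # channel attention
    b_enriched = feat_bscan + w * feat_ia             # IA corrects B-Scan
    ia_enriched = feat_ia + (1.0 - w) * feat_bscan    # and vice versa
    return b_enriched + ia_enriched                   # fused representation

rng = np.random.default_rng(0)
fb = rng.normal(size=(8, 16, 16))  # stand-in B-Scan branch features
fi = rng.normal(size=(8, 16, 16))  # stand-in IA branch features
fused = mcfm_fuse(fb, fi)          # same (C, H, W) shape as each input
```

The design intuition: channels where the two modalities agree contribute symmetrically, while channels where they disagree let the more informative modality dominate the fused feature.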
A Dataset Built on Drilling
Any machine learning system is only as good as the data it was trained on — and in GPR target detection, dataset quality is a persistent problem. Subsurface targets are invisible from the surface. Annotation requires either expert interpretation of radar imagery (inherently subjective and error-prone) or physical ground-truth verification through drilling. Most published GPR datasets rely on the former. The Xinjiang University team chose the harder path.
After collecting their 100-kilometer Urumqi road survey using an LTD-60 Road Comprehensive Diagnosis System equipped with 270 MHz and 400 MHz antenna arrays, the team randomly sampled error-prone targets, used integrated GPS-RTK positioning to locate them on the surface, and dispatched crews to drill boreholes and inspect subsurface conditions with an endoscope. Cavity depth was measured with a ruler. Annotation bounding boxes were adjusted based on physical findings. The resulting dataset — 2,158 annotated images with 3,568 bounding boxes across seven target categories — is the first multimodal paired GPR dataset for urban roads with drilling-verified ground truth.
The class imbalance is striking and realistic: rebar accounts for 1,384 bounding boxes while irregular voids account for just 101. This reflects actual urban road conditions and puts significant demands on any detection algorithm. The team supplemented the real-world data with 3,500 synthetic pairs generated using the gprMax finite-difference time-domain electromagnetic simulator — a standard tool in the GPR community for physics-accurate signal modeling — and validated cross-dataset performance with a separate hand-pushed GPR survey dataset collected with a 400 MHz single-channel system along orthogonal road transects, with each target confirmed by on-site drilling.
Beating the State of the Art
The performance comparisons in the paper are comprehensive and methodologically careful. All single-modality comparison models — Faster R-CNN, DAB-DETR, Deformable-DETR, RT-DETR (the baseline architecture Fuse-DETR is built on), YOLOX, YOLOv8, YOLO11, and the recently published GPR-specific GN-YOLO — were trained exclusively on B-Scan images, the current industry standard. Fuse-DETR alone received paired B-Scan and IA inputs.
The results are unambiguous. The single-modality RT-DETR baseline — itself a strong performer — achieved an mAP of 0.771, recall of 0.870, and an F1-Score of 0.824 at 118 frames per second. Fuse-DETR improved every metric while still running at a real-time 43 FPS: mAP of 0.816, recall of 0.898, F1-Score of 0.856. Perhaps most telling, an IA-only configuration of RT-DETR dramatically underperformed the B-Scan model (mAP 0.588 vs. 0.771), demonstrating that the IA channel alone is insufficient — it is the fusion of both that delivers the improvement. The paper also benchmarks against multimodal fusion models borrowed from the infrared/visible-light computer vision domain (ICAFusion, IC-Fusion, CMAF-Fusion), reporting that Fuse-DETR outperforms all of them with fewer parameters and lower computational cost.
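Since the paper reports recall and F1 together, the implied precision falls out of the F1 definition, F1 = 2PR/(P + R). A quick back-of-the-envelope inversion (our arithmetic, not a figure reported in the paper):

```python
def implied_precision(f1: float, recall: float) -> float:
    """Invert F1 = 2PR/(P+R) to recover precision from reported F1 and recall."""
    return f1 * recall / (2 * recall - f1)

# Fuse-DETR: F1 0.856 at recall 0.898 implies precision ~0.818.
# RT-DETR baseline: F1 0.824 at recall 0.870 implies precision ~0.783.
print(round(implied_precision(0.856, 0.898), 3))  # → 0.818
print(round(implied_precision(0.824, 0.870), 3))  # → 0.783
```

In other words, the fusion buys gains on both axes at once: higher recall and higher implied precision than the single-modality baseline.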
The ablation studies — systematic experiments that isolate the contribution of each architectural component — are particularly illuminating for engineers. Adding the dual-branch structure alone (without the MCFM or SOEP) over the B-Scan-only baseline improved mAP from 0.771 to 0.787. Adding MCFM raised it to 0.792. Adding SOEP for small-target enhancement brought it to the final 0.816. Each module earns its keep.
The Broader Wave
The Fuse-DETR paper arrives at the crest of a broader wave of AI-driven GPR research. A July 2025 comprehensive review published in Applied Sciences surveyed AI-driven GPR interpretation across utility detection, infrastructure monitoring, archaeology, and environmental studies, finding that deep learning has moved from promising experiments to production-ready results in a range of domains. Convolutional neural networks — particularly YOLO variants and Faster R-CNN — have become the dominant tools for hyperbola detection and subsurface object localization in B-Scan imagery, achieving real-time performance suitable for field deployment.
A parallel 2025 study in Data Brief introduced a publicly available GPR dataset specifically designed for deep learning–based detection of subsurface utilities and voids, addressing the persistent scarcity of labeled training data that has historically constrained the field. A separate research group published a novel 3DReconNet in late 2025, reformulating the subsurface reconstruction problem as a semantic segmentation task — predicting the material type of each underground voxel from GPR data — achieving simultaneous geometry recovery and material identification. At the University of Melbourne, researchers applied YOLOv8, YOLOv11, and Mask R-CNN to 3D reconstruction of buried linear utilities from high-resolution B-Scan data, with Mask R-CNN achieving the highest keypoint F1-score (0.822) for precise pipe localization.
In pavement defect modeling, a March 2026 study published in Applied Sciences demonstrated how Cycle-Consistent Generative Adversarial Networks (CycleGANs) can bridge the domain gap between physics-simulated GPR images and real-world data — a critical capability for expanding training datasets without expensive field surveys. The simulation-to-reality transfer approach, tested against four YOLO variants, showed that domain adaptation substantially reduces the amount of real annotated data required to achieve high detection accuracy.
"Recall has greater practical engineering significance than aggregate evaluation metrics such as mAP or F1-Score" in safety-critical GPR detection tasks.
— Sun et al., IEEE Transactions on Geoscience and Remote Sensing, 2026

Limitations and the Road Ahead
The Fuse-DETR framework is not without limitations, and the authors are candid about them. The model's 47.9 million parameters make it considerably larger than lightweight YOLO-family detectors (YOLO11 manages comparable tasks with just 2.62 million parameters), and its 43 FPS throughput, while real-time by most standards, lags well behind YOLO's 197–238 FPS. For vehicle-mounted inspection at low survey speeds, this is not a practical constraint — but for future applications demanding extremely high-speed or embedded-platform deployment, lightweighting will be essential.
Cross-dataset performance also shows expected degradation on the hand-pushed resurvey dataset, attributable to two factors the authors carefully diagnose: nonuniform manual walking speed causing spatial sampling inconsistencies, and hardware differences between the vehicle-mounted and hand-pushed systems (different antenna frequencies, bandwidths, and antenna geometries), which shift the feature distributions the model was trained on. Domain adaptation and transfer learning — active research areas in the broader computer vision community — are likely paths to addressing this gap.
The authors outline a clear research agenda: model lightweighting through compact backbone design, channel pruning, and optimized cross-modal fusion; extension to additional modalities (instantaneous phase and instantaneous frequency, which prior work suggests provide complementary information to instantaneous amplitude); and 3D volumetric detection using multi-channel antenna arrays that can image subsurface volumes rather than 2D cross-sections.
Meanwhile, cities are not waiting for perfection. Seoul's GPR cavity detection program, Japan's national sinkhole prevention initiative, and the growing adoption of vehicle-mounted GPR systems by municipalities from Hong Kong to New Jersey all reflect a deepening recognition that the cost of subsurface ignorance — measured in emergency repairs, insurance claims, lawsuits, and lives — far exceeds the cost of regular automated inspection. As AI systems like Fuse-DETR continue to mature, the gap between what radar can see and what algorithms can find is closing fast. The road above may look fine. But now, at least, we're getting much better at knowing what's happening below.
Verified Sources & Formal Citations
- Sun, F., Zhang, X., Gao, Y., & Chen, W. (2026). "A Ground-Penetrating Radar Target Detection Method Using Dual-Attribute Feature Fusion." IEEE Transactions on Geoscience and Remote Sensing, Vol. 64, Article 5906819. DOI: 10.1109/TGRS.2026.3677660. Published March 25, 2026. https://ieeexplore.ieee.org/document/10.1109/TGRS.2026.3677660
- Jafuno, D., Mian, A., Ginolhac, G., & Stelzenmuller, N. (2025). "Bridging Theory and Practice: A Review of AI-Driven Techniques for Ground Penetrating Radar Interpretation." Applied Sciences, 15(15), 8177. DOI: 10.3390/app15158177. Published July 23, 2025. https://www.mdpi.com/2076-3417/15/15/8177
- Benali, A., et al. (2025). "Intelligent recognition of subsurface utilities and voids: A ground penetrating radar dataset for deep learning applications." Data in Brief, 59, 111338. DOI: 10.1016/j.dib.2025.111338. Published January 28, 2025. https://www.sciencedirect.com/science/article/pii/S2352340925000708
- Cheng, Z., He, Z., & Pan, P. (2025). "3D reconstruction of subsurface pipes and cavities using ground penetrating radar based on deep learning." NDT & E International, 158, 103579. Published October 2025. https://www.sciencedirect.com/science/article/abs/pii/S0963869525002609
- Corradini, E., et al. (2025). "Deep Learning and Geometric Modeling for 3D Reconstruction of Subsurface Utilities from GPR Data." Remote Sensing, PMC12567710. Published October 2025. https://pmc.ncbi.nlm.nih.gov/articles/PMC12567710/
- Yang, J., Yu, S., Yao, Y., Cao, S., & Ai, X. (2026). "From Simulation to Reality: GAN-Based Transformation of Pavement Defect Images for YOLO Detection." Applied Sciences, 16(6), 2978. DOI: 10.3390/app16062978. Published March 19, 2026. https://www.mdpi.com/2076-3417/16/6/2978
- Zheng, J., et al. (2025). "Susceptibility mapping and risk assessment of urban sinkholes based on grey system theory." Bulletin of Engineering Geology and the Environment. Published 2025. https://www.sciencedirect.com/science/article/abs/pii/S0886779824003110
- Abubakar, A., et al. (2026). "Proactive sinkhole risk assessment in urban asphalt pavements using wavelet transform and the international roughness index." Automation in Construction. Published February 2026. [Notes Japan Ministry and Seoul Metropolitan Government programs.] https://www.sciencedirect.com/science/article/abs/pii/S0926580526000592
- Zhao, Y., et al. (2024). "DETRs Beat YOLOs on Real-Time Object Detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16965–16974. [RT-DETR baseline architecture underlying Fuse-DETR.] https://openaccess.thecvf.com/CVPR2024
- Screening Eagle Technologies. (2025). "Prevent Sinkholes and Minimize Risks of Geohazards with Early Detection of Cavities." Technical Application Note, GS8000/GS9000 GPR Systems. Published August 2025. https://www.screeningeagle.com/en/inspection/prevent-sinkholes-with-early-cavity-detection
- ITpipes. (2025). "How Failing Infrastructure Triggers Sinkholes: What Every Sewer and Water Crew Should Know." Published October 2025. https://itpipes.com/blog/how-failing-infrastructure-triggers-sinkholes-what-every-sewer-and-water-crew-should-know/
- Yan, T., Yang, J., Liu, Z., & Peng, A. (2018). "Application of instantaneous amplitude gradient for ground penetrating radar signal analyses." Arabian Journal of Geosciences, 11(20), 636. [Foundational work on instantaneous attributes cited in Fuse-DETR.] https://doi.org/10.1007/s12517-018-3979-0
- Zhang, H., et al. (2025). "A comprehensive review of data processing and target recognition methods for ground penetrating radar underground pipeline B-scan data." Discover Applied Sciences, Springer Nature. Published April 6, 2025. https://link.springer.com/article/10.1007/s42452-025-06791-y
