Tuesday, April 9, 2024

A high-performance universal miniature SAR/GMTI and space-borne imaging radar system


T. Jin, H. -X. Wang and H. -W. Liu, "A high-performance universal miniature radar system," 2016 CIE International Conference on Radar (RADAR), Guangzhou, China, 2016, pp. 1-5, doi: 10.1109/RADAR.2016.8059594.

Abstract: This paper proposes the design and realization of a high-performance universal miniature radar system. It presents a sound solution to the main challenges of the radar system, including the extremely large data flow and calculation burden, the traditional custom-built pattern of radar systems, and the strict limits on size, weight and power consumption for airborne or space-borne real-time Synthetic Aperture Radar (SAR) signal processing systems. The system has shown the virtues of standardization, modularization, stability, reconfigurability, and good adaptability, owing to the combined use of a distributed parallel architecture, the latest interconnection standard, and the latest processors. Successful applications in airborne SAR/GMTI and space-borne imaging demonstrate its high performance, universality, and miniature size.
 
Date of Conference: 10-13 October 2016
Date Added to IEEE Xplore: 05 October 2017
Publisher: IEEE
Conference Location: Guangzhou, China 
 

Authors

The senior author, Hongwei Liu (Senior Member, IEEE), received the M.S. and Ph.D. degrees in electronic engineering from Xidian University, Xi’an, China, in 1995 and 1999, respectively. He worked with the National Laboratory of Radar Signal Processing, Xidian University.

From 2001 to 2002, he was a Visiting Scholar at the Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA. 

He is currently a Professor with the National Laboratory of Radar Signal Processing, Xidian University. His research interests include radar automatic target recognition, radar signal processing, and adaptive signal processing. (Based on a document published on 7 December 2023.)

Summary 

The paper proposes the design and realization of a high-performance universal miniature radar system. The key features of the system include:

  1. A distributed parallel processing architecture using DSP+FPGA structure to handle the extremely huge data flow and computational burden of radar signal processing.
  2. Modularity and reconfigurability achieved through a standardized 3U VPX form factor and interconnection framework, allowing easy updating and reconstruction of the system.
  3. Miniaturization to meet size, weight, and power constraints for airborne and space-borne applications through careful board-level design.
  4. A multi-layer interconnection structure using high-speed serial networks (RapidIO, PCIe), synchronization timing buses, and control buses to facilitate data transfer and system coordination.
  5. Successful application in real-time airborne SAR imaging and space-borne SAR imaging, demonstrating its high performance, universality, and miniature properties.

The paper describes the system architecture, interconnection theory, signal processing units, data flows, imaging algorithms, and resource utilization details. The proposed system offers a flexible and scalable solution for diverse radar applications with demanding performance requirements.

Data Flow and Computational Burden

The paper discusses the extremely huge data flow and computational burden associated with radar signal processing, especially for real-time applications like synthetic aperture radar (SAR) imaging. It highlights these challenges as the main motivation for developing a high-performance distributed parallel processing architecture.

Regarding the data flow, the paper mentions that in real-time systems, the speed of signal processing must be faster than the signal acquisition rate to ensure that all continuous echo data can be processed. It gives an example that in a large-scale spotlight SAR mode, each processing node finishes processing one real-time image of 32K*16K (4GB) complex points within 23.89 seconds.

As for the computational burden, the paper states that although chip technology and processing power have increased, a single chip still cannot satisfy the operation requirements of 10 GFLOPS or even 100 GFLOPS needed in real-time imaging cases. Therefore, parallel processing becomes imperative to achieve the required high performance. 

The proposed system employs a DSP+FPGA architecture. Each DSP (TMS320C6678) provides:

  • Eight TMS320C66x DSP core subsystems at 1.00 GHz or 1.25 GHz
  • 320 GMACS / 160 GFLOPS of peak performance at 1.25 GHz when all 8 cores are fully utilized
  • 32 KB L1P, 32 KB L1D, 512 KB L2 per core
  • 4 MB shared L2

In summary, the huge data volumes (multiple gigabytes) generated from radar echoes and the demand for gigaflop-scale processing capabilities, especially in real-time SAR imaging modes, necessitate the high-performance distributed parallel processing approach taken in this radar system design.
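The frame size and per-node throughput quoted above can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check: the 8 bytes per complex point (two 32-bit floats) is an assumption that is not stated in the paper, chosen because it is what makes 32K*16K points come to 4 GB, and the function names are illustrative.

```python
# Rough sizing of one spotlight-mode frame, using figures quoted from the paper.
# Assumption (not stated in the paper): each complex point is stored as two
# 32-bit floats, i.e. 8 bytes per point.

K = 1024
BYTES_PER_COMPLEX_POINT = 8  # assumed: 2 x float32


def frame_bytes(rows: int, cols: int,
                bytes_per_point: int = BYTES_PER_COMPLEX_POINT) -> int:
    """Size in bytes of a rows x cols frame of complex points."""
    return rows * cols * bytes_per_point


def sustained_rate_gib_per_s(total_bytes: int, seconds: float) -> float:
    """Average data rate needed to finish total_bytes in the given time."""
    return total_bytes / seconds / 2**30


spotlight = frame_bytes(32 * K, 16 * K)
print(spotlight / 2**30, "GiB per frame")
print(round(sustained_rate_gib_per_s(spotlight, 23.89), 3), "GiB/s per node")
```

Under that assumption, one node has to sustain well under 1 GiB/s of image data on average, which is small compared with the multi-gigabit links described later. The compute load, not the raw data rate, is the harder constraint.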

High-Performance Distributed Parallel Processing Architecture

The proposed high-performance distributed parallel processing architecture for the miniature radar system is based on a combination of Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA) processors, interconnected through a multi-layer network structure.

  1. DSP+FPGA Structure:
    • The system employs a DSP+FPGA structure to handle signal processing tasks.
    • FPGAs (Xilinx Virtex-6 XC6VLX240T) are used for pre-processing and relatively simple but computationally intensive operations like multiplication, accumulation, and FFT.
    • DSPs (Texas Instruments TMS320C6678) handle more complex arithmetic operations and high-level processing.
    • Each signal processing board contains 2 DSPs, each with 2GB DDR3 memory.
  2. Distributed Parallel Structure:
    • The system adopts a distributed parallel structure where each processing node has its own physically distributed memory.
    • Multiple processing nodes are combined through a high-bandwidth, low-latency, customized communication network to form a larger processing scale.
    • This structure allows for coarse-grained processing and flexible system framework, enabling easy scalability by adding or removing processors.
  3. Virtual Single Node:
    • To meet real-time requirements and handle the huge amount of echo data in high-resolution SAR modes, the system implements a virtual single node concept.
    • Each signal processing board is considered a virtual single node, consisting of two DSPs connected by a high-speed Hyperlink interface (12.5 Gbps).
    • The virtual nodes process data frames independently in a pipelined parallel manner, while each DSP within a node processes a portion of the data frame in parallel.
  4. Multi-Layer Interconnection:
    • The system features a multi-layer interconnection network to facilitate data transfer and coordination among various modules.
    • High-speed serial networks like RapidIO (6.25 Gbps) and PCIe (5 Gbps) are used for transferring raw data, pre-processed data, result data, and control instructions.
    • A synchronization timing network ensures strict synchronization among different parts of the system.
    • A control bus is used for low-latency, real-time control signal transfer.

This distributed parallel processing architecture, combined with the multi-layer interconnection network, enables the system to handle the demanding computational requirements of real-time radar signal processing while maintaining flexibility and scalability.

Node Capacity

Based on the information provided in the paper, the capacity per node can be analyzed from two aspects: the processing capability and the storage capacity.

  1. Processing Capability per Node:
    • Each signal processing board, which is considered a virtual single node, consists of 2 DSP chips (TMS320C6678).
    • Each TMS320C6678 DSP has 8 cores running at up to 1.25 GHz, providing a peak performance of 320 GMACS (Giga Multiply-Accumulate operations per second) or 160 GFLOPS (Giga Floating-Point Operations per Second).
    • With 2 DSPs per node, the total processing capability of a single node is 640 GMACS or 320 GFLOPS.
  2. Storage Capacity per Node:
    • Each DSP on the signal processing board is equipped with 2 GB of DDR3 memory.
    • With 2 DSPs per node, the total storage capacity of a single node is 4 GB.

The paper also provides examples of the processing capacity per node in different SAR imaging modes:

  • In the large-scale spotlight mode, each processing node finishes one real-time image of 32K*16K (4 GB) complex points within 23.89 seconds.
  • In the detailed strip mode, each processing node finishes a dual-polarization real-time image, with each DSP responsible for one polarization image (16K*16K complex points, 4 GB total for the dual-polarization image) within 23.76 seconds.

It's important to note that the system is designed to be scalable, allowing for the addition of more processing nodes to handle more complex algorithms and imaging modes as needed.
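The per-node totals above follow directly from the per-core figures. A minimal sketch of that arithmetic is given below; the per-cycle throughput values (16 FLOPs and 32 MACs per core per cycle) are not quoted in the paper and are back-derived here from the stated 160 GFLOPS and 320 GMACS at 1.25 GHz across 8 cores. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DspSpec:
    cores: int = 8
    clock_ghz: float = 1.25
    # Assumed per-cycle throughput, back-derived from the quoted device peaks:
    flops_per_cycle_per_core: int = 16  # 160 GFLOPS / (8 cores * 1.25 GHz)
    macs_per_cycle_per_core: int = 32   # 320 GMACS  / (8 cores * 1.25 GHz)
    ddr3_gb: int = 2

    def peak_gflops(self) -> float:
        return self.cores * self.clock_ghz * self.flops_per_cycle_per_core

    def peak_gmacs(self) -> float:
        return self.cores * self.clock_ghz * self.macs_per_cycle_per_core


def node_capacity(dsp: DspSpec, dsps_per_node: int = 2) -> dict:
    """Peak (not sustained) capacity of one virtual single node."""
    return {
        "gflops": dsp.peak_gflops() * dsps_per_node,
        "gmacs": dsp.peak_gmacs() * dsps_per_node,
        "ddr3_gb": dsp.ddr3_gb * dsps_per_node,
    }


print(node_capacity(DspSpec()))
```

These are peak numbers. Sustained throughput on real SAR kernels will be lower and depends on memory access patterns, which the paper does not quantify.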

Scalability to Radar Requirements

To assess how well the proposed architecture meets the radar requirements and its efficiency in combining node capacity, we need to compare the system's capabilities with the computational demands of the radar signal processing tasks.

  1. Meeting Radar Requirements:
    • The paper mentions that real-time radar signal processing, especially in SAR imaging cases, requires operation capabilities of 10 GFLOPS or even 100 GFLOPS.
    • A single node in the proposed system provides a peak performance of 320 GFLOPS, which is more than sufficient to meet the mentioned requirements.
    • The system's ability to process large-scale spotlight mode images (32K*16K complex points) and detailed strip mode dual-polarization images (16K*16K complex points per polarization) within around 24 seconds demonstrates its capability to handle demanding radar signal processing tasks in real-time.
  2. Efficiency in Combining Node Capacity:
    • The distributed parallel processing architecture allows for efficient combination of node capacities through the multi-layer interconnection network.
    • The high-speed serial networks (RapidIO at 6.25 Gbps and PCIe at 5 Gbps) provide sufficient bandwidth for data transfer among processing nodes and storage modules, minimizing communication bottlenecks.
    • The virtual single node concept, where each node consists of two DSPs connected by a high-speed Hyperlink interface (12.5 Gbps), enables efficient parallel processing within a node.
    • The pipelined parallel processing approach, where each node processes data frames independently, further enhances the overall system efficiency.
  3. Scalability and Resource Utilization:
    • The modular design and distributed architecture allow for easy scalability by adding more processing nodes to meet increasing computational demands.
    • The paper mentions that the processing capability, AD/DA working speed, I/O module usage, and high-speed bus bandwidth are not fully utilized in the presented application cases, indicating that the system has the potential to handle more complex algorithms and imaging modes.

In summary, the proposed high-performance distributed parallel processing architecture efficiently combines node capacities to meet and exceed the computational requirements of radar signal processing tasks. The system's scalability and resource utilization efficiency make it adaptable to various radar applications with demanding performance needs.
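One way to sanity-check the claim that the links are not the bottleneck is to estimate how long a 4 GB frame takes to cross a single serial lane. The sketch below makes several assumptions not given in the paper: one lane per link, 8b/10b line encoding (80% efficiency), and no protocol overhead. The lane count actually used in the system is not stated, so treat the results as rough upper bounds on transfer time.

```python
def transfer_seconds(payload_bytes: int, line_rate_gbps: float,
                     lanes: int = 1, encoding_efficiency: float = 0.8) -> float:
    """Time to move payload_bytes over a serial link.

    lanes and encoding_efficiency are assumptions: the paper gives only
    the line rates (6.25 Gbps RapidIO, 5 Gbps PCIe).
    """
    usable_bits_per_s = line_rate_gbps * 1e9 * lanes * encoding_efficiency
    return payload_bytes * 8 / usable_bits_per_s


FRAME_BYTES = 4 * 2**30  # one 4 GB image frame

srio_s = transfer_seconds(FRAME_BYTES, 6.25)
pcie_s = transfer_seconds(FRAME_BYTES, 5.0)
print(round(srio_s, 2), "s over one RapidIO lane")
print(round(pcie_s, 2), "s over one PCIe lane")
```

Even with a single lane, both estimates come out well below the roughly 24 seconds a node spends processing a frame, which is consistent with the paper's statement that bus bandwidth is not fully utilized.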

SWaP

The paper does not provide explicit details about the size, weight, and power (SWaP) requirements per node or for the full radar system. However, it does mention that miniaturization is a key design consideration, especially for airborne and space-borne applications where SWaP constraints are critical.

  1. Size and Form Factor:
    • The entire system hardware boards use the 3U VPX standard, which specifies a board size of 100 mm by 160 mm.
    • The use of this compact form factor contributes to the system's miniaturization goals.
    • However, the exact dimensions of the complete system, including the housing and cooling components, are not specified.
  2. Weight:
    • The paper does not provide any information about the weight of the individual nodes or the complete radar system.
    • However, it mentions that the system is designed to meet the strict limitations on weight for airborne and space-borne applications.
  3. Power Requirements:
    • The power subsystem is mentioned as providing stable, configurable, and multiple power supplies to the system, allowing for easy adjustment through software programming according to different application cases.
    • However, the paper does not specify the actual power consumption per node or the total power requirements for the full radar system.

VPX (Virtual Path Cross-Connect), also known as VITA 46, is a set of standards for connecting components of a computer (known as a computer bus), commonly used by defense contractors. Some are ANSI standards such as ANSI/VITA 46.0–2019. VPX provides VMEbus-based systems with support for switched fabrics over a new high speed connector. Defined by the VMEbus International Trade Association (VITA) working group starting in 2003, it was first demonstrated in 2004, and became an ANSI standard in 2007. The VPX standard was updated in 2013 and 2019. Technologies in VPX include:

  • Both 3U and 6U formats
  • New 7-row high speed connector rated up to 6.25 Gbit/s
  • Choice of high speed serial fabrics
  • PMC, FMC (VITA 57), and XMC (VITA 42) mezzanines
  • Hybrid backplanes to accommodate VME64, VME320 VXS, and VPX boards
  • VPX - bus to bus bridges

The 3U VPX form factor is compact and extremely well suited for avionics, including UAVs, shipboard, satellite, and airborne radar and signal intelligence applications. 3U VPX dimensions are 100 mm height in a 5.25 in (133.35 mm) enclosure and 6U VPX dimensions are 233.35 mm height in a 10.5 in (266.70 mm) high enclosure.

SECTION I. Introduction

With the improving quality and widespread application of radar systems, including geological mapping, marine research, and military surveillance, higher and stricter requirements have been put forward.

A high-performance radar system is urgently needed to cope with the large data flow and calculation burden, especially in real-time signal processing. The selection of chips, the design of the processing structure, and the realization of the interconnection framework all directly influence system performance.

Achieving universality has two aspects. One is breaking away from the traditional mode in which the design of a radar system is dictated by the algorithm. The other is building a universal radar system to lower design cost and shorten the design cycle. From the software aspect, universality means providing a hardware platform on which algorithms of diverse arithmetic complexity and different data granularity perform well. From the hardware aspect, universality can be obtained through modularized design. Namely, we can design and optimize each unit of the radar system, such as the signal processing part, the AD/DA part, and the storage part, separately. Then, according to the different functional characteristics and design ideas, diverse radar systems can be built by extending, reconstructing, or updating these modularized units. Modularized design benefits system universality, extension, flexibility, and reconstruction, and in particular substantially reduces design cost and cycle time.

Because of the limits on weight, volume, and power consumption imposed by special application platforms such as space-borne and airborne platforms, especially UAVs (Unmanned Aerial Vehicles), miniaturization is necessary. Therefore, board-level details such as board layout and PCB routing need to be specially designed.

To address the issues mentioned above, this paper discusses a method of designing and realizing a high-performance universal miniature radar system.

SECTION II. Theory Analysis of System Structure

A. Module of Parallel Structure

Although chip technology and processing power have steadily improved, a single chip still cannot satisfy the operation requirements of 10 GFLOPS or even 100 GFLOPS in real-time imaging cases. Parallel processing is therefore imperative for high performance. The parallel processing structure, which is mainly embodied in chip-level and system-level parallelism, directly determines the performance of the system. The two most common kinds of parallel processing structure are shown below (P: processor, M: memory).

Fig. 1. Shared bus structure and distributed bus structure

In the shared bus structure, every processor can equally access the entire shared memory space through a high-speed bus. Concretely, the shared memory can be accessed by all the processors simultaneously. It suits fine-grained, small-scale parallel processing. However, as the number of processors increases and data exchange among processors becomes more frequent, bus contention creates a bottleneck for data communication. Moreover, the bus lacks scalability: once it has been built, it is hard to expand.

In the distributed parallel structure, every processing node has its own physically distributed memory. Multiple processing nodes can form a larger processing scale by being combined through a network with high communication bandwidth and low-latency customized communication links. The distributed parallel structure is suited to coarse-grained processing, allows the system framework to be designed flexibly, and makes it convenient to add or remove processors. In fact, the large-scale distributed system in Fig. 2 may consist of multiple independent distributed systems[7].

Fig. 2. Large-scale distributed system structure

From the analysis above, it is clear that the distributed parallel structure better meets the demands of scalability, flexibility, and high performance. Our system is therefore built on the distributed parallel structure.
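The contention argument can be illustrated with a deliberately simple toy model. It is not taken from the paper: the bandwidth figure is arbitrary, the shared bus is assumed to be divided evenly among processors, and the cost of inter-node communication in the distributed case is ignored.

```python
# Toy comparison of per-processor memory bandwidth (illustrative numbers only).

def shared_bus_share(bus_gb_per_s: float, processors: int) -> float:
    """Best-case bandwidth per processor when all contend for one bus."""
    return bus_gb_per_s / processors


def distributed_share(local_gb_per_s: float, processors: int) -> float:
    """Each node owns its memory, so its local bandwidth does not shrink."""
    return local_gb_per_s


for n in (1, 2, 4, 8):
    print(n, shared_bus_share(8.0, n), distributed_share(8.0, n))
```

The model only captures the trend described in the text: per-processor bandwidth on a shared bus falls as processors are added, while local memory bandwidth in a distributed structure stays constant. A distributed system pays for this through its interconnection network, which is the subject of the next subsection.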

B. Interconnection Structure

In large-scale, complex distributed parallel systems, multiple data streams such as the high-speed original data streams, pre-processing data streams, and resulting data streams need real-time transactions. Meanwhile, diverse synchronization and control signal flows exist among multiple tasks or processors. Their different transmission bandwidth and delay characteristics place many requirements on the interconnection structure.

Therefore, to handle the system's multiple processing nodes, high-speed serial data interconnection, and variety of data streams and signal flows, we propose a distributed system architecture based on VPX. It offers bandwidth up to 6.25 Gbps, uses a multi-switched network structure, and supports interoperability among modules from different manufacturers. Its emergence is of great importance for the development of radar systems.

SECTION III. System Structure

The high-performance miniature universal radar system consists of digital processing units, which include the signal acquisition (AD) and storage module, the signal processing module, the waveform generation (DA) module, the power-supply module, and the system console. There are also diverse data communication networks, such as the high-speed serial data switching network, strict synchronization timing buses, and the low-latency real-time control signal transfer network. All system hardware boards use the 3U VPX standard (100 mm by 160 mm).

Moreover, the functional modules are each mapped to separate boards, following the principle of modularization and reconfiguration. This makes updating and reconstructing the system much easier.

A. AD/DA High Frequency Module

  • The ADC module contains 2 AD chips, each of which can reach 1.8 GSPS (1 channel) or 3.6 GSPS (2 channels). The AD module as a whole can therefore reach 7.2 GSPS (4 channels) at 12-bit precision for system data sampling and collection.

  • The DAC module contains 2 DA chips responsible for waveform generation, each with a rate up to 2.5 GSPS at 14-bit precision, so the DA module as a whole can reach 5 GSPS (2 channels) at 14 bits.
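The aggregate sample rates above translate into raw bit rates as follows. This is a minimal sketch that assumes samples are tightly packed with no padding and that the converters run continuously; neither is stated in the paper, and in practice 12-bit or 14-bit samples are often padded to 16 bits and radar receivers typically record only part of each pulse interval.

```python
def raw_rate_gbit_per_s(gsps: float, bits_per_sample: int) -> float:
    """Raw converter data rate, assuming tightly packed samples."""
    return gsps * bits_per_sample


def raw_rate_gbyte_per_s(gsps: float, bits_per_sample: int) -> float:
    return raw_rate_gbit_per_s(gsps, bits_per_sample) / 8


adc_gbps = raw_rate_gbit_per_s(7.2, 12)  # 4 channels x 1.8 GSPS, 12 bits
dac_gbps = raw_rate_gbit_per_s(5.0, 14)  # 2 channels x 2.5 GSPS, 14 bits
print(round(adc_gbps, 1), "Gbit/s ADC,", round(adc_gbps / 8, 2), "GB/s")
print(round(dac_gbps, 1), "Gbit/s DAC,", round(dac_gbps / 8, 2), "GB/s")
```

At full rate the ADC side alone would produce on the order of ten gigabytes per second, which is why FPGA pre-processing sits between the converters and the rest of the system.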

Fig. 3. AD board and storage board

B. Storage Subsystem

There are 2 storage boards, each with a 1 GB/s storage rate and 2 TB of storage capacity. These key features make the radar system suitable for a wide range of applications.
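The storage figures give a rough upper bound on continuous recording time. The sketch below assumes that the quoted capacity means 2 TB (decimal) per board, that both boards write in parallel at their full 1 GB/s, and that there is no file-system overhead; none of this is confirmed by the paper.

```python
def recording_seconds(boards: int, capacity_tb_per_board: float,
                      rate_gb_per_s_per_board: float) -> float:
    """Time to fill all boards when they write in parallel at full rate."""
    total_gb = boards * capacity_tb_per_board * 1000  # decimal TB to GB
    total_rate = boards * rate_gb_per_s_per_board
    return total_gb / total_rate


seconds = recording_seconds(boards=2, capacity_tb_per_board=2.0,
                            rate_gb_per_s_per_board=1.0)
print(seconds, "s, about", round(seconds / 60, 1), "minutes")
```

Under these assumptions the storage subsystem fills in roughly half an hour of continuous full-rate recording.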

C. Signal Processing Unit

  • DSP+FPGA Structure

In a radar signal processing system, there are always some large-volume but relatively simple operations such as multiplication, accumulation, or FFT. Their strict demands on processing speed and their computational load make them suitable for implementation on an FPGA. Complex arithmetic, by contrast, is better suited to a DSP chip, which offers high operation speed, flexible addressing modes, and powerful communication mechanisms. We therefore choose the DSP+FPGA structure for signal processing: the FPGA is responsible for pre-processing, and the DSP accomplishes most of the high-level calculation.

  • Processing Capability

We select the TMS320C6678 as the basic processing node in the processing element. It is a high-performance fixed-point and floating-point DSP based on the C66x series. It integrates eight C66x cores into one device, with each core running at up to 1.25 GHz, so the device can reach a peak performance of 320 GMACS or 160 GFLOPS[3]. As the coprocessor, we select the large-scale Virtex-6 series FPGA XC6VLX240T produced by Xilinx Inc., which integrates 768 dedicated multipliers. A single signal processing board carries 2 DSPs, each with 2 GB of DDR3 memory.

Fig. 4. Signal processing board

D. Multi-Layer Interconnection

Through the system-level high-speed serial networks, the rich interfaces of the processing nodes, the synchronization timing and control buses, and the external interfaces connecting to the RF components or PC host, the radar system comprises networks at different layers. The multi-layer interconnection satisfies the corresponding transmission requirements of the different types of data streams and instruction flows.

Fig. 5. Structure of radar system

The connection to the RF components is provided by the RS422 bus connected to the AD module and the analog sampling interface connected to the DA module. Likewise, communication over the GbE bus between FPGA1 on the interface board and the PC implements human-computer interaction and rapid display of processing results such as the real-time SAR/GMTI image and GMTI detection results.

The configurable high-speed serial RapidIO (6.25 Gbps) and PCIe (5 Gbps) networks are mainly used to transfer the original data, pre-processing data, resulting data, and control instructions. They supply sufficient bandwidth for the processing elements and storage. As shown in Fig. 5, every module in the system, including the storage module, signal processing module, and AD/DA module, connects to FPGA1 (named Link Switch) on the interface board through SRIO and PCIe. Data can thus be exchanged among all modules of the system through the SRIO and PCIe networks under the control of the Link Switch. At the same time, the main control module can control every module of the system through this layer of interconnection.

The TMS320C6678 DSP has rich peripheral interfaces, which ensure flexible data transmission both within a board and between boards. Apart from the SRIO and PCIe interfaces, the GbE and GPIO interfaces of every DSP are connected to the Link Switch (FPGA1) on the interface board. The tight interconnection between the multiple DSP processors and the Link Switch makes the transfer of resulting data more flexible. In addition, the two C6678 devices on the same processing board are coupled through their Hyperlink interface (12.5 Gbps transmission rate). This interconnection structure makes it possible to form virtual processing nodes suitable for large-scale computation.

The synchronization timing network and the control bus, implemented on the backplane, are used respectively for strict synchronization timing control and for low-level hardware control instructions among the parts of the system.

Their close cooperation keeps the entire system in its intended state and working at the same pace.

In general, the multi-layer interconnection provides an excellent solution to the complex connection needs and underpins the realization of the high-performance universal miniature radar system.

E. Power Subsystem

The power subsystem provides the system with stable, configurable, multiple power supplies. It not only ensures normal system operation but can also be easily adjusted through software programming for different application cases.

F. Display and Console Software

To display results and provide console control, dedicated software platforms have been designed.

Fig. 6. Display and control software

SECTION IV. Other Design

A. Heat Dissipation Design

Both heat conduction through the cooler and forced-air cooling are used for heat dissipation. In addition, experiments have shown that the cooler withstands vibration and ruggedizes the system boards.

Fig. 7. System heat dissipation design

B. Assistant Debugging Software

To simplify debugging and accelerate the development process, we have also written a series of auxiliary debugging software tools.

Fig. 8. Assistant debugging software

SECTION V. Application

With the high-performance universal miniature radar system, different applications have been implemented, including GMTI, real-time airborne SAR imaging, and real-time space-borne SAR imaging. Below we present the implementation of the airborne real-time multi-mode, multi-polarization, high-performance SAR imaging system.

A. Imaging Algorithm

For the airborne platform, especially the UAV (Unmanned Aerial Vehicle) platform, which is easily disturbed by airflow and other adverse flight conditions, and considering computational complexity and stability, an improved RD algorithm [2] was chosen for the real-time imaging platform.

Fig. 9. Imaging algorithm

B. Data Flow

In a real-time system, signal processing must be faster than signal acquisition to ensure that all the continuous echo data can be processed. The virtual single node is therefore introduced to meet the real-time requirement and handle the huge amount of echo data in the high-resolution SAR mode. In our system, the signal processing module contains 2 processing boards, each regarded as a virtual single node. Each virtual single node consists of two DSPs connected by the Hyperlink. The processing procedure is shown in Fig. 10. The Ping node executes the same program as the Pong node and processes a data frame independently. These nodes work in a pipelined parallel manner. Meanwhile, inside each processing node, each DSP processes a portion of the whole data frame in parallel. The data flow of the system processing is shown in Fig. 11. This structure makes full use of the match between processing, transmission, and memory capability.
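The Ping/Pong arrangement described above can be sketched as a small scheduling model. This is an illustration only, not the authors' implementation: the 12-second frame period is a hypothetical value chosen for the example, the even row split between the two DSPs is an assumption, and all function names are invented here.

```python
from typing import List, Tuple

NODES = ("Ping", "Pong")  # the two virtual single nodes named in the text


def split_rows(rows: int, dsps: int = 2) -> List[Tuple[int, int]]:
    """Split a frame's row range into contiguous, near-equal parts, one per DSP."""
    bounds = [rows * i // dsps for i in range(dsps + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def schedule(num_frames: int, frame_period_s: float, proc_time_s: float):
    """Assign frames alternately to Ping and Pong; return (frame, node, start, finish)."""
    free_at = {node: 0.0 for node in NODES}
    plan = []
    for i in range(num_frames):
        node = NODES[i % len(NODES)]
        arrival = (i + 1) * frame_period_s   # frame i is fully acquired
        start = max(arrival, free_at[node])  # wait if the node is still busy
        finish = start + proc_time_s
        free_at[node] = finish
        plan.append((i, node, start, finish))
    return plan


def keeps_up(frame_period_s: float, proc_time_s: float,
             nodes: int = len(NODES)) -> bool:
    """Real-time condition: processing one frame fits within `nodes` frame periods."""
    return proc_time_s <= nodes * frame_period_s


# Hypothetical 12 s frame period; 23.89 s is the spotlight-mode time from the paper.
for frame, node, start, finish in schedule(4, frame_period_s=12.0, proc_time_s=23.89):
    print(frame, node, round(start, 2), round(finish, 2))
print(split_rows(32 * 1024))
print(keeps_up(12.0, 23.89), keeps_up(10.0, 23.89))
```

With two nodes working alternately, each node only has to finish a frame within two acquisition periods, which is how the pipeline relaxes the per-node real-time constraint.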

  • Spotlight 0.2m*0.2m single polarization mode

Fig. 10. Processing structure

Fig. 11. Data flow

In the large-scale spotlight mode, each processing node finishes one real-time image of 32K*16K (4GB) complex points within 23.89s.

  • Detailed strip 0.3m*0.3m dual-polarization mode

In the detailed strip mode, each processing node finishes a dual-polarization real-time image within 23.76s, with each DSP responsible for one polarization image (16K*16K complex points, 4GB total for the dual-polarization image).

Fig. 12. 0.2m*0.2m spotlight and 0.3m*0.3m strip mode fusion images

C. Resources Utilization

The utilization ratios of multiple hardware resources under different system working modes are shown in Fig. 13. According to Fig. 13, the processing capability, AD/DA working speed, I/O module usage, and high-speed bus bandwidth are not fully used. Thus the system is scalable to fit more complex algorithms and imaging modes.

Fig. 13. Hardware resources utilization ratio

SECTION VI. Conclusion

Based on the distributed parallel structure and the 3U VPX connection standard, we designed a high-performance universal miniature radar system. The system features multi-layer interconnection, standardization, modularization, scalability, and reconfigurability. In practice, high-resolution real-time SAR has performed well on our system, which validates the system's virtues.

ACKNOWLEDGMENT

The author would like to extend sincere thanks to Professor Hong-Xian Wang for his patient guidance. Special thanks go to all members of the project team, who have put a lot of effort into the project. It is my great honor to be one of them. The paper belongs to them too.




keywords: {Real-time systems;Digital signal processing;Radar signal processing;Synchronization;Radar imaging;High-performance;miniature;universal;radar system;distributed parallel processing},

URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8059594&isnumber=8059124


Date of Conference: 10-13 October 2016
Date Added to IEEE Xplore: 05 October 2017
ISBN Information:
Publisher: IEEE
Conference Location: Guangzhou, China
SECTION I.

Introduction

As radar systems improve in quality and spread into applications such as geological mapping, marine research, and military surveillance, the requirements placed on them have become higher and stricter.

A high-performance radar system is urgently needed to cope with the large data flow and computational burden, especially in real-time signal processing. The selection of chips, the design of the processing structure, and the realization of the interconnection framework all directly influence system performance.

Universality has two goals. One is to break with the traditional mode in which the design of the radar system is dictated by the algorithm. The other is to build a universal radar system that lowers design cost and shortens the design cycle. From the software perspective, universality means providing a hardware platform on which algorithms of varying arithmetic complexity and data granularity perform well. From the hardware perspective, universality is obtained through modularized design: each unit of the radar system, such as the signal processing part, the AD/DA part, and the storage part, is designed and optimized separately. Diverse radar systems can then be built by extending, reconfiguring, or updating these modular units according to the required functions and design ideas. Modular design benefits system universality, extensibility, flexibility, and reconfigurability, and in particular substantially reduces design cost and cycle time.

Because special platforms such as space-borne and airborne platforms, and especially UAVs (unmanned aerial vehicles), impose limits on weight, volume, and power consumption, miniaturization is necessary. Board-level details such as board layout and PCB routing therefore need to be specially designed.

To address the issues above, this paper discusses a method for designing and realizing a high-performance universal miniature radar system.

SECTION II.

Theory Analysis of System Structure

A. Module of Parallel Structure

Although chip technology and processing power keep improving, a single chip still cannot satisfy the computational requirements of 10 GFLOPS or even 100 GFLOPS in real-time imaging. Parallel processing is therefore imperative for high performance. The parallel processing structure, which is embodied mainly in chip-level and system-level parallelism, directly determines the performance of the system. The two most common parallel processing structures are shown in Fig. 1, where P denotes a processor and M a memory.

Fig. 1. Shared bus structure and distributed bus structure

In the shared bus structure, every processor has equal access to the entire shared memory space through a high-speed bus; that is, all processors can access the shared memory at the same time. This structure suits fine-grained, small-scale parallel processing. However, as the number of processors and the frequency of data exchange among them increase, bus contention becomes a bottleneck for data communication. The bus also lacks scalability: once built, it is hard to expand.

In the distributed parallel structure, every processing node has its own physically distributed memory. Multiple processing nodes can form a larger processing system by being combined through a network of customized communication links with high bandwidth and low latency. The distributed parallel structure suits coarse-grained processing and allows the system framework to be designed flexibly, with processors added or removed conveniently. In fact, the large-scale distributed system in Fig. 2 may consist of multiple independent distributed systems [7].
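The contrast between the two structures can be illustrated with a deliberately simple toy model. It is a sketch, not a model taken from the paper: it assumes the worst case in which all processors on a shared bus transfer data at once and split the bus evenly, and that each distributed node has its own dedicated link. The function names and the bandwidth value of 10.0 are hypothetical.

```python
def shared_bus_bandwidth_per_processor(bus_bandwidth, n_processors):
    """Worst case on a shared bus: all processors contend, each gets an equal share."""
    return bus_bandwidth / n_processors


def distributed_bandwidth_per_node(link_bandwidth, n_nodes):
    """Each node has local memory and its own link, so its share does not depend on n_nodes."""
    return link_bandwidth


# Hypothetical bandwidth of 10.0 (arbitrary units) for both the bus and each link.
for n in (2, 4, 8, 16):
    shared = shared_bus_bandwidth_per_processor(10.0, n)
    distributed = distributed_bandwidth_per_node(10.0, n)
    print(f"{n:2d} processors: shared {shared:5.2f} per processor, distributed {distributed:5.2f} per node")
```

Under these assumptions the per-processor share on the shared bus falls as 1/N, while the per-node bandwidth of the distributed structure stays constant, which is the scalability argument made above.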

Fig. 2. Large-scale distributed system structure

From the analysis above, the distributed parallel structure better meets the demands of scalability, flexibility, and high performance, so our system is built on the distributed parallel structure.

B. Interconnection Structure

In large-scale, complex distributed parallel systems, multiple data streams, such as high-speed raw data streams, pre-processed data streams, and result data streams, must be transferred in real time. Meanwhile, diverse synchronization and control signal flows exist among multiple tasks and processors. Their differing bandwidth and delay requirements place many demands on the interconnection structure.

Therefore, to handle the multiple processing nodes, the high-speed serial data interconnection, and the variety of data streams and signal flows, we propose a distributed system architecture based on VPX. VPX offers link rates of up to 6.25 Gbps, uses a multi-switched network structure, and supports interoperation of modules from different manufacturers. Its emergence is of great importance for the development of radar systems.

SECTION III.

System Structure

The high-performance universal miniature radar system consists of digital processing units, which include the signal acquisition (AD) and storage module, the signal processing module, the waveform generation (DA) module, the power supply module, and the system console. It also contains several data communication networks, such as the high-speed serial data switching network, the strict synchronization timing buses, and the low-latency real-time control signal network. All hardware boards in the system use the 3U VPX standard (100 mm by 160 mm).

Moreover, the functional modules are each mapped to a different board, following the principles of modularization and reconfiguration. This makes updating and reconfiguring the system easier.

A. AD/DA High Frequency Module

  • The ADC module contains two AD chips, each sampling at 1.8 GSPS per channel, or 3.6 GSPS over its 2 channels. The AD module as a whole therefore reaches 7.2 GSPS (4 channels) at 12-bit precision for system data sampling and acquisition.

  • The DAC module contains two DA chips responsible for waveform generation, each with a rate of up to 2.5 GSPS at 14-bit precision, so the DA module reaches 5 GSPS (2 channels) at 14 bits.
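The aggregate rates quoted in the two items above follow from simple multiplication, which the short sketch below reproduces. The raw bit rates it prints are not stated in the paper; they are derived here under the assumption of unpacked samples with no framing or padding, and the one-channel-per-DA-chip split is likewise an assumption read from the "2 channels" total.

```python
# Figures quoted in the text above.
ADC_CHIPS = 2
ADC_CHANNELS_PER_CHIP = 2
ADC_GSPS_PER_CHANNEL = 1.8
ADC_BITS = 12

DAC_CHIPS = 2
DAC_CHANNELS_PER_CHIP = 1      # assumption: one output channel per DA chip
DAC_GSPS_PER_CHANNEL = 2.5
DAC_BITS = 14


def aggregate_gsps(chips, channels_per_chip, gsps_per_channel):
    """Total sample rate of a module in GSPS."""
    return chips * channels_per_chip * gsps_per_channel


def raw_gbit_per_s(gsps, bits_per_sample):
    """Raw, unpacked sample bit rate in Gbit/s (assumes no framing or padding)."""
    return gsps * bits_per_sample


adc_total = aggregate_gsps(ADC_CHIPS, ADC_CHANNELS_PER_CHIP, ADC_GSPS_PER_CHANNEL)
dac_total = aggregate_gsps(DAC_CHIPS, DAC_CHANNELS_PER_CHIP, DAC_GSPS_PER_CHANNEL)

print(f"ADC module: {adc_total:.1f} GSPS, {raw_gbit_per_s(adc_total, ADC_BITS):.1f} Gbit/s raw")
print(f"DAC module: {dac_total:.1f} GSPS, {raw_gbit_per_s(dac_total, DAC_BITS):.1f} Gbit/s raw")
```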

Fig. 3. AD board and storage board

B. Storage Subsystem

There are two storage boards, each with a 1 GB/s storage rate and 2 TB of storage capacity. These features make the radar system suitable for a wide range of applications.
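As a rough illustration of what these figures imply, the sketch below estimates how long continuous recording at the full rate would take to fill one board. It assumes decimal units (1 TB = 1000 GB) and a sustained 1 GB/s write rate; neither assumption is confirmed by the paper, and the function name is hypothetical.

```python
BOARDS = 2
RATE_GB_PER_S = 1.0    # per board, from the text
CAPACITY_TB = 2.0      # per board, from the text
GB_PER_TB = 1000.0     # assumption: decimal units


def fill_time_s(capacity_tb, rate_gb_per_s):
    """Seconds of continuous recording needed to fill one board at a sustained rate."""
    return capacity_tb * GB_PER_TB / rate_gb_per_s


per_board_s = fill_time_s(CAPACITY_TB, RATE_GB_PER_S)
print(f"One board fills in {per_board_s:.0f} s ({per_board_s / 60:.1f} min) at full rate")
print(f"Both boards together: {BOARDS * RATE_GB_PER_S:.1f} GB/s, {BOARDS * CAPACITY_TB:.1f} TB")
```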

C. Signal Processing Unit

  • DSP+FPGA Structure

Radar signal processing always involves some operations that are large in volume but relatively simple, such as multiplication, accumulation, and FFT. Their strict demands on processing speed and their regular computation make them well suited to an FPGA. More complex arithmetic, by contrast, is better suited to a DSP chip, which offers high operating speed, flexible addressing modes, and powerful communication mechanisms. We therefore choose a DSP+FPGA structure for signal processing: the FPGA is responsible for pre-processing, and the DSP performs most of the high-level computation.

  • Processing Capability

We select the TMS320C6678, a high-performance fixed-point and floating-point DSP based on the C66x core, as the basic processing node of the processing element. It integrates eight C66x cores in one device, each running at up to 1.25 GHz, so the device reaches a peak performance of 320 GMACS or 160 GFLOPS [3]. As the coprocessor, we select the large-scale Virtex-6 FPGA XC6VLX240T from Xilinx Inc., which integrates 768 dedicated multipliers. A single signal processing board carries two DSPs, each with 2 GB of DDR3 memory.
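The peak figures for the DSP can be checked with a short calculation. In the sketch below, the per-core, per-cycle operation counts (32 fixed-point MACs and 16 single-precision FLOPs) are back-calculated from the quoted 320 GMACS and 160 GFLOPS rather than taken from the paper, and the board-level total is a derived peak number, not a measured one.

```python
CORES = 8
CLOCK_GHZ = 1.25
MACS_PER_CYCLE_PER_CORE = 32    # back-calculated from the quoted 320 GMACS
FLOPS_PER_CYCLE_PER_CORE = 16   # back-calculated from the quoted 160 GFLOPS
DSPS_PER_BOARD = 2              # from the text


def peak_rate(cores, clock_ghz, ops_per_cycle):
    """Peak rate in giga-operations per second: cores x clock x operations per cycle."""
    return cores * clock_ghz * ops_per_cycle


device_gmacs = peak_rate(CORES, CLOCK_GHZ, MACS_PER_CYCLE_PER_CORE)
device_gflops = peak_rate(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE_PER_CORE)
board_gflops = DSPS_PER_BOARD * device_gflops

print(f"Per device: {device_gmacs:.0f} GMACS, {device_gflops:.0f} GFLOPS (peak)")
print(f"Per processing board: {board_gflops:.0f} GFLOPS (peak)")
```

Peak numbers like these are upper bounds; sustained throughput in an imaging chain depends on memory and interconnect behavior.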

Fig. 4. Signal processing board

D. Multi-Layer Interconnection

The radar system comprises networks at different layers: the system-level high-speed serial networks, the rich interfaces of the processing nodes, the synchronization timing and control buses, and the external interfaces to the RF components and the PC host. This multi-layer interconnection satisfies the transmission requirements of the different types of data streams and instruction flows.

Fig. 5. Structure of radar system

The connection to the RF components is provided by the RS422 bus connected to the AD module and the analog sampling interface connected to the DA module. Likewise, communication over the GbE bus between FPGA1 on the interface board and the PC implements human-computer interaction and rapid display of processing results, such as real-time SAR/GMTI images and GMTI detection results.

The configurable high-speed serial RapidIO (6.25 Gbps) and PCIe (5 Gbps) networks are mainly used to transfer the raw data, pre-processed data, result data, and control instructions. They supply sufficient bandwidth for the processing elements and storage. As shown in Fig. 5, every module in the system, including the storage module, the signal processing module, and the AD/DA module, connects to FPGA1 (named Link Switch) on the interface board through SRIO and PCIe. Data can thus be exchanged among all modules of the system over the SRIO and PCIe networks under the control of the Link Switch. At the same time, the main control module can control every module of the system through this layer of interconnection.

The TMS320C6678 DSP has rich peripheral interfaces, which ensure flexible data transmission both within a board and between boards. Apart from the SRIO and PCIe interfaces, the GbE and GPIO interfaces of every DSP are also connected to the Link Switch (FPGA1) on the interface board. This tight interconnection between the DSPs and the Link Switch makes the transfer of result data more flexible. In addition, the two C6678 devices on the same processing board are coupled through their HyperLink interfaces (12.5 Gbps transmission rate). This interconnection structure makes it possible to form virtual processing nodes suited to large-scale computation.
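The star-like data layer described above can be written down as a small link table, which makes it easy to see which direct paths exist between modules. This is only an illustrative sketch: the node names are hypothetical, it assumes every module has both an SRIO and a PCIe link to the Link Switch, it uses the nominal 1 Gbps rate for GbE, and it shows only one processing board with two DSPs.

```python
from collections import namedtuple

Link = namedtuple("Link", "a b kind gbps")

LINK_SWITCH = "link_switch_fpga1"   # hypothetical node name for FPGA1

LINKS = []
# Assumption: each module reaches the Link Switch over both SRIO and PCIe.
for module in ("storage", "adda", "dsp0", "dsp1"):
    LINKS.append(Link(module, LINK_SWITCH, "SRIO", 6.25))
    LINKS.append(Link(module, LINK_SWITCH, "PCIe", 5.0))
# Each DSP also has a GbE connection to the Link Switch (nominal 1 Gbps).
for dsp in ("dsp0", "dsp1"):
    LINKS.append(Link(dsp, LINK_SWITCH, "GbE", 1.0))
# The two DSPs on one processing board are coupled by HyperLink.
LINKS.append(Link("dsp0", "dsp1", "HyperLink", 12.5))
# The PC host talks to FPGA1 over GbE.
LINKS.append(Link("pc_host", LINK_SWITCH, "GbE", 1.0))


def links_between(a, b):
    """All direct links joining nodes a and b, in either direction."""
    return [link for link in LINKS if {link.a, link.b} == {a, b}]


def fastest_link(a, b):
    """Fastest direct link between a and b, or None if they are not directly connected."""
    candidates = links_between(a, b)
    return max(candidates, key=lambda link: link.gbps) if candidates else None


print(fastest_link("dsp0", "dsp1"))
print(fastest_link("storage", LINK_SWITCH))
print(fastest_link("storage", "adda"))   # None: traffic must pass through the Link Switch
```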

The synchronization timing network and the control bus, both carried by the backplane, respectively provide strict synchronization timing control and low-level hardware control instructions among the parts of the system.

Their close cooperation keeps the entire system in a consistent state and working in step.

In general, the multi-layer interconnection provides an excellent solution to the complex connection needs and underpins the realization of the high-performance universal miniature radar system.

E. Power Subsystem

The power subsystem supplies the system with stable, configurable, multiple power supplies. It not only keeps the system operating normally but can also be adjusted easily through software for different applications.

F. Display and Console Software

Software platforms have been designed for result display and system control.

Fig. 6. Display and control software

SECTION IV.

Other Design

A. Heat Dissipation Design

Both conduction cooling and forced-air cooling are used in the heat dissipation design. In addition, experiments have shown that the cooler withstands vibration and ruggedizes the system boards.

Fig. 7. System heat dissipation design

B. Assistant Debugging Software

To simplify debugging and accelerate development, we have also written a series of auxiliary debugging tools.

Fig. 8. Assistant debugging software

SECTION V.

Application

With the high-performance universal miniature radar system, different applications have been implemented, including GMTI, real-time airborne SAR imaging, and real-time space-borne SAR imaging. Below we present the implementation of the airborne real-time multi-mode, multi-polarization, high-performance SAR imaging system.

A. Imaging Algorithm

For airborne platforms, and especially UAV (unmanned aerial vehicle) platforms, which are easily disturbed by air currents and other adverse flight conditions, an improved range-Doppler (RD) algorithm [2] was chosen for the real-time imaging platform in view of its computational complexity and stability.

Fig. 9. Imaging algorithm

B. Data Flow

In a real-time system, signal processing must be faster than signal acquisition so that all of the continuous echo data can be processed. The virtual single node is therefore introduced to meet the real-time requirement and handle the huge amount of echo data in the high-resolution SAR mode. In our system, the signal processing module contains two processing boards, each regarded as a virtual single node, and each virtual single node consists of two DSPs connected by HyperLink. The processing procedure is shown in Fig. 10. The Ping node executes the same program as the Pong node, and each processes a data frame independently; the nodes operate as a parallel pipeline. Meanwhile, inside each processing node, each DSP processes a portion of the data frame in parallel. The data flow of the system is shown in Fig. 11. This structure makes full use of the matched processing, transmission, and memory capabilities.
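The Ping/Pong arrangement can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' scheduler: it assumes frames are handed to the nodes in strict alternation, that each frame is split into contiguous line ranges across the two DSPs of a node, and that data transfer overlaps fully with processing. The function names and the acquisition times used in the checks are hypothetical.

```python
def assign_frames(n_frames, nodes=("Ping", "Pong")):
    """Hand successive data frames to the nodes in round-robin order."""
    return [nodes[i % len(nodes)] for i in range(n_frames)]


def split_frame(n_lines, n_dsps=2):
    """Split a frame of n_lines into contiguous [start, stop) ranges, one per DSP."""
    base, extra = divmod(n_lines, n_dsps)
    ranges, start = [], 0
    for d in range(n_dsps):
        size = base + (1 if d < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def keeps_up(t_acquire_s, t_process_s, n_nodes=2):
    """A node receives every n_nodes-th frame, so it must finish within n_nodes acquisition periods."""
    return t_process_s <= n_nodes * t_acquire_s


print(assign_frames(4))
print(split_frame(16384))
# Hypothetical acquisition periods; 23.89 s is the per-image processing time quoted for spotlight mode.
print(keeps_up(t_acquire_s=12.0, t_process_s=23.89))
print(keeps_up(t_acquire_s=10.0, t_process_s=23.89))
```

With two nodes in the pipeline, each node may take up to twice the frame acquisition period, which is how the pipeline relaxes the per-node real-time constraint.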

  • Spotlight 0.2m*0.2m single polarization mode

Fig. 10. Processing structure

Fig. 11. Data flow

In the large-scale spotlight mode, each processing node finishes one real-time image of 32K*16K complex points (4 GB) in 23.89 s.

  • Detailed strip 0.3m*0.3m dual-polarization mode

In the detailed strip mode, each processing node finishes a dual-polarization real-time image in 23.76 s, with each DSP responsible for one polarization image (16K*16K complex points; 4 GB in total for the dual-polarization image).
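The image sizes quoted for the two modes are consistent with 8 bytes per complex point, as the sketch below checks. The 8-byte figure (for example, two 32-bit floats) and the binary reading of K (1024) and GB (2^30 bytes) are assumptions; the paper states only the point counts, the 4 GB totals, and the timings.

```python
K = 1024
GIB = 1024 ** 3
BYTES_PER_COMPLEX_POINT = 8   # assumption: e.g. two 32-bit floats per point


def image_bytes(rows, cols, bytes_per_point=BYTES_PER_COMPLEX_POINT):
    """Memory footprint of one complex image in bytes."""
    return rows * cols * bytes_per_point


def points_per_second(rows, cols, seconds):
    """Average output rate of one processing node in complex points per second."""
    return rows * cols / seconds


spotlight_bytes = image_bytes(32 * K, 16 * K)
strip_dual_bytes = 2 * image_bytes(16 * K, 16 * K)   # two polarization images

print(f"Spotlight image: {spotlight_bytes / GIB:.0f} GiB")
print(f"Dual-polarization strip image: {strip_dual_bytes / GIB:.0f} GiB")
print(f"Spotlight rate: {points_per_second(32 * K, 16 * K, 23.89) / 1e6:.1f} Mpoints/s per node")
print(f"Strip rate: {points_per_second(2 * 16 * K, 16 * K, 23.76) / 1e6:.1f} Mpoints/s per node")
```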

Fig. 12. 0.2m*0.2m spotlight and 0.3m*0.3m strip mode fusion images

C. Resource Utilization

The utilization ratios of the hardware resources under different system working modes are shown in Fig. 13. According to Fig. 13, the processing capability, the AD/DA working speed, the I/O module usage, and the high-speed bus bandwidth are not fully used. The system is therefore scalable to more complex algorithms and imaging modes.

Fig. 13. Hardware resources utilization ratio

SECTION VI.

Conclusion

Based on the distributed parallel structure and the 3U VPX connection standard, we designed a high-performance universal miniature radar system. The system is characterized by multi-layer interconnection, standardization, modularization, scalability, and reconfigurability. In practice, high-resolution real-time SAR imaging has performed well on our system, which validates these virtues.

ACKNOWLEDGMENT

The author would like to extend sincere thanks to Professor Hong-Xian Wang for his patient guidance. Special thanks go to all members of the project team, who have put a great deal of effort into the project. It is my great honor to be one of them. The paper belongs to them too.



1. Mehrdad Soumekh, Synthetic Aperture Radar Signal Processing with MATLAB Algorithms, Wiley-Interscience, April 1999.
2. I. G. Cumming and F. H. Wong, Digital Processing of Synthetic Aperture Radar Data: Algorithms and Implementation, Norwood, MA: Artech House, 2005.
3. Texas Instruments Inc., TMS320C6678 Multicore Fixed and Floating-Point Digital Signal Processor, Data Manual, Rev. 1.0, November 2010.
4. David E. Culler, Parallel Computer Architecture: A Hardware/Software Approach, Beijing: China Machine Press, 2002.
5. J-Q Yuan and P-K Huang, "Design of real-time digital signal processing system based on DSP and FPGA", Journal of Systems Engineering and Electronics, vol. 26, no. 11, pp. 1561-1563, 2004.
6. Shan-qing Hu and Teng Long, "Design and realization of high-performance universal radar signal processing system", International Conference on Signal Processing, pp. 2254-2257, Oct. 26-29, 2008.
7. Tao Su, Xue-hui He and Lin-xia Liu, Real-Time Signal Processing System, Xidian University Press, 2006.

 

 
