VLSI implementation of high speed and high resolution FFT algorithm based on radix 2 for DSP application

Similar documents
Design and Implementation of Pipelined Floating Point Fast Fourier Transform Processor

A Novel VLSI Based Pipelined Radix-4 Single-Path Delay Commutator (R4SDC) FFT

THE combination of the multiple-input multiple-output

International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE) Volume 6, Issue 6, June 2017

Implementing Efficient Split-Radix FFTs in FPGAs

We are IntechOpen, the first native scientific publisher of Open Access books. International authors and editors. Our authors are among the TOP 1%

High Speed Reconfigurable FFT Processor Using Urdhava Thriyambakam

CHAPTER 7 APPLICATION OF 128 POINT FFT PROCESSOR FOR MIMO OFDM SYSTEMS. Table of Contents

Floating Point Fast Fourier Transform v2.1. User manual

An Area Efficient 2D Fourier Transform Architecture for FPGA Implementation

International Journal of Engineering Research-Online A Peer Reviewed International Journal

A 128-Point FFT/IFFT Processor for MIMO-OFDM Transceivers a Broader Survey

A 0.75 Million Point Fourier Transform Chip for Frequency Sparse Signals

Implementation of High Throughput Radix-16 FFT Processor

by Ray Escoffier September 28, 1987

A Novel Architecture for Radix-4 Pipelined FFT Processor using Vedic Mathematics Algorithm

A Comparative Study of Different FFT Architectures for Software Defined Radio

An FPGA Spectrum Sensing Accelerator for Cognitive Radio

Laboratory Exercise #6

AN10943 Decoding DTMF tones using M3 DSP library FFT function

Advanced Digital Signal Processing Part 4: DFT and FFT

CRUSH: Cognitive Radio Universal Software Hardware

Cadence Design of clock/calendar using 240*8 bit RAM using Verilog HDL

HB0267 CoreFFT v7.0 Handbook

Design and Implementation of Real-Time 16-bit Fast Fourier Transform using Numerically Controlled Oscillator and Radix-4 DITFFT Algorithm in VHDL

Low Complexity FFT/IFFT Processor Applied for OFDM Transmission System in Wireless Broadband Communication

EFFICIENT ARCHITECTURE FOR PROCESSING OF TWO INDEPENDENT DATA STREAMS USING RADIX-2 FFT

OpenCL for FPGAs/HPC Case Study in 3D FFT

Automatic Customer Counter and Payment Tool for Shopping Centers and commercial spaces

Application Note 35. Fast Fourier Transforms Using the FFT Instruction. Introduction

A Novel RTL Architecture for FPGA Implementation of 32- Point FFT for High-Speed Application

COST REDUCTION OF SECTION CHANNEL BY VALUE ENGINEERING

REAL-TIME PROCESSING OF A LONG PERIMETER FIBER OPTIC INTRUSION SYSTEM

BOILED WATER TEMPERATURE MEASUREMENT SYSTEM USING PIC MICROCONTROLLER

Speech parameterization using the Mel scale Part II. T. Thrasyvoulou and S. Benton

Smart Gas Booking and LPG Leakage Detection System

A High-Throughput Radix-16 FFT Processor With Parallel and Normal Input/Output Ordering for IEEE c Systems

IOT Based Intelligent Bin for Smart Cities

A Novel Coefficient Ordering based Low Power Pipelined Radix-4 FFT Processor for Wireless LAN Applications

Reconfigurable FPGA-Based FFT Processor for Cognitive Radio Applications

Easily Testable and Fault-Tolerant FFT Butterfly Networks

A Continuous-Flow Mixed-Radix Dynamically-Configurable FFT Processor

DIGITAL SECURITY SYSTEM USING FPGA. Akash Rai 1, Akash Shukla 2

False Alarm Analysis of the CATM-CFAR in Presence of Clutter Edge

Fin-Fan Control System

An Experimental Study on Clothes Drying Using Waste Heat from Split Type Air Conditioner

A compression algorithm for field programmable gate arrays in the space environment

Reconfigurable Computing Lab 02: Seven-Segment Display and Digital Alarm Clock

HUMIDITY CONTROL SYSTEM FOR ROLLING AND FERMENTING ROOM OF A TEA FACTORY

Intelligent Fire Detection and Visual Guided Evacuation System Using Arduino and GSM

The Earthquake Monitoring System for Dams of a reservoir in China SUMMARY

Equipment Required. WaveRunner Zi series Oscilloscope 10:1 High impedance passive probe. 3. Turn off channel 2.

Design & Control of an Elevator Control System using PLC

A Design of IADSS with the Earthquake Detecting Function

GSM BASED GARBAGE AND WASTE COLLECTION BIN OVERFLOW INDICATOR

Advanced Hardware and Software Technologies for Ultra-long FFT s

AUTOMATIC STREET LIGHT CONTROL SYSTEM USING MICROCONTROLLER

Implementation of Auto Car Washing System Using Two Robotic Arms

Hazardous Gas Detection using ARDUINO

Assignment 5: Fourier Analysis Assignment due 9 April 2012, 11:59 pm

Fire Detection Using Image Processing

Improving Heating Performance of a MPS Heat Pump System With Consideration of Compressor Heating Effects in Heat Exchanger Design

INSTRUMENTATION AND EVALUATION OF COMMERCIAL AND HOMEMADE PASSIVE SOLAR PANELS

Algorithmic Generation of Chinese Lattice Designs

Wearable Carbon Monoxide Warning System Using Wireless Sensors

LPG GAS LEAKAGE DETECTION, MONITORING AND CONTROL USING LabVIEW

Analysis of Constant Pressure and Constant Area Mixing Ejector Expansion Refrigeration System using R-1270 as Refrigerant

Microgrid Fault Protection Based on Symmetrical and Differential Current Components

AV6419 BOTDR. Product Overview. Main Features

New Seismic Unattended Small Size Module for Foot-Step Detection

CURRICULUM VITAE. as of Doçent, Algebra and Number Theory 2013 YÖK, Higher Education Council of Turkey

Methodology of Implementing the Pulse code techniques for Distributed Optical Fiber Sensors by using FPGA: Cyclic Simplex Coding

MOSTAFA BATOULI. 171 Moultrie Street, Charleston, SC PHONE: (+ 1) E -MAI L:

Dynamic Simulation of Double Pipe Heat Exchanger using MATLAB simulink

Heat Transfer Enhancement using Herringbone wavy & Smooth Wavy fin Heat Exchanger for Hydraulic Oil Cooling

An improved Algorithm of Generating Network Intrusion Detector Li Ma 1, a, Yan Chen 1, b

Proceedings Design, Fabrication and Optimization of a Silicon MEMS Natural Gas Sensor

University of Siena, Via Roma. 56, Siena Italy Visual Information Processing and Protection (VIPP)

Performance Evaluation on FFT Software Implementation

OVERVIEW AND FUTURE DEVELOPMENT OF THE NEUTRON SENSOR SIGNAL SELF-VALIDATION (NSV) PROJECT

Range Dependent Turbulence Characterization by Co-operating Coherent Doppler Lidar with Direct Detection Lidar

AUTOMATIC OIL LEAK DETECTION CONVEYED OVER GSM R.SUNDAR. Assistant Professor, Department of EEE Marine, AMET University, Chennai, Tamil Nadu, India

How Computer Simulation Helps to Design Induction Heating Systems

International Journal of Scientific Research and Reviews

Application of a Controlled Outside Cold Airflow by a PID Controller to Improve the Performance of a Household Refrigerator

Subscripts 1-4 States of the given system Comp Compressor Cond Condenser E Evaporator vol Volumetric G Gas L Liquid

Title Page. Report Title: Downhole Power Generation and Wireless Communications. for Intelligent Completions Applications

Exercise 8. Controlling a Batch Mixing Process EXERCISE OBJECTIVE

Performance Comparison of Ejector Expansion Refrigeration Cycle with Throttled Expansion Cycle Using R-170 as Refrigerant

Key Event Spaces Use Protocols. Facilities and Services Group

15 th International Conference on Automatic Fire Detection PROCEEDINGS. October 14-16, 2014 Universität Duisburg-Essen Duisburg, Germany

A Study on Cooling Rate with Blade and Sound Fire Extinguisher

Multi-sensor System For Indoor Environment Monitoring Zheng-yu Wanga, and Xiao-ru Zhangb

ABSTRACT. KEYWORDS: New Combination, New Product Development, Open Innovation, IoT (Internet of Things), Auto Alert System

A Forest Fire Warning Method Based on Fire Dangerous Rating Dan Wang 1, a, Lei Xu 1, b*, Yuanyuan Zhou 1, c, Zhifu Gao 1, d

Design and Development of Industrial Pollution Monitoring System using LabVIEW and GSM

PowerRouter application guideline

Dr. Rita Jain Head of Department of ECE, Lakshmi Narain College of Tech., Bhopal, India

[Patil* et al., 5(7): July, 2016] ISSN: IC Value: 3.00 Impact Factor: 4.116

Microcontroller based design for Tele-sensing and extinguishing fire

Transcription:

Neonode Inc From the Selectedors of Dr. Rozita Teymourzadeh, CEng. 2007 VLSI implementation of high speed and high resolution FFT algorithm based on radix 2 for DSP application Nooshin Mahdavi Rozita Teymourzadeh Masuri Othman Available at: https://wors.bepress.com/rozita_teymourzadeh/7/

The 5 th Student Conference on Research and Development SCOReD 2007 11-12 December 2007, Malaysia VLSI Implementation of High Speed and High Resolution FFT Algorithm Based on Radix 2 for DSP Application N. Mahdavi, R. Teymourzadeh, IEEE Student Member, Masuri Bin Othman Abstract-- Using Fast Fourier Transform (FFT) is indispensable in most signal processing applications. Designing an appropriate algorithm for the implementation of FFT can be efficacious in digital signal processing. Sophisticated techniques such as pipelining and parallel calculations have potential impacts on VLSI implementation of FFT algorithm. Furthermore, a mathematic approach such as floating point calculation achieves higher precision. In this paper, an efficient algorithm with using parallel and pipelining methods is proposed to implement high speed and high resolution FFT algorithm. Latency reduction is an important issue to implement the high speed FFT on FPGA. The Proposed FFT algorithm shows the latency of 5131 cloc pulse when N refers to 1024 points. The design has the mean squared error (MSE) of 0.0001 which is preferable to Radix 2 FFT. F Index Terms Radix, FFT, Butterfly, VLSI, Floating point. I. INTRODUCTION ast Fourier transform is a faster version of DFT which was invented by Cooley & Tuy in 1965 [1]. There are several types of FFT algorithms. Radix-2 is a practical algorithm to calculate FFT by using decimation in time (DIT) method. The main idea of decimation in time is to sunder the input v string of h with length of N 2 into two substrings with n odd and even indexes. The N-point DFT of hn can be achieved by suitable combining to N 2 -point DFT of each substring. Each one of them can be reduced to N 4 -point DFTs. This procedure will continue till only 2-point DFTs remains [2]. The final result is the formation of FFT algorithm. In 2006 petrovsy [3] implemented Radix 2-4 based on parallel-pipeline FFT-processors at structural level. He achieved 0.00201 MSE on his FFT algorithm. This paper shows the implementation of high speed and high resolution N. Mahdavi is with the Department of Electrical Engineering, Ara University, Iran (e-mail: mahdavi_n2@yahoo.com). R. Teymourzadeh is with Department of Electrical Engineering, Bangi, Malaysia (e-mail: rozita60@vlsi.eng.um.my). Masuri Bin Othman is with Department of Electrical Engineering, Bangi, Malaysia (e-mail: masuri@vlsi.eng.um.my). 1-4244-1470-9/07/$25.00 2007 IEEE. FFT based on parallel pipeline and floating point Radix-2 with the minimum MSE of 0.0001. The next section describes the principle of the FFT structure. The mathematical formulation and bloc diagram of pipeline FFT algorithm is described in section III. Section IV shows implementation and design result in brief. Finally, conclusion is expressed in section V. II. PRINCIPLE OF THE FFT STRUCTURE The computation of N-point FFT via decimation-in-time N log N complex multiplication O 2 algorithm requires time [7] and it will be increased if basic butterfly computation is used for adder and multiplication in decimation-in-time FFT algorithm. Additionally a complicated controller is also required. Furthermore, implementing fixed point FFT algorithm, results the output differs significantly in comparison to the expected output. To obtain the high speed and high resolution FFT algorithm, implementation of the floating point pipeline FFT is applied. The proposed method shows high performance of the FFT algorithm. The principle architecture is based on using a memory to eep input and output data. Thus, it reduces hard ware complexity. All the blocs are designed to operate with the same cloc frequency. These stages implement as pipeline and parallel to reduce FFT calculations time. III. FFT BLOCK DIAGRAM DESCRIPTION Figure 1 shows the main bloc diagram of the FFT algorithm. Each bloc will be investigated separately. A. Radix-2 Butterfly Algorithm First, the RAM is initialized by external microprocessor and the data are loaded to the RAM by bit reserve address pin. The Radix 2 Butterfly bloc is the primary prefatory step taen in DIT FFT [8,9]. This bloc calculates the complex number in equation 1 and 2 as follow. output 1 input1 input 2 (1) output 2 input1 input 2 (2)

Fig. 3. Pipeline floating point multiplier Fig. 1. The FFT bloc diagram Hence, one complex multiplication and two complex additions are required in this calculation. If output1 01 iyo 1, output 2 o 2 iyo2 and iy we have: w i iy i iy o1 o1 1 iy Y Y (3) Y Y 1 i2 iy Y Y (4) o2 o2 i1 Y Y i1 i2 Thus, each butterfly requires four real multiplications and six real additions. If the above equations are implemented using fixed point calculations, the error possibility of the result will increase. These errors [4] consist of round-off, overflow and coefficient quantization errors. To reduce the errors and provide high resolution, floating point adder/subtractor is replaced. This adder/subtractor operates in 3 stages including mantis alignment, mantis addition/ subtraction and normalization. Thus, each of the above phases can be used as stages for pipeline implementation. e convert the first phase into two stages in order to achieve higher cloc frequency [5], [6]. Figures 2 and 3 show the bloc diagram of the pipeline adder/subtractor and multiplier. It is not necessary to adjust the exponents in multiplier. Since the normalized inputs are loaded to the multiplier, the mantises are multiplied and the exponents are added and amended. The result may be shifted one bit leftward. Thus the exponent will be decremented one unit. Fig. 2. Pipeline floating point adder/ subtractor Fig. 4. Internal schematic of the pipeline butterfly algorithm Figure 4 shows the internal schematic of the pipeline butterfly algorithm. To improve speed calculations in the Radix-2 butterfly algorithm, the pipeline registers are located after each addition, subtraction and multiplication blocs. Hence, the pipeline butterfly algorithm eeps the final result in the register to transfer it to the RAM by the next cloc cycle. This FFT algorithm is exactly executed after N 2 Log2 N cloc pulses. The 11 cloc pulses delay 11 is created by 11 pipeline registers in adder, subtractor and multiplier in a serial butterfly bloc. Additionally, parallel design of the FFT algorithm decreases the calculation time significantly. B. ROM and RAM Blocs The ROM in Radix-2 FFT algorithm has two loo-up tables to save coefficients. These coefficients are inherently complex numbers so the real & imaginary parts are separately ept on that. Since the amplitude of the sine and cosine are the same in the four quarters, (they only differ in the signs) are calculated just for 0 to are saved for 0 to N into the ROM. N and related sign 4

Signal Amplitude Fig. 7. The output digital signal of the system 30 25 Fig. 5. Internal structure of the RAM Thus, it avoids using a large number of gates in the FPGA board. There is a RAM in the FFT bloc diagram. According to the system worflow, two complex data must be read from the RAM simultaneously and loaded to the butterfly bloc. Meanwhile, the two outputs of the butterfly bloc have to be written in the RAM at the same time. Figure 5 shows the internal structure of the RAM with the capability of reading and writing the two complex data simultaneously. IV. IMPLEMENTATION RESULT The new architecture of the Radix-2 FFT algorithm Verilog code was written and simulated by Matlab software. The design code is downloaded to the Virtax-2 FPGA board. From ilinx ISE synthesize report, it was shown minimum cloc period is 9.94 ns (Maximum Frequency is 100.6 MHz). The chip layout on Virtex 2 FPGA board is shown in Figure 6. Figure 7 shows the output digital signal of the new Radix-2 FFT algorithm which simulated by Modelsim software. Figure 8 shows the measured signal spectrum achieved by Radix-2 FFT algorithm for 1024 floating point complex data. 20 15 10 5 0 0 2000 4000 6000 8000 10000 12000 14000 Frequency (Hz) Fig. 8. Signal spectrum of high speed radix-2 FFT algorithm The output result appeared after 5131 cloc pulses which proved the complex calculation of N 2 Log2 N11 the system. The System MSE was measured by Matlab software. It shows the mean squared error (MSE) of 0.0001. V. CONCLUSION The new architecture of the high speed and high resolution of the parallel, pipeline and floating point Radix-2 FFT algorithm was designed and investigated. High Speed FFT architecture was obtained by two methods. The pipeline structure and parallel design lead us to have high speed FFT algorithm. Additionally, using a internal RAM maes the design compatible with different type of FPGA board. The implementation result shows the maximum throughput of the 100.6 MHz in Virtex 2 FPGA board. Meanwhile, the resolution was increased by floating point calculation during the FFT process. The proposed FFT algorithm proves the latency of N 2 Log2 N11 in, 5131 cloc pulses for N 1024 with the mean squared error (MSE) of 0.0001. Fig. 6. The core layout on FPGA board VI. ACKNOLEDGMENT e wish to than our friend, Dr. Mohamad Amir Amini who has assisted us for completion of this dissertation.

VII. REFERENCE [1] J.. Cooley, J.. Tuey, An Algorithm for the Machine Calculation of Complex Fourier series," Math of comp Vol. 19, pp. 297-301, 1965. [2] A. Saidi, Decimation-in-time frequency FFT algorithm. Acoustics, Speech, and Signal Processing, ICASS-94, IEEE International Conference. Vol. 3, pp. 453-456, 1994 [3] A.A. Petrovsy, S. L. Shredov, Automatic Generation of Split-Radix 2-4 Parallel-Pipeline FFT Processors: Hardware Reconfiguration and Core Optimization, Proceeding of the international symposium on parallel computing in Electrical engineering (PARELEC 06), pp.181-186, 2006. [4] E. C. Ifeachor, B. V. Jervis, Digital Signal Processing: A Practical Approch. Second Editiond. Prantice Hall, 2002. [5] M. Morris Mano, Computer System Architectur. Third Edition. Prantice Hall. 1992. [6] D. A. Patterson & J. L. Hennssy. Computer Organization & Design: the Hardware/ Software Interface. Second Edition. Morgan Kaufman Publishere Inc. 1998. [7] Lo sing Cheng, A. Miri, Tet Hinb Yeap, Efficient FPGA Implementation of FFT based Multipliers, IEEE Conference Electrical and Computer Engineering, pp. 1300-1303, 2005. [8] K. S. Hemmert, K. D. Underwood, An Analysis of The Double- Precision Floating-point FFT on FPGAs, 13 th IEEE Symposium on Field-Programmable Custom Computing Machines, pp. 171-180, April 2005. [9] Shiqun Zheng, Dunshan Yu, Design and implementation of a parallel real-time FFT processor, 7 th IEEE conference on Solid-State and Integrated Circuits Technology, Vol. 3, pp. 1665-168, Oct 2004. His employment experience included the head of department of Electrical, Electronic and System in UKM, vice dean of engineering faculty, vice director of IMEN and the member of senate professor for IRPA application in MOSTI. He joined MIMOS Co. in 2006 as the head of Microsystems & MEMs Centre in Malaysia. VIII. BIOGRAPHIES Nooshin Mahdavi was born in Tehran, Iran, on September 6, 1981. She received the B.S. degree in electrical engineering from Islamic Azad University of Garmsar, Iran, in 2003 and the M.S. degree in electrical engineering from Islamic Azad University of Ara, Ara, Iran in 2006. Her employment experience included the design of the handheld and wireless digital systems in Sanjesh Afzar Asia Co., Tehran, Iran, and teaching in Islamic Azad University of Tafresh, Tafresh, Iran, and teaching in Islamic Azad University of Yadegare Imam, Tehran, Iran. Her special fields of interest included the design of digital systems using Microprocessors and FPGAs. Rozita Teymourzadeh was born on 18 th April 1981, in Iran. She got her Diploma in Mathematics from Mehre Madar high school in 1999, Tehran. In 2003 she received her Bachelor of Science (B.Sc) in Electronic systems from Garmsar University in Iran and the M. S. degree in Faculty of Electrical & Engineering in National University of Malaysia (UKM) in 2006. hile studying, she has 7 years employment experience in power substation field and official wor as designer in Tehran and she has good bacground in Plug and switch system (PASS) and Compact prefabricated AIS substation (COMPASS). Meanwhile, she is woring in institute of microengineering and nanoelectronic labratory (IMEN) as system on chip (SoC) researcher in Malaysia. Currently, she joined Faculty of Electrical & Engineering in UKM as a PhD. student to embar research in the field of Very Large Scale Integration (VLSI) design under Professor Dr. Masuri Bin Othman. Masuri Bin Othman was born in Jitra, Kedah, Malaysia, on July 11, 1957. He graduated from the National University of Malaysia (UKM) in Department of Electrical, Electronic and System in 1978. He got his master degree in university of Essex, England in 1980 and his PhD. in Southampton university of England in the specialized Microelectronic field in 1986.