A 0.75 Million Point Fourier Transform Chip for Frequency Sparse Signals

Similar documents
Faster GPS via the Sparse Fourier Transform. Haitham Hassanieh Fadel Adib Dina Katabi Piotr Indyk

Faster GPS via the Sparse Fourier Transform. Haitham Hassanieh Fadel Adib Dina Katabi Piotr Indyk

Implementing Efficient Split-Radix FFTs in FPGAs

CHAPTER 7 APPLICATION OF 128 POINT FFT PROCESSOR FOR MIMO OFDM SYSTEMS. Table of Contents

by Ray Escoffier September 28, 1987

Floating Point Fast Fourier Transform v2.1. User manual

Advanced Digital Signal Processing Part 4: DFT and FFT

Laboratory Exercise #6

High Speed Reconfigurable FFT Processor Using Urdhava Thriyambakam

Implementation of High Throughput Radix-16 FFT Processor

HB0267 CoreFFT v7.0 Handbook

A Continuous-Flow Mixed-Radix Dynamically-Configurable FFT Processor

Design and Implementation of Pipelined Floating Point Fast Fourier Transform Processor

A Novel VLSI Based Pipelined Radix-4 Single-Path Delay Commutator (R4SDC) FFT

An FPGA Spectrum Sensing Accelerator for Cognitive Radio

CRUSH: Cognitive Radio Universal Software Hardware

A Novel Architecture for Radix-4 Pipelined FFT Processor using Vedic Mathematics Algorithm

International Journal of Engineering Research-Online A Peer Reviewed International Journal

We are IntechOpen, the first native scientific publisher of Open Access books. International authors and editors. Our authors are among the TOP 1%

AN10943 Decoding DTMF tones using M3 DSP library FFT function

THE combination of the multiple-input multiple-output

The Cell Processor Computing of Tomorrow or Yesterday?

VLSI implementation of high speed and high resolution FFT algorithm based on radix 2 for DSP application

Equipment Required. WaveRunner Zi series Oscilloscope 10:1 High impedance passive probe. 3. Turn off channel 2.

THERE is significant research activity to minimize energy

OpenCL for FPGAs/HPC Case Study in 3D FFT

A 128-Point FFT/IFFT Processor for MIMO-OFDM Transceivers a Broader Survey

International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE) Volume 6, Issue 6, June 2017

mach 5 100kHz Digital Joulemeter: M5-SJ, M5-IJ and M5-PJ Joulemeter Probes

RS485 MODBUS Module 8AI

Third-Party Research & Development Technical Discourse. Application of cirqed RA for FFT Analysis Part I Identifying Real And Aliased FFT Components

Implementing a Reliable Leak Detection System on a Crude Oil Pipeline

Advanced Hardware and Software Technologies for Ultra-long FFT s

Power Management Projects

Reconfigurable FPGA-Based FFT Processor for Cognitive Radio Applications

A High-Throughput Radix-16 FFT Processor With Parallel and Normal Input/Output Ordering for IEEE c Systems

Batt-Safe Battery String Monitoring System

ProSmart Predictive Condition Monitoring

Chapter 6. Alarm History Screen. Otasuke GP-EX! Chapter 6 Alarm History Screen 6-0. Alarm History Screen 6-1. Display Alarm History in List 6-2

TESS WIRELESS SENSOR TAG DEMO V1.1

EFFICIENT ARCHITECTURE FOR PROCESSING OF TWO INDEPENDENT DATA STREAMS USING RADIX-2 FFT

FlameGard 5 MSIR HART

TESS WIRELESS SENSOR TAG DEMO V1.1

FREQUENCY ENHANCEMENT OF DUAL-JUNCTION THERMOCOUPLE PROBES

Application Note 35. Fast Fourier Transforms Using the FFT Instruction. Introduction

An Area Efficient 2D Fourier Transform Architecture for FPGA Implementation

Programming the Cell BE

AND DYNAMIC CIRCUITS ARE USED IN CRITICAL AREAS. TIGHT COUPLING OF THE INSTRUCTION SET ARCHITECTURE, MICROARCHITECTURE, AND PHYSICAL

GOLAY COMPLEMENTARY CODES, DOUBLE PULSE REPETITION FREQUENCY TRANSMISSION I. TROTS, A. NOWICKI, M. LEWANDOWSKI J. LITNIEWSKI, W.

Speech parameterization using the Mel scale Part II. T. Thrasyvoulou and S. Benton

ARGUS SERVER SYSTEM. Inspired Systems Pty Ltd 70 Mordaunt Circuit, Canning Vale, Western Australia Ph Fax

ICS Regent. Fire Detector Input Modules PD-6032 (T3419)

Smart GPS Antenna A1035-D

Temperature + Humidity (%RH) Control & Recording System with 8 / 16 Channel Mapping. Touch Panel PPI

The new generation Teledyne NIR detectors for the SNAP/JDEM mission spectrograph.

Specifications Wavelength Range: 190 nm to 800 nm continuous

A temporal active fire detection algorithm applied to geostationary satellite observations

Samsung SDS BMS Ver.2.0. Technical Specification

Carlos Borda Omnisens S.A. Subsea Asia Conference June 2014

Distributed Rayleigh scatter dynamic strain sensing above the scan rate with optical frequency domain reflectometry

VLSI Design EEC 116. Lecture 1. Bevan M. Baas Thursday, September 28, EEC 116, B. Baas 1

Time-Frequency Analysis

On-chip optical detectors in Standard CMOS SOI

Control System Type ABS PCx COMLI/Modbus

FL500 Modbus Communication Operating Manual. Order No.: /00. MSAsafety.com

EasyLog Data Logger Series

PHYS 3152 Methods of Experimental Physics I E5. Frequency Analysis of Electronic Signals

Easily Testable and Fault-Tolerant FFT Butterfly Networks

New Seismic Unattended Small Size Module for Foot-Step Detection

Functional Safety Experience on Railway Signalling in Japan. Yuji Hirao Nagaoka University of Technology (Japan)

RS485 Integrators Guide

MicroTech Series 200 Centrifugal Chiller

HM-4201-RTCM High-Temp Real Time Clock Module

Microgrid Fault Protection Based on Symmetrical and Differential Current Components

8-Channel Thermocouple Input Ethernet Data Acquisition Module

A Comparative Study of Different FFT Architectures for Software Defined Radio

Wind power Condition Monitoring. Condition Monitoring Systems (CMS) in wind turbines CMS. Authors: Martin Kluge Michael Danitschek

for Microelectronics Packaging Dave Vallett PeakSource Analytical, LLC Fairfax, Vermont, USA

The SLAC Detector R&D Program

NETx Voyager Visualization

Laser Scan Micrometer Page 418. Laser Scan Micrometer Page 423

WHAT IS LASER ULTRASONIC TESTING?

NYC STEM Summer Institute: Climate To Go! Infrared detection of carbon dioxide

state of the art methane leak detection CHARM and GasCam 2011 October 13 th Dr. Axel Scherello

OPERATION GUIDE MODEL CERV-001-PARTA, CERV-001-PARTB CONDITIONING ENERGY RECOVERY VENTILATOR (CERV)

DIRA S.L. (Desenvolupament, Investigació i Recerca Aplicada S.L.)

Building and Characterizing 14GHz InGaAs Fiber Coupled Photodiodes

Learning from High Energy Physics Data Acquisition Systems Is this the future of Photon Science DAQ?

EasyLog Data Logger Series

Methodology of Implementing the Pulse code techniques for Distributed Optical Fiber Sensors by using FPGA: Cyclic Simplex Coding

ELECTRICAL IMPEDANCE TOMOGRAPHY

Scanning Laser Range Finder Smart-URG mini UST-10LN Specification

Design and Implementation of Real-Time 16-bit Fast Fourier Transform using Numerically Controlled Oscillator and Radix-4 DITFFT Algorithm in VHDL

How to Use Fire Risk Assessment Tools to Evaluate Performance Based Designs

2008 JINST 3 S The CMS experiment at the CERN LHC THE CERN LARGE HADRON COLLIDER: ACCELERATOR AND EXPERIMENTS.

PC474 Lab Manual 1. Terry Sturtevanta and Hasan Shodiev 2. Winter 2012

Cadence Design of clock/calendar using 240*8 bit RAM using Verilog HDL

Advanced Autoclave Controller with Recording + 4 Channel Mapping + Pressure Indication with PC Software & Printer Module

CUG2016. On Enhancing 3D-FFT. Performance in VASP. Florian Wende. Zuse Institute Berlin. Martijn Marsman, Thomas Steinke. London, UK, May 8-12

RESOLUTION MSC.86(70) (adopted on 8 December 1998) ADOPTION OF NEW AND AMENDED PERFORMANCE STANDARDS FOR NAVIGATIONAL EQUIPMENT

Transcription:

A 0.75 Million Point Fourier Transform Chip for Frequency Sparse Signals Ezz El Din Hamed Omid Abari, Haitham Hassanieh, Abhinav Agarwal, Dina Katabi, Anantha Chandrakasan, Vladimir Stojanović International Solid-State Circuits Conference 1of 42

Leverage Sparsity Output of FFT is Sparse: Most frequencies have zero energy or noise Amplitude Frequency Only few frequencies are active and have energy Use sparse FFT algorithm Find and compute the active frequencies International Solid-State Circuits Conference 2of 42

Applications Spectrum Sensing GPS Radio Astronomy GHz spectrum sensing and acquisition Pattern matching by convolution with long code (e.g. GPS) Radio astronomy: data compression International Solid-State Circuits Conference 3of 42

Wideband Spectrum Sensing 100 80 Occupancy Graph Seattle January 7, 2013 (Microsoft Spectrum Observatory) Occupancy (%) 60 40 20 0 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 Frequency (GHz) International Solid-State Circuits Conference 4of 42

Sparse FFT Algorithm Outline Micro architecture and Implementation Measurement Results Conclusion International Solid-State Circuits Conference 5of 42

How Does Sparse FFT Work? Values V1 V2 V3 0 1 2 3 4 5 6 7 8 9 10 11 Frequencies 1. Bucketize 2. Estimate 3. Resolve collisions International Solid-State Circuits Conference 6of 42

How Does Sparse FFT Work? Values V1 V2 V3 V2+V3 V1 0 1 2 3 4 5 6 7 8 9 10 11 1. Bucketize Frequencies 0 1 2 3 Buckets Sub sampling time Aliasing the frequencies International Solid-State Circuits Conference 7of 42

How Does Sparse FFT Work? Values V1 V2 V3 V2+V3 V1 0 1 2 3 4 5 6 7 8 9 10 11 2. Estimate Frequencies 0 1 2 3 Buckets International Solid-State Circuits Conference 8of 42

How Does Sparse FFT Work? Values V1 V2 V3 V2+V3 V1 0 1 2 3 4 5 6 7 8 9 10 11 2. Estimate Frequencies Repeat bucketization with a time shift 0 1 2 3 Buckets Time Domain Freq Domain Phase Rotation International Solid-State Circuits Conference 9of 42

How Does Sparse FFT Work? Values V1 V2 V3 V2+V3 V1 0 1 2 3 4 5 6 7 8 9 10 11 Frequencies 3. Resolve Collisions Repeat bucketization with a different subsampling rate (co prime) 0 1 2 3 Buckets V1+V3 V2 0 1 2 Buckets International Solid-State Circuits Conference 10 of 42

Implementation of Sparse FFT Sparse FFT size: N=3 6 x2 10 =746496 Subsample by 3 6 in time FFT of size 1024 (2 10 ) Subsample by 2 10 in time FFT of size 729(3 6 ) Sparsity: outputs up to 750 active frequencies Output resolution: 12 bit real, 12 bit imaginary International Solid-State Circuits Conference 11 of 42

Sparse FFT Chip Block Diagram Memory 1 2 10 3 6 Memory 2 2 10 3 6 Memory Interface Memory 3 2 10 3 6 Subsamples of new signal 2 10 -point FFT Scheduler Update Recovered freq. of processed signal I/O interface 3 6 -point FFT Bucketization Estimation Collision Detection Reconstruction 3 Pipelined Stages International Solid-State Circuits Conference 12 of 42

Sparse FFT Chip Block Diagram Memory 1 2 10 3 6 Memory 2 2 10 3 6 Memory Interface Memory 3 2 10 3 6 Subsamples of new signal 2 10 -point FFT Scheduler Update Recovered freq. of processed signal I/O interface 3 6 -point FFT Bucketization Estimation Collision Detection Reconstruction Input: sub sampled signal (by 2 10 and 3 6 ) Output: positions and complex values of active frequencies International Solid-State Circuits Conference 13 of 42

Sparse FFT Chip Block Diagram Memory 1 2 10 3 6 Memory 2 2 10 3 6 Memory Interface Memory 3 2 10 3 6 Subsamples of new signal 2 10 -point FFT Scheduler Update Recovered freq. of processed signal I/O interface 3 6 -point FFT Bucketization Estimation Collision Detection Reconstruction 3 Memory sets, 2 SRAM blocks each Rotated across the 3 pipeline stages International Solid-State Circuits Conference 14 of 42

Sparse FFT Chip Block Diagram Memory 1 2 10 3 6 Memory 2 2 10 3 6 Memory Interface Memory 3 2 10 3 6 Subsamples of new signal 2 10 -point FFT Scheduler Update Recovered freq. of processed signal I/O interface 3 6 -point FFT Bucketization Estimation In place FFT Radix 4 butterfly for 2 10 point FFT Radix 3 butterfly for 3 6 point FFT Three FFTs of each type sharing the memory Collision Detection Reconstruction International Solid-State Circuits Conference 15 of 42

2 10 FFT Architecture Three radix 4 butterflies running in parallel Two 12b x 12b real multipliers per butterfly Multiplexing Real and Imaginary International Solid-State Circuits Conference 16 of 42

2 10 FFT Architecture BF BF 1024 4 Butterflies Memory BF Memory BF Memory Memory BF Frame log 1024 frames In place 1024 point FFT using a radix 4 butterfly 1024 additions (10 extra bits to account for overflow) International Solid-State Circuits Conference 17 of 42

Dynamic Scaling Block Floating Point Writing to memory every 2 clock cycles Current Frame Exponent 12 Real Imag 12 Decision for next frame Tracking Overflow MSBs Current Frame Adjustment Reduce the required bit width Tracking large values to avoid unnecessary shifting >> >> 14 14 MUX MUX Real 1,2,3 & 4 Imag 1,2,3 & 4 International Solid-State Circuits Conference 18 of 42

Sparse FFT Chip Block Diagram Memory 1 2 10 3 6 Memory 2 2 10 3 6 Memory Interface Memory 3 2 10 3 6 Subsamples of new signal 2 10 -point FFT Scheduler Update Recovered freq. of processed signal I/O interface 3 6 -point FFT Bucketization Estimation Collision Detection Reconstruction Reconstruction stage has 4 sub blocks International Solid-State Circuits Conference 19 of 42

Sparse FFT Chip Block Diagram Memory 1 2 10 3 6 Memory 2 2 10 3 6 Memory Interface Memory 3 2 10 3 6 Subsamples of new signal 2 10 -point FFT Scheduler Update Recovered freq. of processed signal I/O interface 3 6 -point FFT Bucketization Estimation Collision Detection Reconstruction International Solid-State Circuits Conference 20 of 42

Estimation Sub block Estimate frequency position where 0 1 For 0.75 million Use 20 bits to represent International Solid-State Circuits Conference 21 of 42

Estimation Sub block Estimate frequency position where 0 1 For 0.75 million Use 20 bits to represent For 2 10 bucketization: frequency aliases to bucket number : 10 LSBs of can be estimated from bucket number 10 remaining MSBs are estimated using phase rotation International Solid-State Circuits Conference 22 of 42

Estimation Sub block Use phase rotation from a time shift to estimate the 10 MSBs of : 2 2 International Solid-State Circuits Conference 23 of 42

Estimation Sub block Use phase rotation from a time shift to estimate the 10 MSBs of : 2 2 If 1,phase wraps around 2 create same phase rotation loose highest MSBs Use time shift to get MSBs International Solid-State Circuits Conference 24 of 42

Estimation Sub block Use phase rotation from a time shift to estimate the 10 MSBs of : 2 If 1,phase wraps around 2 create same phase rotation loose highest MSBs Use time shift to get MSBs 2 Noise in creates error Use large time shift to average noise in International Solid-State Circuits Conference 25 of 42

Estimation Sub block Use phase rotation from a time shift to estimate the 10 MSBs of : 2 2 If 1,phase wraps around 2 create same phase rotation loose highest MSBs Use time shift to get MSBs Noise in creates error Use large time shift to average noise in Use two different time shifts and to estimate the 10 MSBs of International Solid-State Circuits Conference 26 of 42

Estimation Sub block Frequency Recovery from 2 10 bucketization : 20 bits 5 bits 5 bits 10 bits from shift 1 from shift 32 from bucket number International Solid-State Circuits Conference 27 of 42

Estimation Sub block Frequency Recovery from 2 10 bucketization : 20 bits 5 bits 5 bits 10 bits from shift 1 from shift 32 from bucket number Problem: Small error propagates Affect MSBs Solution: Use 3 bit overlap to detect errors 8 bits 8 bits 3 bits 3 bits 10 bits International Solid-State Circuits Conference 28 of 42

Estimation Sub block Estimation Complex values of 3 FFTs CORDIC Step 1 Frequency Recovery + + 7:3 2:0 7:5 4:0 Frequency Recovery + 12:8 1 7:0 Phase Rotation 2 Estimated Frequency International Solid-State Circuits Conference 29 of 42

Estimation Sub block Estimation Complex values of 3 FFTs CORDIC Step 2 Frequency Recovery 2 3 12:0 Phase Rotation 2 X Bucket number 12:3 2:0 9:7 6:0 Frequency Recovery Estimated Frequency + 1 19:10 9:0 20 bits International Solid-State Circuits Conference 30 of 42

Estimation Sub block For 3 6 bucketization: frequency aliases to bucket number : 6 LSBs cannot be replaced directly by the bucket number Naïve implementation: 1. Estimate from the phase rotation 2. Calculate the remainder mod 3 3. Subtract this remainder from 4. Then add the bucket number b 2 Steps 2 and 3 are done indirectly by truncating the LSBs of International Solid-State Circuits Conference 31 of 42

Estimation Sub block Estimation Complex values of 3 FFTs CORDIC Step 2 2 3 12:3 2:0 Phase Rotation + X 1 Frequency Recovery X 9:7 Bucket number 6:0 2 1 2 3 Frequency Recovery Estimated Frequency + 19:0 20 bits International Solid-State Circuits Conference 32 of 42

Sparse FFT Chip Block Diagram Memory 1 2 10 3 6 Memory 2 2 10 3 6 Memory Interface Memory 3 2 10 3 6 Subsamples of new signal 2 10 -point FFT Scheduler Update Recovered freq. of processed signal I/O interface 3 6 -point FFT Bucketization Estimation Collision Detection Reconstruction International Solid-State Circuits Conference 33 of 42

No Collision: Collision Detection Single active frequency Time shift causes only change in phase and not in magnitude of the bucket International Solid-State Circuits Conference 34 of 42

No Collision: Collision Detection Single active frequency Time shift causes only change in phase and not in magnitude of the bucket Collision: Multiple active frequencies sum up with different phase rotations Time shift causes the magnitude of the bucket to change International Solid-State Circuits Conference 35 of 42

Die photo Technology Core Area Core Supply Voltage 45nm SOI CMOS 0.6mm x 1.0mm 0.66 1.18 V Clock Frequency 0.5 1.5 GHz Core Power Latency 14.6 174.8 mw 63 21 μs International Solid-State Circuits Conference 36 of 42

Measurement Result International Solid-State Circuits Conference 37 of 42

Measurement Result (0.66 V, ) 48 sfft/ms 298 nj/sfft International Solid-State Circuits Conference 38 of 42

Measurement Result 88x Speed up over intel i7 (3.4GHz) (1.18 V,. ) 146 sfft/ms 1.2 µj/sfft International Solid-State Circuits Conference 39 of 42

Measurement Result At 0.66V: Effective throughput 36GS/s Consuming 0.4pJ/sample International Solid-State Circuits Conference 40 of 42

Comparison ISSCC 11 JSSC 08 JSSC 12 This work Technology 65 nm 90 nm 65nm 45 nm Signal Type Any Signal Any Signal Any Signal Freq. Sparse signal Size 2 10 2 8 2 7 to 2 11 3 6 x2 10 Word width 16 bits 10 bits 12 bits 12 bits Area 8.29mm 2 5.1mm 2 1.37mm 2 0.6mm 2 Effective 240MS/s 2.4GS/s 1.25 20MS/s 36 109GS/s Throughput Energy/Sample 17.2pJ 50pJ 19.5 50.6pJ 0.4 1.6pJ International Solid-State Circuits Conference 41 of 42

Comparison ISSCC 11 JSSC 08 JSSC 12 This work Technology 65 nm 90 nm 65nm 45 nm Signal Type Any Signal Any Signal Any Signal Freq. Sparse signal Size 2 10 2 8 2 7 to 2 11 3 6 x2 10 Word width 16 bits 10 bits 12 bits 12 bits Area 8.29mm 2 5.1mm 2 1.37mm 2 0.6mm 2 Effective 240MS/s 2.4GS/s 1.25 20MS/s 36 109GS/s Throughput Energy/Sample 17.2pJ 50pJ 19.5 50.6pJ 0.4 1.6pJ Sparse FFT works only for frequency sparse signals International Solid-State Circuits Conference 42 of 42

Comparison ISSCC 11 JSSC 08 JSSC 12 This work Technology 65 nm 90 nm 65nm 45 nm Signal Type Any Signal Any Signal Any Signal Freq. Sparse signal Size 2 10 2 8 2 7 to 2 11 3 6 x2 10 Word width 16 bits 10 bits 12 bits 12 bits Area 8.29mm 2 5.1mm 2 1.37mm 2 0.6mm 2 Effective 240MS/s 2.4GS/s 1.25 20MS/s 36 109GS/s Throughput Energy/Sample 17.2pJ 50pJ 19.5 50.6pJ 0.4 1.6pJ International Solid-State Circuits Conference 43 of 42

Comparison ISSCC 11 JSSC 08 JSSC 12 This work Technology 65 nm 90 nm 65nm 45 nm Signal Type Any Signal Any Signal Any Signal Freq. Sparse signal Size 2 10 2 8 2 7 to 2 11 3 6 x2 10 Word width 16 bits 10 bits 12 bits 12 bits Area 8.29mm 2 5.1mm 2 1.37mm 2 0.6mm 2 Effective 240MS/s 2.4GS/s 1.25 20MS/s 36 109GS/s Throughput Energy/Sample 17.2pJ 50pJ 19.5 50.6pJ 0.4 1.6pJ International Solid-State Circuits Conference 44 of 42

Comparison ISSCC 11 JSSC 08 JSSC 12 This work Technology 65 nm 90 nm 65nm 45 nm Signal Type Any Signal Any Signal Any Signal Freq. Sparse signal Size 2 10 2 8 2 7 to 2 11 3 6 x2 10 Word width 16 bits 10 bits 12 bits 12 bits Area 8.29mm 2 5.1mm 2 1.37mm 2 0.6mm 2 Effective 240MS/s 2.4GS/s 1.25 20MS/s 36 109GS/s Throughput Energy/Sample 17.2pJ 50pJ 19.5 50.6pJ 0.4 1.6pJ International Solid-State Circuits Conference 45 of 42

Conclusion 0.75 million point sparse FFT in 45nm CMOS IBM SOI More than 40x better energy efficiency than traditional FFTs 88x improvement in run time over the software implementation International Solid-State Circuits Conference 46 of 42

International Solid-State Circuits Conference 47 of 42