### Mid Semester Presentation - High Speed Digital Systems Lab

```Technion - Israel institute of technology
department of Electrical Engineering
High speed digital systems laboratory
High-Throughput FFT
Student : Andrey Kuyel
Supervised by Mony Orbach
Spring 2011
Midterm Presentation (One semestrial project)
Project goals
• The project goal is to design and implement on
FPGA device FFT that capable to deal with data
transmitted at the rates up to 10Ms/sec*.
• The design will be written on VHDL
• The project has aspects of: signal processing
and logic design and high rate data processing.
*- 5Ms/sec for each of I and Q components .
FFT - Theoretic overview
The DFT (N- length vector) definition is:
N 1
X  k    x  nWN nk ;0  k  N  1
n 0
The time-complexity of the DFT is:
,WNk
 2 k 
exp i

 N 
 N 2 
The FFT algorithm (developed at first by J.W. Cooley and John Tukey at 1965) comes
to reduce the time-complexity of DFT into   N log N 
This algorithm called: "The Cooley–Tukey radix-2 FFT algorithm".
It is one of the most common FFT algorithms.
FFT radix-2 - Theoretic overview
The Cooley–Tukey radix-2 FFT algorithm
The idea of algorithm is to compute in two parts:
1. Calculation two series
Y k 
Y k , Z k 
N
1
2
 x nW
n 0
even
 nk
N
2
, Z k 
While:
N
1
2
 x nW
n 0
odd
 nk
N
2
: The first series is using all of the even components of the input vector -
x  2n , n  0,1, 2...
 N  1
x  2n  1 , n  1, 2...
2
N
2
, and the second uses the odd components This calculation takes
 log2 N 
FFT radix-2 - Theoretic overview
Calculation of the twiddles:
WNk
 2 k 
exp i

 N 
The FFT (N- length vector) definition is:
N 
X  k   Y  k   Z  k WN k ;0  k    1 ;
2

N

N 
X  k    Y  k   Z  k WN k ;0  k    1 ;
2

2

FFT radix-2 - Theoretic overview
The FFT schematic Radix-2 diagram (for N=2) – called butterfly unit:
4 real multipliers
16*16 bit
The butterfly unit is used as a sub-unit in any Radix-2 FFT-N-size unit.
The "twiddle factor" is the sine/cosine imaginary/real factor:
WNk
 2 k 
exp i

 N 
The FFT (N=8) radix 2 data flow
C language program for creating FFT data flow
for N points FFT
FFT core features
FFT core will have the following features:
• Real and imaginary Inputs: 8 bits width each (for each of N point).
• Real and imaginary outputs: 16 bits width each, where 8 MSB bits for integer part and 8
LSB bits for fractional part (for each N point).
• Drop-in module for Virtex-6 (xc6vlx240T)
• Forward complex FFT
• Transform sizes N = 16,32,(possibly 64 and 128 )
• Arithmetic type: Fixed-point
• Truncation after the butterfly
• Block RAM or/and Distributed RAM for data storage
• Bit/digit reversed or natural output order
• Input data at frequency 10 Gs/sec (total rate for real and image part of data )
Parallel N points radix 2 FFT – Block diagram
clock
Synchronization signals
Controls signals
nput data (real).
N points each of 8 bits width
Input data (imag).
N points each of 8 bits width
Output data (real).
N points each of 16 bits width
Parallel N points radix 2 FFT
Input data (imag).
N points each of 16 bits width
The FFT (N=16) radix 2 data flow – pipeline stages separation
FFT verification and performance measurement
The FFT core will be tested on Virtex-6 FPGA (Ml605 based on xc6vlx240T)
Test module
Virtex-6 FPGA
[email protected]/* <![CDATA[ */!function(t,e,r,n,c,a,p){try{t=document.currentScript||function(){for(t=document.getElementsByTagName('script'),e=t.length;e--;)if(t[e].getAttribute('data-cfhash'))return t[e]}();if(t&&(c=t.previousSibling)){p=t.parentNode;if(a=c.getAttribute('data-cfemail')){for(e='',r='0x'+a.substr(0,2)|0,n=2;a.length-n;n+=2)e+='%'+('0'+('0x'+a.substr(n,2)^r).toString(16)).slice(-2);p.replaceChild(document.createTextNode(decodeURIComponent(e)),c)}p.removeChild(t)}}catch(u){}}()/* ]]> *//sec*
Data in
Controls in
Data out
FFT core
Output signals
Controls out
Stimulus
Memory
MEM [email protected]/* <![CDATA[ */!function(t,e,r,n,c,a,p){try{t=document.currentScript||function(){for(t=document.getElementsByTagName('script'),e=t.length;e--;)if(t[e].getAttribute('data-cfhash'))return t[e]}();if(t&&(c=t.previousSibling)){p=t.parentNode;if(a=c.getAttribute('data-cfemail')){for(e='',r='0x'+a.substr(0,2)|0,n=2;a.length-n;n+=2)e+='%'+('0'+('0x'+a.substr(n,2)^r).toString(16)).slice(-2);p.replaceChild(document.createTextNode(decodeURIComponent(e)),c)}p.removeChild(t)}}catch(u){}}()/* ]]> *//sec
Currently implementation will be made on Virtex-6 FPGA Family
(xc6vlx240T), the devise contains 768 DSP slices, distributed RAM 3,650
Kb and block RAM of the size 14,976 Kb.
See attached specification table at the end of document.
*- Primary verification will be done on the reference data from FPGA
internal memory that will emulate data sampled at the rate 10Ms/sec
Post synthesis report
• Maximum operational frequency up to 350MHz (5.6
Gs/sec per component).
•
The design have:
 8*4*4=128 real (16*16 bits) multipliers. Where one 16*16 bits
multiplier take 280 6LUTs
 128 (16 bits) adders/subtractors each of the takes 16 of 6LUTS
Simulation and verification
• C program for generating radix 2 data flow
• Radix 2 FFT N points parallel C language model
for verification VHDL implementation of
design.
Project Schedule
Weeks
1-2
 Done
3
DFT , FFT Background studies
 Done
4
Studding different FFT algorithms and
implimintations
 Done
5
Studding hardware available for the project
implementation
 Done
6
Specification presentation, defining interface
signals.
 Done
7
Defining FFT core architecture. (Composing block
diagram that take advantage of hardware that FFT
core will be tested on )
 Done
8
Defining FSM (controllers) of the design and
implementing them on VHDL. (Testing their
functionality)
In progress
9
Writing VHDL code for data path adapted for Virtex6 architecture. (Virtex-6 DSP slices and memory
available for the core)
Partial
Project Schedule (continued…)
Weeks
10
Continuing task from week 9.
Writing VHDL code for data path adapted for Virtex6 architecture. (Virtex-6 DSP slices and memory
available for the core)
In progress
11
Midterm presentation
 Done
12-15
Finishing VHDL implementation and beginning of
verification
16-17
Writing final presentation and project book
```