### mkl

```Introduction to Parallel Computing
Intel Math Kernel Library
Huan-Ting Yen, Department of Mathematics, National Taiwan University
2011/07/22
Parallel Computing
What is parallel computing?
Traditionally, software has been written for serial
computation:

In the simplest sense, parallel computing is the
simultaneous use of multiple compute resources to solve
a computational problem:

Resource
The compute resource can be:
- A single computer with multiple processors;
- An arbitrary number of computers connected by a network;
- A combination of both.
[Figures: a single computer with four cores (Core 1 to Core 4); several multi-core computers connected by a network]
Why use parallel computing?
The primary reasons for using parallel computing:
- Save time (wall-clock time)
- Solve larger problems
- Provide concurrency (do many things at the same time)
Other reasons might include:
- Cost savings
- Overcoming memory constraints
Amdahl’s Law
The speedup of a parallel program is limited by the amount of serial work.
If a fraction p of the run time can be parallelized over N processors,
speedup = 1 / ((1 - p) + p / N), which can never exceed 1 / (1 - p).
Flynn’s Taxonomy
Classification for parallel computers and programs

                Single Instruction             Multiple Instruction
Single Data     SISD (single core CPU)         MISD (very rare)
Multiple Data   SIMD (GPU/vector processor)    MIMD (multiple core CPU)

[Figures: block diagrams of SISD, SIMD, MISD and MIMD]
Intel Math Kernel Library
Overview


The Intel® Math Kernel Library (Intel® MKL) provides
Fortran routines and functions that perform a wide
variety of operations on vectors and matrices including
sparse matrices. The library also includes fast Fourier
transform (FFT) functions, as well as vector mathematical
and vector statistical functions with Fortran and C
interfaces.
The versions of Intel MKL intended for Windows* and
Linux* operating systems also include ScaLAPACK
software and Cluster FFT software for solving respective
computational problems on distributed-memory parallel
computers.
Intel MKL Quickstart
Intel MKL: Intel Math Kernel Library
Functionality
- BLAS and Sparse BLAS Routines
- LAPACK Routines: Linear Equations
- LAPACK Routines: Eigenvalue Problems
- ScaLAPACK
- Sparse Solver Routines
- Fast Fourier Transforms
- Cluster Fast Fourier Transforms
System Requirements (Hardware)
Hardware:
- Intel® Core™ processor family
- Intel® Xeon® processor family
- Intel® Pentium® 4 processor family
- Intel® Pentium® III processor
- Intel® Pentium® processor (300 MHz or faster)
- Intel® Celeron® processor
- AMD Athlon* and Opteron* processors
How do you find information about the CPUs?
$ cat /proc/cpuinfo
System Requirements (Software)
Supported operating systems:
- Red Hat* Enterprise Linux* 3, 4, 5
- Red Hat* Fedora* 9
- Debian* GNU/Linux 4.0
- Ubuntu* 8.04
How do you find information about the operating system?
$ cat /etc/*release
Supported C/C++ and Fortran compilers:
- Intel® Fortran Compiler 10.1 for Linux*
- Intel® C++ Compiler 10.1 for Linux*
- GNU Compiler Collection (gcc, g77, gfortran 4.2.0)
Installing Intel MKL on a Linux* System
[Figures: installation screenshots]
Download the package and unpack it:
~/software$ wget "URL"
~/software$ ll
$ tar -zxvf l_mkl_p_10.2.x.yyy.tar.gz
Run the installer:
$ cd l_mkl_p_10.2.x.yyy
$ ./install.sh
Some Examples
Example
Brief examples of:
- BLAS Level 1 Routines (vector-vector operations)
- BLAS Level 2 Routines (matrix-vector operations)
- BLAS Level 3 Routines (matrix-matrix operations)
- Compute the LU factorization of a matrix (LAPACK)
- Solve linear system (LAPACK)
- Solve eigen system (LAPACK)
- Fast Fourier Transforms
Ex1. The complex dot product ( res = sum_i conjg(x(i)) * y(i) )
#include <stdio.h>
#include "mkl_blas.h"

#define N 5

int main()
{
    int n = N, incx = 1, incy = 1, i;
    /* MKL_Complex16 (fields real, imag) is declared in mkl_types.h,
       which mkl_blas.h includes; it matches the zdotc prototype. */
    MKL_Complex16 x[N], y[N], res;

    for( i = 0; i < n; i++ ){
        x[i].real = (double)i;       x[i].imag = (double)i * 2.0;
        y[i].real = (double)(n - i); y[i].imag = (double)i * 2.0;
    }

    /* Fortran-style interface: every argument is passed by address */
    zdotc( &res, &n, x, &incx, y, &incy );
    printf( "The complex dot product is: ( %6.2f, %6.2f )\n", res.real, res.imag );
    return 0;
}
?dotc
Computes a dot product of a conjugated vector with another vector.
Description: the routine is declared in
- Fortran77: mkl_blas.fi
- Fortran95: blas.f90
- C: mkl_blas.h
Input parameters ( zdotc(&res,&n,x,&incx,y,&incy) )
- n: The number of elements in each of the two vectors.
- incx: Specifies the increment for the elements of x.
- incy: Specifies the increment for the elements of y.
Output parameters ( zdotc(&res,&n,x,&incx,y,&incy) )
- res: The final result.
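Makefile (Sequential)
Test = blas_c
CC = icc
MKL_HOME = /home/opt/intel/mkl/10.2.2.025
MKL_INCLUDE = $(MKL_HOME)/include
MKL_PATH = $(MKL_HOME)/lib/em64t
EXE = blas_c.exe
# The slide lists no link libraries. The sequential set below is an assumption
# based on the usual MKL 10.x layered link line; check it against the User's Guide.
blas_c:
	$(CC) -o $(EXE) blas_c.c -I$(MKL_INCLUDE) -L$(MKL_PATH) \
	-Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group -lpthread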
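Makefile (Parallel)
Test = blas_c
CC = icc
MKL_HOME = /home/opt/intel/mkl/10.2.2.025
MKL_INCLUDE = $(MKL_HOME)/include
MKL_PATH = $(MKL_HOME)/lib/em64t
EXE = blas_c.exe
# The slide cuts the link line off after -lmkl_core. The threading layer,
# --end-group and run-time libraries below are an assumption based on the
# usual MKL 10.x layered link line; check them against the User's Guide.
blas_c:
	$(CC) -o $(EXE) blas_c.c -I$(MKL_INCLUDE) -L$(MKL_PATH) \
	-Wl,--start-group -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -Wl,--end-group \
	-liomp5 -lpthread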
BLAS Routines
Routines Naming Conventions
BLAS routine names have the following structure:
<character> <name> <mode> ()

The <character> field indicates the data type:
  s   real, single precision
  c   complex, single precision
  d   real, double precision
  z   complex, double precision

The <mode> field, where present, gives additional detail about the operation:
  c   conjugated vector
  u   unconjugated vector
  g   Givens rotation

In BLAS level 2 and 3, the <name> field indicates the matrix type:
  ge  general matrix
  gb  general band matrix
  sy  symmetric matrix
  sb  symmetric band matrix
  he  Hermitian matrix
  hb  Hermitian band matrix
  tr  triangular matrix
  tb  triangular band matrix
BLAS Level 1 Routines
Routine   Data Type            Description
?asum     s, d, sc, dz         Sum of vector magnitudes
?axpy     s, d, c, z           Scalar-vector product
?copy     s, d, c, z           Copy vector
?dot      s, d                 Dot product
?dotc     c, z                 Dot product, conjugated
?nrm2     s, d, sc, dz         Vector 2-norm (Euclidean norm)
?rotg     s, d, c, z           Givens rotation of points
?rot      s, d, cs, zd         Plane rotation of points
?scal     s, d, c, z, cs, zd   Vector-scalar product
?swap     s, d, c, z           Vector-vector swap
i?amax    s, d, c, z           Index of the maximum absolute value element of a vector
Ex2. Matrix-vector product ( y = alpha*A*x + beta*y )
#include <stdlib.h>
#include "mkl_blas.h"

int main()
{
    int    m, n, incx, incy, lda, idxi, idxj;
    double alpha, beta, *x, *y, *A;
    char   trans;

    m     = 3;
    n     = 3;
    incx  = 1;
    incy  = 1;
    lda   = m;
    alpha = 1.0;
    beta  = 1.0;
    trans = 'n';

    x = (double*)malloc(sizeof(double)*n);   /* x has n elements when trans = 'n' */
    y = (double*)malloc(sizeof(double)*m);   /* y has m elements when trans = 'n' */
    A = (double*)malloc(sizeof(double)*m*n);

    for( idxi = 0; idxi < n; idxi++ )
        *(x+idxi) = 1.0;
    for( idxi = 0; idxi < m; idxi++ )
        *(y+idxi) = 1.0;

    /* BLAS expects column-major storage: element (i,j) lives at A[i + j*lda] */
    for( idxi = 0; idxi < m; idxi++ )
        for( idxj = 0; idxj < n; idxj++ )
            *(A+idxi+idxj*lda) = (double)(idxi+1) + idxj;

    dgemv(&trans, &m, &n, &alpha, A, &lda, x, &incx, &beta, y, &incy);

    free(x); free(y); free(A);
    return 0;
}
?gemv
Computes a matrix-vector product using a general matrix.
Description: the routine is declared in
- Fortran77: mkl_blas.fi
- Fortran95: blas.f90
- C: mkl_blas.h
Input parameters ( dgemv(&trans,&m,&n,&alpha,A,&lda,x,&incx,&beta,y,&incy) )
- trans: if trans = 'N' or 'n', then y = alpha*A*x + beta*y
         if trans = 'T' or 't', then y = alpha*A^T*x + beta*y
         if trans = 'C' or 'c', then y = alpha*A^H*x + beta*y (conjugate transpose)
- m: The number of rows of the matrix A.
- n: The number of columns of the matrix A.
- lda: The first dimension of the matrix A; lda >= max(1,m).
- incx: Specifies the increment for the elements of x.
- incy: Specifies the increment for the elements of y.
Output parameters
- y: Updated vector y.
Ex2. Result
[Figure: screenshot of the program output]
BLAS Level 2 Routines
Routine   Data Type    Description
?gemv     s, d, c, z   Matrix-vector product using a general matrix
?gbmv     s, d, c, z   Matrix-vector product using a general band matrix
?symv     s, d         Matrix-vector product using a symmetric matrix
?sbmv     s, d         Matrix-vector product using a symmetric band matrix
?hemv     c, z         Matrix-vector product using a Hermitian matrix
?hbmv     c, z         Matrix-vector product using a Hermitian band matrix
?trmv     s, d, c, z   Matrix-vector product using a triangular matrix
?tbmv     s, d, c, z   Matrix-vector product using a triangular band matrix
Ex3. Matrix-matrix product ( C = alpha*A*B + beta*C )
#include <stdlib.h>
#include "mkl_blas.h"

int main()
{
    int    m, n, k, lda, ldb, ldc, idxi, idxj;
    double alpha, beta, *A, *B, *C;
    char   transa, transb;

    m      = 3;
    n      = 3;
    k      = 3;
    lda    = m;
    ldb    = k;
    ldc    = m;
    alpha  = 1.0;
    beta   = 1.0;
    transa = 'n';
    transb = 'n';

    A = (double*)malloc(sizeof(double)*m*k);   /* A is m-by-k */
    B = (double*)malloc(sizeof(double)*k*n);   /* B is k-by-n */
    C = (double*)malloc(sizeof(double)*m*n);   /* C is m-by-n */

    /* m = n = k here, so one loop nest fills all three column-major matrices */
    for( idxi = 0; idxi < m; idxi++ )
        for( idxj = 0; idxj < n; idxj++ )
        {
            *(A+idxi+idxj*lda) = (double)(idxi+1) + idxj;
            *(B+idxi+idxj*ldb) = (double)(idxi+1) + idxj;
            *(C+idxi+idxj*ldc) = (double)(idxi+1) + idxj;
        }

    dgemm(&transa, &transb, &m, &n, &k,
          &alpha, A, &lda, B, &ldb, &beta, C, &ldc);

    free(A); free(B); free(C);
    return 0;
}
?gemm
Input parameters
- k: The number of columns of the matrix A and the number of rows of the matrix B.
- lda: When transa = 'N' or 'n', lda >= max(1,m); otherwise lda >= max(1,k).
- ldb: When transb = 'N' or 'n', ldb >= max(1,k); otherwise ldb >= max(1,n).
- ldc: The first dimension of the matrix C; ldc >= max(1,m).
Output parameters
- C: Overwritten by the m-by-n matrix alpha*A*B + beta*C.
Ex3. Result
[Figure: screenshot of the program output]
BLAS Level 3 Routines
Routine   Data Type    Description
?gemm     s, d, c, z   Matrix-matrix product of general matrices
?hemm     c, z         Matrix-matrix product of Hermitian matrices
?symm     s, d, c, z   Matrix-matrix product of symmetric matrices
?trmm     s, d, c, z   Matrix-matrix product of triangular matrices
Ex4. LU Factorization ( A = P*L*U )
#include <stdlib.h>
#include "mkl_lapack.h"

int main()
{
    int    m, n, lda, info, *ipiv;
    double *A;

    m   = 3;
    n   = 3;
    lda = m;

    ipiv = (int*)malloc(sizeof(int)*m);      /* needs min(m,n) entries */
    A    = (double*)malloc(sizeof(double)*m*n);

    /* LAPACK reads A column by column: each line below is one column of A */
    *(A+0)= 1; *(A+1)=2; *(A+2)=6;
    *(A+3)=-2; *(A+4)=3; *(A+5)=5;
    *(A+6)= 4; *(A+7)=8; *(A+8)=1;

    dgetrf(&m, &n, A, &lda, ipiv, &info);    /* info = 0 on success */

    free(ipiv); free(A);
    return 0;
}
?getrf
Description: the routine is declared in
- Fortran77: mkl_lapack.fi
- Fortran95: lapack.f90
- C: mkl_lapack.h
Input parameters
- m: The number of rows of the matrix A.
- n: The number of columns of the matrix A.
- lda: The first dimension of the matrix A.
- A: Array, REAL for sgetrf, DOUBLE PRECISION for dgetrf,
     COMPLEX for cgetrf, DOUBLE COMPLEX for zgetrf.
Output parameters
- A: Overwritten by L and U. The unit diagonal elements of L are not stored.
- ipiv: An integer array, dimension at least max(1,min(m,n)).
        The pivot indices; row i is interchanged with row ipiv(i).
- info: Integer.
        If info = 0, the execution is successful.
        If info = -i, the i-th parameter had an illegal value.
        If info = i, u_ii = 0. The factorization has been completed, but U is singular.
Ex4-1. Result
[Figure: screenshot of the program output]
Ex4-2. Result
[Figure: screenshot of the program output]
LAPACK Computational Routines
Task                                          General   Symmetric    Symmetric           Triangular
                                              matrix    indefinite   positive-definite   matrix
Factorize matrix                              ?getrf    ?sytrf       ?potrf
Solve linear system with a factored matrix    ?getrs    ?sytrs       ?potrs              ?trtrs
Condition number                              ?gecon    ?sycon       ?pocon              ?trcon
Compute the inverse using the factorization   ?getri    ?sytri       ?potri              ?trtri
LAPACK Routines: Linear Equations
To solve a particular problem, you can call two or more
computational routines or call a corresponding driver routine
that combines several tasks in one call.
For example, to solve a system of linear equations with a general
matrix, call ?getrf (LU factorization) and then ?getrs
(computing the solution). Alternatively, use the driver routine
?gesv, which performs all these tasks in one call.
Ex5. Solve the Linear Equation ( Ax = b )
#include <stdio.h>
#include <stdlib.h>
#include "mkl_lapack.h"

int main()
{
    int    n, nrhs, lda, ldb, info, idxi, idxj, *ipiv;
    double *A, *b;

    n    = 3;
    nrhs = 1;
    lda  = n;
    ldb  = n;

    ipiv = (int*)malloc(sizeof(int)*n);
    A    = (double*)malloc(sizeof(double)*n*n);
    b    = (double*)malloc(sizeof(double)*n);

    /* Note: this A is singular in exact arithmetic (row 1 + row 3 = 2 * row 2),
       so inspect info after the call before trusting the solution. */
    for( idxi = 0; idxi < n; idxi++ )
        for( idxj = 0; idxj < n; idxj++ )
            *(A+idxi+idxj*lda) = (double)(idxi+1) + idxj;

    *(b+0) = 6;
    *(b+1) = 9;
    *(b+2) = 12;

    dgesv(&n, &nrhs, A, &lda, ipiv, b, &ldb, &info);
    printf("info = %d\n", info);

    free(ipiv); free(A); free(b);
    return 0;
}
?gesv
Input parameters
- nrhs: The number of right-hand sides, that is, the number of columns of the matrix b.
Output parameters
- A: Overwritten by the factors L and U from the factorization A = P*L*U.
- b: Overwritten by the solution matrix x.
Ex5. Result
[Figure: screenshot of the program output]
Ex6. Solve the Eigen Equation ( Ax = lambda*x )
#include <stdlib.h>
#include "mkl_lapack.h"

int main()
{
    int    n, lda, lwork, ldvl, ldvr, info;
    double *wr, *wi, *A, *work, *vl, *vr;
    char   jobvl, jobvr;

    n     = 3;
    lda   = n;
    ldvl  = 1;      /* vl is not referenced when jobvl = 'N' */
    ldvr  = n;
    lwork = 4*n;    /* 4*n, not 3*n, because eigenvectors are requested */
    jobvl = 'N';
    jobvr = 'V';

    A    = (double*)malloc(sizeof(double)*n*n);
    wr   = (double*)malloc(sizeof(double)*n);
    wi   = (double*)malloc(sizeof(double)*n);
    vl   = (double*)malloc(sizeof(double)*ldvl*n);
    vr   = (double*)malloc(sizeof(double)*ldvr*n);
    work = (double*)malloc(sizeof(double)*lwork);

    /* Symmetric tridiagonal test matrix, stored column by column */
    *(A+0) =  2; *(A+1) = -1; *(A+2) =  0;
    *(A+3) = -1; *(A+4) =  2; *(A+5) = -1;
    *(A+6) =  0; *(A+7) = -1; *(A+8) =  2;

    /* wr and wi are already pointers, so they are passed without & */
    dgeev(&jobvl, &jobvr, &n, A, &lda, wr, wi,
          vl, &ldvl, vr, &ldvr, work, &lwork, &info);

    free(A); free(wr); free(wi); free(vl); free(vr); free(work);
    return 0;
}
?geev
Input parameters
- jobvl: If jobvl = 'N', the left eigenvectors of A are not computed.
         If jobvl = 'V', the left eigenvectors of A are computed.
- jobvr: If jobvr = 'N', the right eigenvectors of A are not computed.
         If jobvr = 'V', the right eigenvectors of A are computed.
- work: A workspace array of dimension max(1, lwork).
- lwork: The dimension of the array work.
         lwork >= max(1,3n); if eigenvectors are computed, lwork >= max(1,4n) (for real flavors).
- ldvl, ldvr: The leading dimensions of the output arrays vl and vr, respectively.
Output parameters
- wr, wi: Contain the real and imaginary parts, respectively, of the computed eigenvalues.
- vl, vr: If jobvl = 'V', the left eigenvectors u(j) are stored one after another
          in the columns of vl, in the same order as their eigenvalues.
          If jobvl = 'N', vl is not referenced.
          If the j-th eigenvalue is real, then u(j) = vl(:,j), the j-th column of vl.
          vr holds the right eigenvectors in the same way when jobvr = 'V'.
- info: If info = 0, the execution is successful.
        If info = -i, the i-th parameter had an illegal value.
        If info = i, the QR algorithm failed to compute all the eigenvalues,
        and no eigenvectors have been computed.
Ex6. Result
[Figure: screenshot of the program output]
LAPACK Computational Routines
- Orthogonal Factorizations (QR, QZ)
- Singular Value Decomposition
- Symmetric Eigenvalue Problems
- Generalized Symmetric-Definite Eigenvalue Problems
- Nonsymmetric Eigenvalue Problems
- Generalized Nonsymmetric Eigenvalue Problems
- Generalized Singular Value Decomposition
LAPACK Driver Routines
- Linear Least Squares (LLS) Problems
- Generalized LLS Problems
- Symmetric Eigenproblems
- Nonsymmetric Eigenproblems
- Singular Value Decomposition
- Generalized Symmetric Definite Eigenproblems
- Generalized Nonsymmetric Eigenproblems
Five Stage Usage Model for Computing FFT
1. Allocate a fresh descriptor for the problem with a call to the
   DftiCreateDescriptor function (precision, rank, sizes, scaling factor, ...).
2. Optionally adjust the descriptor configuration with a call to the
   DftiSetValue function.
3. Commit the descriptor with a call to the DftiCommitDescriptor function.
4. Compute the transform with a call to the
   DftiComputeForward/DftiComputeBackward function.
5. Deallocate the descriptor with a call to the DftiFreeDescriptor function.
Ex7. Three-Dimensional Complex FFT
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "mkl_dfti.h"

/* 1000^3 complex doubles need about 16 GB per array; reduce for a quick test */
#define m 1000
#define n 1000
#define k 1000

typedef struct
{
    double re;
    double im;
} mkl_complex;

int main()
{
    int         idxi, idxj, idxk;
    double      backward_scale;
    MKL_LONG    status, length[3];
    size_t      total = (size_t)m*n*k;     /* number of grid points */
    mkl_complex *vec_src, *vec_tmp, *vec_dst;
    DFTI_DESCRIPTOR_HANDLE handle = 0;

    vec_src = (mkl_complex*)malloc(sizeof(mkl_complex)*total);
    vec_tmp = (mkl_complex*)malloc(sizeof(mkl_complex)*total);
    vec_dst = (mkl_complex*)malloc(sizeof(mkl_complex)*total);

    length[0] = m;
    length[1] = n;
    length[2] = k;

    memset(vec_src, 0, sizeof(mkl_complex)*total);
    memset(vec_tmp, 0, sizeof(mkl_complex)*total);
    memset(vec_dst, 0, sizeof(mkl_complex)*total);

    for(idxi=0; idxi<m; idxi++)
        for(idxj=0; idxj<n; idxj++)
            for(idxk=0; idxk<k; idxk++)
            {
                /* row-major layout: the last dimension varies fastest */
                size_t pos = ((size_t)idxi*n + idxj)*k + idxk;
                vec_src[pos].re = 1.0;
                vec_src[pos].im = 0.0;
            }

    status = DftiCreateDescriptor( &handle, DFTI_DOUBLE,
                                   DFTI_COMPLEX, 3, length );
    if(status && !DftiErrorClass(status, DFTI_NO_ERROR))
    {
        printf("Error : %s\n", DftiErrorMessage(status));
        printf("TEST FAILED : DftiCreateDescriptor(&handle, ...)\n");
    }

    status = DftiSetValue( handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE );
    status = DftiCommitDescriptor( handle );
    status = DftiComputeForward( handle, vec_src, vec_tmp );

    /* scale the backward transform so that vec_dst reproduces vec_src */
    backward_scale = 1.0/((double)m*n*k);
    status = DftiSetValue( handle, DFTI_BACKWARD_SCALE, backward_scale );
    status = DftiCommitDescriptor( handle );
    status = DftiComputeBackward( handle, vec_tmp, vec_dst );

    status = DftiFreeDescriptor( &handle );
    free(vec_src); free(vec_tmp); free(vec_dst);
    return 0;
}
FFT Functions
Function Name          Operation
DftiCreateDescriptor   Allocates memory for the descriptor data structure and preliminarily initializes it.
DftiCommitDescriptor   Performs all initialization for the actual FFT computation.
DftiCopyDescriptor     Copies an existing descriptor.
DftiFreeDescriptor     Frees memory allocated for a descriptor.
DftiComputeForward     Computes the forward FFT.
DftiComputeBackward    Computes the backward FFT.
DftiSetValue           Sets one particular configuration parameter with the specified configuration value.
DftiGetValue           Gets the value of one particular configuration parameter.
Reference
- LLNL tutorials web site (https://computing.llnl.gov/tutorials/parallel_comp/)
- Intel® Math Kernel Library Reference Manual (mklman.pdf)
- Intel® Math Kernel Library for the Linux OS User’s Guide (userguide.pdf)
```