Dr Rebecca Hartman-Baker, iVEC Supercomputing Development

ivec.org
Team Quokka: Australia’s First Foray into the Student Cluster Challenge
Rebecca Hartman-Baker, Ph.D.
Senior Supercomputing Specialist & SCC Team Coach
[email protected]
Happy Cheeks, Ph.D.
Supercomputing Quokka & Team Mascot
http://www.facebook.com/happycheeksquokka
Outline
I. Student Cluster Competition
II. Building & Training SCC Team
III. Cluster Design
IV. Science Applications
V. Competition
I. STUDENT CLUSTER COMPETITION
iVEC Student Cluster Competition Training Team, 2013
Student Cluster Competition: Introduction
• 48-hour non-stop computing showdown held
at annual supercomputing conference
• Held in Denver, Colorado, USA in 2013
• Teams of undergraduates design, build, run
apps on cluster
• Power constraint ~3000W
(standard track)
• $2500 equipment constraint
(commodity track)
SCC: Introduction
• Key cluster architecture rules:
• Machine must contain only publicly available
components
• All components must be turned on at all times
(low-watt idle okay)
• When running applications, must not exceed 26 A
@ 120 V power draw (~3120 W)
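The power limit above is plain arithmetic (P = I x V). A minimal sketch of the check a team might run against a meter reading; the function names and the 2950 W reading are hypothetical, only the 26 A and 120 V figures come from the slide:

```python
# Power cap stated on the slide: 26 A at 120 V while applications run.
MAX_CURRENT_A = 26.0
LINE_VOLTAGE_V = 120.0


def power_cap_watts(current_a: float = MAX_CURRENT_A,
                    voltage_v: float = LINE_VOLTAGE_V) -> float:
    """Return the power cap in watts (P = I * V)."""
    return current_a * voltage_v


def headroom_watts(measured_draw_w: float) -> float:
    """Return watts left under the cap; a negative value means over the limit."""
    return power_cap_watts() - measured_draw_w


if __name__ == "__main__":
    print(power_cap_watts())       # 3120.0
    print(headroom_watts(2950.0))  # 170.0 (2950 W is a made-up reading)
```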
SCC: Introduction
• Applications: HPCC + 3 known apps, 1
mystery app
• 2013 apps: GraphLab, WRF, NEMO5, + Flying
Snakes (OpenFOAM)
• Teams judged on throughput of work
provided by judges, plus interview to
determine depth of understanding
SCC: History
• First held at 2007 Supercomputing
conference, SC07
• Brainchild of Brent Gorda, now
General Manager, Intel High Performance Data Division
(formerly Whamcloud)
• Now three Student Cluster
Competitions/year:
• China (April)
• ISC (June)
• SC (November)
SCC: iVEC’s Motivation
• Increase computational science literacy in
WA
• Develop future users/employees
• Train professional workforce for local industry
• Exposure for iVEC in international HPC
community
• It sounded like fun!
II. BUILDING & TRAINING THE TEAM
“Compiling,” http://xkcd.com/303/
Starting iVEC’s SCC Team
• Began by raising interest at iVEC partner
universities
• Contacted iVEC directors at universities, got leads
for whom to contact
• First interest came from ECU (3 students)
• Other unis followed
Sponsorship for SCC Team
• SGI
• NVIDIA
• Allinea
• Rio Tinto
Sponsorship for SCC Team
• Discussed hardware sponsorship with Cray &
SGI
• SGI first to commit, hardware + travel money
• Solicited financial sponsorship from mining
companies in WA
• Rio Tinto committed to sponsor 3 students
• Obtained software & hardware sponsorship
from Allinea & NVIDIA
Team Hardware Sponsorship
• Most important sponsorship to get
• SGI very enthusiastic about sponsoring team
• Put best person in Asia-Pacific region on
project
• Todd helped team:
• Select machine architecture
• Determine software stack
• Set the machine up in Perth & at competition
Team Hardware Sponsorship
• When team decided to use GPUs in cluster,
NVIDIA loaned us 8 K20X GPUs
• Received free of charge through academic
program (had to return after competition)
Team Travel Sponsorship
• Travel to competition very expensive
• Budget: $3000/student
• SGI committed enough for half of team
• Solicited support from mining companies in
WA, successful with Rio Tinto
Team Software Sponsorship
• I “won” license for Allinea software
• I asked Allinea to sponsor license for team
instead
• Allinea provided license for MAP & DDT
products
• MAP: simple profiling tool, very useful to novice
users
• DDT: parallel debugger, intuitive GUI
Team Composition
• Breakdown:
• 3 Computer Science/Games majors from ECU
• 2 Physics/Computer Engineering majors from
UWA
• 1 Geophysics major from Curtin
• Each student assigned areas of expertise (1
primary, 2 secondary)
• At beginning of training, I facilitated students’
development of team norms (standards of
behavior) that proved very effective
• No conflicts, no inappropriate behavior
III. CLUSTER DESIGN
“Crazy Straws,” http://xkcd.com/1095/
Cluster Design
• In designing a cluster, the following must
generally be considered:
• Cost
• Space
• Utility
• Performance
• Power Consumption
Cluster Design
• Architecture choices:
• All CPU nodes
• All accelerator nodes
• Hybrid CPU/accelerator
• Accelerator choices:
• NVIDIA Tesla
• Intel Xeon Phi
• Combination (?)
Cluster Architecture
• 2 Pyramid nodes
• 4 x NVIDIA K20X
• 2 x Intel Ivy Bridge 12-core, 64 GB
• 8 Hollister nodes
• 2 x Intel Ivy Bridge 12-core, 64 GB
• InfiniBand interconnect
• Stay within power budget by running only GPUs or CPUs
Cluster Architecture
• Chose CPU/GPU hybrid architecture
• For good LINPACK performance
• Potential accelerated mystery app
• Maximize flops per watt performance
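The flops-per-watt argument can be made concrete with spec-sheet arithmetic. The sketch below is illustrative only: the clock speed (2.7 GHz), the 8 double-precision flops per cycle per core, and both TDP values are assumptions not stated on the slides, and the K20X peak (1310 GFLOPS double precision) is taken from NVIDIA's published figure. Measured efficiency under a real workload will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A compute device described by peak rate and nominal power (both assumed)."""
    name: str
    peak_gflops: float
    tdp_watts: float

    @property
    def gflops_per_watt(self) -> float:
        return self.peak_gflops / self.tdp_watts


def cpu_peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak = cores * clock (GHz) * flops per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


# Assumed figures for illustration, not measurements from the team's cluster.
CPU = Device("12-core Ivy Bridge (assumed 2.7 GHz, 130 W)",
             cpu_peak_gflops(cores=12, clock_ghz=2.7, flops_per_cycle=8),
             tdp_watts=130.0)
GPU = Device("NVIDIA K20X (assumed 1310 GFLOPS, 235 W)",
             peak_gflops=1310.0, tdp_watts=235.0)

if __name__ == "__main__":
    for dev in (CPU, GPU):
        print(f"{dev.name}: {dev.gflops_per_watt:.2f} GFLOPS/W")
```

Under these assumptions the GPU comes out well ahead per watt, which is consistent with the slide's reasoning for a hybrid design.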
Cluster Software
• CentOS 6
• Ceph filesystem
• Open-source software
• GCC
• OpenMPI
• Numerical libraries: FFTW, PETSc, NetCDF, HDF5
• Proprietary software: Intel compiler/MKL, Allinea DDT/MAP, PGI compiler, CUDA
Cluster Software: Ceph
• Each node has > 1 TB of disk; need a parallel
filesystem
• Could use Lustre, but risk losing
data if one node fails
• Ceph: distributed object store and file system
designed to provide excellent performance,
reliability and scalability
Cluster Software: Ceph
• Ceph: object-storage system with traditional
file-system interface and POSIX semantics
• Looks like regular filesystem
• Directly mounted in recent CentOS kernel
• Underneath, Ceph keeps several copies of
files balanced across hosts
• Metadata server cluster can expand/contract to fit
file system
• Rebalance dynamically to distribute data (weighted
distribution if disks differ in size)
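To illustrate the idea of several copies spread across hosts in proportion to disk size, here is a toy placement function. It uses weighted rendezvous hashing, which is NOT Ceph's actual algorithm (Ceph uses CRUSH); the host names, disk sizes, and function names are hypothetical.

```python
import hashlib
import math
from collections import Counter
from typing import Dict, List


def _unit_hash(key: str) -> float:
    """Map a string to a deterministic float strictly inside (0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    x = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (x + 0.5) / 2**52


def place(obj: str, disk_tb: Dict[str, float], copies: int = 2) -> List[str]:
    """Pick `copies` distinct hosts for `obj`, favouring hosts with larger disks."""
    def score(host: str) -> float:
        # Weighted rendezvous hashing: a larger weight gives a larger expected score.
        return -disk_tb[host] / math.log(_unit_hash(f"{obj}:{host}"))
    return sorted(disk_tb, key=score, reverse=True)[:copies]


if __name__ == "__main__":
    hosts = {"node1": 1.0, "node2": 1.0, "node3": 1.0, "node4": 2.0}  # TB, made up
    primaries = Counter(place(f"file-{i}", hosts, copies=1)[0] for i in range(3000))
    print(primaries)  # node4 should hold roughly twice as many objects as the others
```

A useful property of this scheme is that removing a host only moves the objects that had a copy on that host, which mirrors the dynamic rebalancing described above in a very simplified way.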
IV. APPLICATIONS
“TornadoGuard,” https://xkcd.com/937/
Applications
• High-Performance LINPACK
• GraphLab
• NEMO5
• WRF
• Mystery Application – Flying Snakes!
HIGH-PERFORMANCE LINPACK
Linpack History
• Linear algebra library written in Fortran
• Benchmarking added in late 1970s to
estimate calculation times
• Initial releases used fixed matrix sizes 100
and 1000
• Arbitrary problem size support added in 1991
• LAPACK has replaced the Linpack library for
linear algebra, but the Linpack
benchmark is still used today
HPL Standard
• Released in 2000; rewritten in C and
optimized for parallel computing
• Uses MPI and BLAS
• The standard benchmark used to measure
supercomputer performance
• Used to rank the Top500 list
• Also used for stress testing and maintenance
analysis
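What the benchmark measures can be shown in miniature: solve a dense linear system, time it, and divide the conventional LINPACK operation count (2/3 n^3 + 2 n^2) by the elapsed time. The sketch below is a pure-Python toy, not HPL itself: it is serial, uses no MPI or BLAS, and the function names and default size are arbitrary choices for illustration.

```python
import random
import time


def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    a = [row[:] for row in a]  # work on copies
    b = b[:]
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            if m != 0.0:
                row_i, row_k = a[i], a[k]
                for j in range(k, n):
                    row_i[j] -= m * row_k[j]
                b[i] -= m * b[k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


def linpack_flops(n):
    """Conventional LINPACK operation count for an n x n solve."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


def benchmark(n=120, seed=0):
    """Return (Mflop/s, max error) for a random system whose solution is all ones."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [sum(row) for row in a]  # right-hand side for x = (1, ..., 1)
    start = time.perf_counter()
    x = lu_solve(a, b)
    elapsed = time.perf_counter() - start
    error = max(abs(xi - 1.0) for xi in x)
    return linpack_flops(n) / elapsed / 1e6, error


if __name__ == "__main__":
    mflops, error = benchmark()
    print(f"{mflops:.1f} Mflop/s, max error {error:.2e}")
```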
CUDA HPL
• CUDA-accelerated Linpack released by
NVIDIA, available on its developer zone
• Uses the GPU instead of the CPU; problem size limited
to GPU memory
• Gaining popularity, as GPUs provide better
flops/watt
• Standard for HPL runs in Student Cluster
Competitions
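The memory limit matters because HPL's matrix of N x N double-precision values needs 8 N^2 bytes. A common rule of thumb is to size N so the matrix fills about 80% of available memory, rounded down to a multiple of the block size. The sketch below applies that rule; the 80% fraction, the block size of 256, and the 6 GB per K20X are assumptions, while the node count and 64 GB per node come from the architecture slide (treated here as GiB).

```python
import math

GIB = 2**30


def hpl_problem_size(total_mem_bytes: int, percent: int = 80, block: int = 256) -> int:
    """Largest N (a multiple of `block`) with 8*N*N <= percent% of total memory."""
    usable = total_mem_bytes * percent // 100   # integer arithmetic, no rounding drift
    n = math.isqrt(usable // 8)                 # 8 bytes per double-precision entry
    return (n // block) * block


if __name__ == "__main__":
    host_mem = 10 * 64 * GIB   # 10 nodes x 64 GB, from the architecture slide
    gpu_mem = 8 * 6 * GIB      # 8 K20X x 6 GB each (assumed from NVIDIA's spec)
    print(hpl_problem_size(host_mem))  # 262144
    print(hpl_problem_size(gpu_mem))   # 71680
```

If the problem really must fit in GPU memory, as the slide states, N shrinks by a factor of about 3.7 relative to a run that can use all host memory.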
Student Cluster Competition Linpack Scores
[Figure: Linpack scores in TF (log scale, 0.1 to 10) by year, 2006 to 2014]
GRAPHLAB
GraphLab
• Toolkit for graph algorithms
• Topic Modelling
• Graph Analytics
• Clustering
• Collaborative Filtering
• Graphical Models
• Computer Vision
GraphLab Applications
• PageRank (e.g., Google)
• Image reconstruction
• Recommendation predictions
(e.g., Netflix)
• Image stitching (e.g., panoramic photos)
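As an example of the kind of graph algorithm listed above, here is a minimal PageRank by power iteration. It is plain Python and does not use GraphLab's API; the function name, the damping factor of 0.85, and the four-page example graph are illustrative assumptions.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Return {page: rank} for a graph given as {page: [pages it links to]}."""
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u, outs in links.items():
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new[v] += share
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}  # made-up graph
    for page, score in sorted(pagerank(web).items(), key=lambda kv: -kv[1]):
        print(f"{page}: {score:.4f}")
```

In this example page "c" ranks highest because three pages link to it, and "d" ranks lowest because nothing links to it.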
NEMO5
NEMO5
• Stands for NanoElectronics MOdeling
Tools
• Free for academic use, not exactly open
source
• Evolved to current form over 15 years
• Developed by Purdue University
NEMO5
• NEMO5 designed to model at the
atomic scale
• Simulation of nanostructure
properties: strain relaxation, phonon
modes, electronic structure, self-consistent Schrödinger-Poisson
calculations, and quantum transport
• E.g., modelling Quantum Dots
WRF
WRF
• Next-generation mesoscale numerical
weather prediction system
• Used for both operational forecasting and
atmospheric research, throughout the world
MYSTERY APPLICATION: FLYING SNAKES!
Mystery Application
• Unknown application, presented at
competition
• To prepare, compiled and ran one new code
each week during 2nd semester
• Gained experience with different build
methods (e.g., editing makefiles, make.inc,
cmake, autoconf)
• Gained familiarity with common errors
encountered while compiling, and how to fix
them
Flying Snakes!
• Aerodynamics of flying snakes
• Flying snakes inhabit rainforest canopy in Southeast
Asia & jump between tree branches, gliding to next
branch
• Case of fluid dynamics: behavior of air as
snake passes through, development of
vortices, eddies, etc.
• Modeled with OpenFOAM, open-source
computational fluid dynamics toolbox
V. COMPETITION
“Standards,” http://xkcd.com/927/
Competition
• Arrived in Denver Thursday before
competition, to acclimate to 15-hour time
difference
• Visited National Renewable Energy Laboratory to
see supercomputers
• Began setting up on Saturday before
competition
• Competition time: Monday evening –
Wednesday evening
• Wednesday evening: party at Casa Bonita
• Thursday: Pros vs. amateurs competition
• Friday: back home
Scenes from the Trip
[Photo slides: Team Booth, Casa Bonita, Taking Down the Booth]
Results
• Official champion: University of Texas (last
year’s champions too)
• Other rankings not given, but we were middle
of pack
• Entire team (including coach) learned a lot!
• Students have potential leads for jobs &
further study
• Plans to coach another team for 2014
Bibliography
• CentOS, http://www.centos.org
• Ceph, http://ceph.com/ceph-storage/file-system/
• High-Performance Linpack & HPCC,
http://icl.cs.utk.edu/hpcc/
• GraphLab, http://graphlab.org
• NEMO5,
https://engineering.purdue.edu/gekcogrp/software-projects/nemo5/
• WRF, http://www.wrf-model.org/index.php
• Krishnan et al., Lift and wakes of flying snakes,
http://arxiv.org/pdf/1309.2969v1.pdf
• OpenFOAM, http://www.openfoam.com
For More Information
• iVEC, http://www.ivec.org
• Student Cluster Competition,
http://www.studentclustercomp.com
Email: [email protected]