Computational Grids for High Energy Physics Research

Report
Grid and its applications
Oxana Smirnova
Lund / CERN
NorduGrid/LCG/ATLAS
Reykjavik, November 17, 2004
Outlook





Grid vision and history
Grid necessity: demanding applications
Information Technology developments
Grid solutions
Development and deployment projects
2004-11-17
Grid vision and history
From distributed resources …
Present situation:
• cross-national projects
• users and resources in different domains
• separate access to each resource
… to World Wide Grid
Future:
• multinational projects
• resource location is irrelevant
• “plug-n-play” access to all the resources
Grid history: users’ perspective

Metacomputing is a decades old idea
– Previous attempts, including Condor, failed to appeal to users
  • Progress in commercial hardware has always been faster than in Open Source-like middleware → easier to buy a bigger supercomputer/cluster
– Globus Toolkit 1 was heading into oblivion in early 2000
Physicists in Europe and USA realized that the time (Y2K) for metacomputing was ripe
– MONARC project (CERN) developed a multi-tiered model for distributed analysis of data
– Particle Physics Data Grid (PPDG) and GriPhyN projects by US physicists started using Grid technologies
– Globus was picked up by the CERN-led EU DataGrid (EDG) project
– EDG failed to satisfy user demands; many simpler solutions appeared, triggered by physicists:
  • NorduGrid (Northern Europe and others)
  • Grid3 (USA)
  • gLite (EU, a prototype)
Driven by High Energy Physics
Large Hadron Collider:
World’s biggest accelerator at CERN
Collisions at LHC
ATLAS: one of 4 detectors at LHC
ATLAS: preparing for data taking
ATLAS simulation flow
[Figure: ATLAS simulation data flow; persistency: Athena-POOL. Stages: event generation (Pythia), detector simulation (Geant4), digitization (optionally with pile-up of min. bias events), event mixing, reconstruction. Data products: Events (HepMC), Hits + MCTruth, Digits (RDO) + MCTruth, Bytestream Raw Digits, ESD. Volume of data for 10^7 events, as labelled in the figure: 5 TB, 24 TB, ~2 TB, 18 TB, 75 TB.]
Piling up events
Characteristics of HEP computing
Event independence
– Data from each collision is processed independently: trivial
parallelism
– Mass of independent problems with no information exchange
Massive data storage
– Modest event size: 1 – 10 MB (although some are up to 1-2 GB)
– Total is very large – Petabytes for each experiment
Mostly read only
– Data never changed after recording to tertiary storage
– But is read often! A tape is mounted at CERN every second!
Resilience rather than ultimate reliability
– Individual components should not bring down the whole system
– Reschedule jobs on failed equipment
Modest floating point needs
– HEP computations involve decision making rather than calculation
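A minimal sketch of the two properties above that shape HEP job handling: events are processed independently, and a failed component leads to rescheduling rather than to a failed run. All names here (`reconstruct`, `broken_worker`, `run_with_reschedule`, `process_all`) are hypothetical and not taken from any experiment's software; local threads stand in for Grid nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(event):
    """Stand-in for per-event processing; uses data from this event only."""
    return {"id": event["id"], "n_hits": len(event["hits"])}


def broken_worker(event):
    """Stand-in for a failed node: every job sent here fails."""
    raise RuntimeError("node down")


def run_with_reschedule(event, workers, max_attempts=3):
    """Try the event on successive workers until one of them succeeds."""
    last_error = None
    for attempt in range(max_attempts):
        worker = workers[(event["id"] + attempt) % len(workers)]
        try:
            return worker(event)
        except RuntimeError as err:
            last_error = err  # remember the failure and try the next worker
    raise RuntimeError(f"event {event['id']} failed on all attempts") from last_error


def process_all(events, workers, max_parallel=4):
    """Trivial parallelism: no information is exchanged between events."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(lambda ev: run_with_reschedule(ev, workers), events))


events = [{"id": i, "hits": list(range(i))} for i in range(6)]
results = process_all(events, [broken_worker, reconstruct])
```

Half of the events are first sent to the broken worker and then rescheduled; the run still completes, and results come back in event order.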
Very demanding tasks

Data-intensive tasks
– Large datasets, large files
– Lengthy processing times
– Large memory consumption
– High throughput is necessary
Very distributed user base
– Distributed computing resources of modest size
– Produced and processed data are hence distributed, too
– Issues of coordination, synchronization and authorization are outstanding
HEP users are by no means unique in their demands, but they are first, they are many, and they badly need the Grid
Other applications

Medical and biomedical:
– Image processing (digital X-ray image analysis)
– Simulation for radiation therapy
– Protein folding
Chemistry:
– Quantum
– Organic
– Polymer modelling
– Combustion
Climate studies
Space sciences
Physics:
– High Energy and other accelerator physics
– Theoretical physics, lattice calculations of all sorts
– Neutrino physics
Genomics
Material sciences
Even warfare
IT perspective
IT progress: some facts
• Network vs. computer performance:
– Computer speed doubles every 18 months
– Network speed doubles every 9 months
• 1986 to 2000:
– Computers: 500 times faster
– Networks: 340000 times faster
• 2001 to 2010 (projected):
– Computers: 60 times faster
– Networks: 4000 times faster
Bottom line: CPUs are fast enough; networks are very fast. Gotta make use of it!
Slide adapted from the Globus Alliance
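The quoted factors can be roughly checked against the doubling times with a short calculation. This is a sketch under a simplifying assumption: pure exponential growth over the interval, with the interval length taken as end year minus start year. The function name `speedup` is hypothetical.

```python
def speedup(years, doubling_months):
    """Growth factor after `years` if performance doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


for label, years in [("1986 to 2000", 14), ("2001 to 2010", 9)]:
    computers = speedup(years, 18)  # computer speed doubles every 18 months
    networks = speedup(years, 9)    # network speed doubles every 9 months
    print(f"{label}: computers x{computers:,.0f}, networks x{networks:,.0f}")
```

Under this model the 2001 to 2010 interval gives factors of 64 and 4096, close to the projected 60 and 4000. The 1986 to 2000 interval gives about 645 and about 416,000, the same order of magnitude as the quoted 500 and 340000. Halving the doubling time squares the growth factor, which is why networks pull so far ahead.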
The Grid Paradigm
• Distributed supercomputer, based on commodity PCs and fast WAN
• Access to the great variety of resources by a single pass: a certificate
• A possibility to manage distributed data in a synchronous manner
• A new commodity
[Figure: the Grid as a commodity. Labels: Water, Drainage, Electricity, Radio/TV, Internet, The Grid; Workstation, PC Farm, Supercomputer, Grid]
Wider scope: a Grid System
A Grid system is a collection
of distributed resources
connected by a network
Examples of Distributed Resources:
• Desktop
• Handheld hosts
• Devices with embedded processing resources such as digital cameras and phones
• Tera-scale supercomputers
Slide adapted from A.Grimshaw
Characteristics of a generic Grid system
• Numerous resources
• Ownership by mutually distrustful organizations & individuals
• Connected by heterogeneous, multi-level networks
• Different security requirements & policies required
• Different resource management policies
• Potentially faulty resources
• Geographically separated
• Resources are heterogeneous
Slide adapted from A.Grimshaw
Grid paradigm is overloaded
Global Grids
• Multiple enterprises, owners, platforms, domains, file systems, locations, and security policies
• Legion, Avaki, Globus
Enterprise “Grids”
• Single enterprise; multiple owners, platforms, domains, file systems, locations, and security policies
• SUN SGE EE, Platform Multicluster
Cluster & Departmental “Grids”
• Single owner, platform, domain, file system and location
• SUN SGE, Platform LSF, PBS
Desktop Cycle Aggregation
• Desktop only
• United Devices, Entropia, Data Synapse
WARNING! Not everything that has “G” in the name is Grid! (SGE, Oracle 10g, Condor-G etc)
Graph borrowed from A.Grimshaw
Implementations
Globus: the toolkit provider
– The first and only provider of a Grid toolkit (libraries and API)
– An academic research project in USA and now Europe
– Free software, open code
– Has supported Grid testbeds since the late 90’s
Grid features:
• Heterogeneous
• Non-interactive
• Single logon
• Optimized file transfer protocol
• Information schema
To do:
• Global resource management
• Data management
• User management, accounting
The Globus Toolkit v2 in One Slide
• Grid protocols (GSI, GRAM, …) enable resource sharing within virtual organizations; toolkit provides reference implementation (highlighted in the figure = Globus Toolkit 2 services)
[Figure: the User authenticates and creates a proxy credential via GSI (Grid Security Infrastructure). GRAM (Grid Resource Allocation & Management): reliable remote invocation of the Gatekeeper (factory), which creates user processes #1 and #2, each with its own proxy; processes register with the Reporter (registry + discovery). MDS-2 (Monitoring and Discovery Service): the Reporter is linked to the GIIS (Grid Information Index Server, discovery) by soft state registration and enquiry. Other GSI-authenticated remote service requests go to other services (e.g. GridFTP).]
• Protocols (and APIs) enable other tools and services for membership, discovery, data management, workflow, …
Slide adapted from the Globus Alliance
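The "soft state registration" label in the slide above names a general technique: a registration is kept only for a limited time and disappears unless the resource refreshes it, so a dead resource drops out of the index without explicit deregistration. The sketch below illustrates the idea only. It is not the MDS-2 protocol or API, and the class `SoftStateRegistry`, its methods, and the cluster names are hypothetical.

```python
import time


class SoftStateRegistry:
    """Toy index: a registration expires unless refreshed within `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # name -> (info, expiry time)

    def register(self, name, info):
        """Register or refresh a resource; a refresh is just a repeated registration."""
        self._entries[name] = (info, self.clock() + self.ttl)

    def enquire(self):
        """Return live registrations, dropping those that have timed out."""
        now = self.clock()
        self._entries = {n: e for n, e in self._entries.items() if e[1] > now}
        return {n: info for n, (info, _) in self._entries.items()}


# A fake clock makes the expiry behaviour easy to follow.
now = [0.0]
index = SoftStateRegistry(ttl=30, clock=lambda: now[0])
index.register("cluster-a", {"free_cpus": 12})
index.register("cluster-b", {"free_cpus": 4})
now[0] = 20.0
index.register("cluster-a", {"free_cpus": 10})  # cluster-a refreshes, cluster-b stays silent
now[0] = 40.0
live = index.enquire()  # cluster-b has expired by now
```

The index never needs to be told that a resource has failed; silence is enough.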
Globus-Based Grid Tools & Applications
• Data Grids
– Distributed management of large quantities of data: physics, astronomy, engineering
• High-throughput computing
– Coordinated use of many computers
• Collaborative environments
– Authentication, resource discovery, and resource access
• Portals
– Thin client access to remote resources & services
• And combinations of the above
Slide adapted from the Globus Alliance
Some architectural thoughts
[Figure: architecture sketch. Components: User Interface (three shown), Workload manager, Information Server, Data location server, Storage (two shown)]
Some Grid projects
(past and present)
[Figure: US projects and European projects]
Slide adapted from Les Robertson
Some Grid projects timeline
[Figure: timeline for 2000 to 2010 showing LCG, EDG, GriPhyN, PPDG, EGEE, VDT, CROSSGRID, DataTAG, NorduGrid, and Globus (GT2, GT3, GT4)]
Other Grid-related projects do not develop Open Source-like (i.e., free) software/middleware, as of today
– Most notably, Legion/Avaki: Globus competitor, widely used by businesses
– Entropia: like SETI@home
– IBM, Platform: Globus-based
– Sun Grid Engine EE: enterprise Grids
What Grid can do today
• Simplest Grid: users access distributed resources using a single certificate
• More complex Grid: users’ tasks are distributed between different resources by a broker
• Even more complex Grid: not only tasks, but massive amounts of data are also distributed and managed (not quite there yet, only prototypes)
[Figure: Broker(s) between users and resources, with storage elements (SE) and mass storage systems (MSS); unresolved parts marked "???"]
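A minimal sketch of what "distributed between different resources by a broker" can mean in practice: greedy matchmaking that prefers a resource already holding the job's input data, then the one with the most free CPUs. This is an illustration under stated assumptions, not the algorithm of any particular Grid middleware; the classes `Resource` and `Job`, the function `broker`, and the site and dataset names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Resource:
    name: str
    free_cpus: int
    datasets: set = field(default_factory=set)


@dataclass
class Job:
    name: str
    cpus: int
    dataset: Optional[str] = None  # input data, if the job needs any


def broker(jobs, resources):
    """Assign each job to a resource, or to None if nothing can run it."""
    plan = {}
    for job in jobs:
        candidates = [r for r in resources if r.free_cpus >= job.cpus]
        if not candidates:
            plan[job.name] = None
            continue
        # Prefer data locality first, then the largest number of free CPUs.
        best = max(candidates, key=lambda r: (job.dataset in r.datasets, r.free_cpus))
        best.free_cpus -= job.cpus  # reserve the CPUs for this job
        plan[job.name] = best.name
    return plan


resources = [
    Resource("site-a", 8, {"hits"}),
    Resource("site-b", 32, {"raw"}),
    Resource("site-c", 4),
]
jobs = [Job("digitize", 4, "hits"), Job("reco", 16, "raw"), Job("simulate", 8), Job("huge", 64)]
plan = broker(jobs, resources)
```

Here `digitize` goes to `site-a` because its input is there, `reco` and `simulate` go to `site-b`, and `huge` stays unassigned. A real broker would also need the information system, authorization and data management that the slide lists as only partly available.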
What is missing
• Common policies, or ways of mutually respecting them
• Grid accounting systems and Grid economy
• Serious security solutions; role-based access control
• Full-blown distributed data management systems
• Tools and methods for system-wide deployment of application environments
• STANDARDS!
The emergence of Open Grid standards
[Figure: functionality and standardization over time, 1990 to 2010. Labels: Custom solutions; Internet standards; Globus Toolkit (de facto standard, single implementation); Web services, etc.; OGSA, WSRF (real standards, multiple implementations); Managed shared virtual systems; Computer science research]
Slide adapted from the Globus Alliance
The Grid or many Grids?

Globus Toolkit 2 is a basis for a great many Grid solutions
– Which use some common tools and utilities: GSI, GridFTP
– But they also differ a lot, architecturally and technologically
– There are several non-interoperable GT2-based Grid systems!
  • No satisfactory ready-made solutions → developers invent their own
  • Being financed from different sources, developers and users are not always encouraged to adopt a rival project’s solution
  • Instead of “How should I use Grid?”, users ask “Which Grid should I use?”
Grid standards body: Global Grid Forum (GGF)
– No effective standards since 2001
Globus introduced the “Open Grid Services Architecture” (OGSA)
– Heavily oriented towards commercial implementations
– Not yet used by any of the development projects
– Globus Toolkit 3 is released
New step by Globus: “Web Services Resource Framework” (WSRF)
– Perhaps the first set of standards endorsed by GGF
– We face Globus Toolkit 4 very soon…
Meanwhile: ATLAS Production System uses 3 Grids
[Figure: ATLAS production system. Central services: prodDB, AMI, Don Quijote (dms). Windmill supervisors (super) talk over jabber or soap to executors (exe): Lexor for LCG, Dulcinea for NorduGrid (NG), Capone for Grid3 (G3), and an executor for LSF. LCG, NG and Grid3 each have their own RLS.]
Conclusion
• HEP community stirred a world-wide Grid interest
– Next big thing after the dot-com?..
• Despite a slow start and much hype, some real work is under way
– Rather, the next big thing after the WWW!
• Still, no complete solution exists
– Data management?
– Accounting?
– Security?
– Standardization?
• With courage and patience, we should go Grid
