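I/O Device Virtualization
Dept. of CSIE / Graduate Institute of Networking and Multimedia, NEWS Lab

Advanced Information Technology
- Virtualization (V12N) (2)
Chih-Wen Hsueh (薛智文)
[email protected]
http://www.csie.ntu.edu.tw/~cwhsueh/
100 Fall (2011), Nov 4, Fri 678, DTH 104
National Taiwan University
Department of Computer Science and Information Engineering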
Outline
Introduction
Xen
Architecture
Hypercall
CPU Virtualization
Memory Virtualization
I/O Device Virtualization
Hardware Virtual Machine
Benchmark
Domain 1
Summary
How to Virtualize?
Full Virtualization: binary translation
Para-Virtualization: hypercall
Hardware-Assisted Virtualization: trap and emulate (Intel VT-x & AMD SVM)
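The trap-and-emulate idea can be sketched as a toy model. Everything here is a hypothetical illustration (the two-instruction "privileged" set, the `VMM` class, the `virtual_if` flag); it shows only the control flow, not how Xen, VT-x, or SVM actually implement it.

```python
# Toy model of trap-and-emulate; all names are illustrative assumptions.
PRIVILEGED = {"cli", "sti"}  # instructions the guest may not run directly


class Trap(Exception):
    """Raised when deprivileged guest code hits a privileged instruction."""


def run_deprivileged(instr, regs):
    """Run one guest instruction on the 'hardware' in unprivileged mode."""
    if instr in PRIVILEGED:
        raise Trap(instr)          # the hardware traps into the VMM
    if instr == "inc":
        regs["ax"] += 1            # unprivileged work runs natively


class VMM:
    def __init__(self):
        self.regs = {"ax": 0}
        self.virtual_if = True     # the guest's *virtual* interrupt flag
        self.traps = 0

    def emulate(self, instr):
        """Apply the instruction's effect to virtual state only."""
        self.traps += 1
        if instr == "cli":
            self.virtual_if = False
        elif instr == "sti":
            self.virtual_if = True

    def run(self, program):
        for instr in program:
            try:
                run_deprivileged(instr, self.regs)
            except Trap as trap:
                self.emulate(trap.args[0])


vmm = VMM()
vmm.run(["inc", "cli", "inc", "sti", "inc"])
```

The guest's `cli`/`sti` never touch the real interrupt flag; the VMM only updates the guest's virtual copy and then resumes the guest.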
Virtual Machine Monitor (VMM), also called Hypervisor
VM (Virtual Machine) = Guest OS + Virtual Devices
Type I - Hypervisor, e.g. Xen
  VM0, VM1, …, VMN
  Hypervisor
  Hardware
Type II - Hosted VMM, e.g. VMware
  VM0, VM1, …, VMN
  Hosted VMM
  Host Operating System
  Hardware
Hypervisor (VMM) Types
Type I + Microkernel
  Xen (open source, Citrix), Microsoft Hyper-V
Type I + Integrated kernel
  VMware ESX, KVM (Kernel-based Virtual Machine)
Type II (Host OS + Guest OS)
  VMware GSX, VMware Workstation, Microsoft Virtual PC, Microsoft Virtual Server, Sun VirtualBox
Xen Architecture (1/2)
[Figure: Domain 0 and three Domain U guests]
Xen Architecture (2/2)

Compared with ordinary Linux
Linux                  Xen
System Calls           Hypercalls
Signals                Events
Interrupts             Physical + Virtual Interrupts
CPU                    PCPU + VCPU
Filesystem             XenStore
POSIX Shared Memory    Grant Tables/Shared Pages
Hyper Call
System Call (int 0x80) vs. Hyper Call (int 0x82)

// linux/include/asm/unistd.h
#define __NR_restart_syscall 0
#define __NR_exit            1
#define __NR_fork            2
#define __NR_read            3
…

// xen/include/public/xen.h
#define __HYPERVISOR_set_trap_table 0
#define __HYPERVISOR_mmu_update     1
#define __HYPERVISOR_set_gdt        2
#define __HYPERVISOR_stack_switch   3
…

Hypercall path
  Guest OS: HYPERVISOR_sched_op -> int 82h
  Hypervisor: hypercall -> hypercall_table -> do_sched_op -> iret
  Resume Guest OS
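The number-to-handler lookup behind `hypercall_table` can be sketched as a small dispatch table, much like a system call table. This is a hedged toy model: the hypercall numbers are the ones listed from xen.h (written here without the leading underscores), while the handler bodies and the `hypercall()` entry function are placeholders invented for illustration.

```python
# Hypercall numbers as listed from xen/include/public/xen.h.
HYPERVISOR_set_trap_table = 0
HYPERVISOR_mmu_update = 1
HYPERVISOR_set_gdt = 2
HYPERVISOR_stack_switch = 3

ENOSYS = 38  # Linux errno value for "function not implemented"


def do_set_trap_table(*args):
    return "set_trap_table handled"   # placeholder handler (assumption)


def do_mmu_update(*args):
    return "mmu_update handled"       # placeholder handler (assumption)


# Dispatch table indexed by hypercall number, like a syscall table.
hypercall_table = {
    HYPERVISOR_set_trap_table: do_set_trap_table,
    HYPERVISOR_mmu_update: do_mmu_update,
}


def hypercall(nr, *args):
    """Model of the trap entry: look up the handler and call it."""
    handler = hypercall_table.get(nr)
    if handler is None:
        return -ENOSYS                # unknown or unimplemented number
    return handler(*args)
```

In this sketch `hypercall(HYPERVISOR_mmu_update)` reaches `do_mmu_update`, while a number with no registered handler returns `-ENOSYS`.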
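Grant Table
Page mapping & page transferring
The page is the unit of sharing
Grant reference (GR) -> Grant entry
Page mapping
  1. Domain A: create GR
  2. Domain A: send GR to Domain B
  3. Domain B: map page
  4. Domain B: access page
  5. Domain B: unmap page, inform Domain A
  6. Domain A: release GR
Page transferring
  1. Domain B: create GR
  2. Domain B: send GR to Domain A
  3. Domain A: transfer page, inform Domain B
  4. Domain B: receive page
  5. Domain B: release GR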
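The page-mapping flow (create GR, send GR, map, access, unmap, release GR) can be walked through with a toy grant table. This is a sketch under stated assumptions: the `GrantTable` class and its method names are hypothetical, a `bytearray` stands in for a page, and real Xen grant tables are shared memory structures operated on through hypercalls.

```python
class GrantTable:
    """Per-domain table: grant reference -> (page, grantee domain)."""

    def __init__(self):
        self.entries = {}
        self.next_ref = 0

    def create(self, page, grantee):
        ref = self.next_ref
        self.next_ref += 1
        self.entries[ref] = {"page": page, "grantee": grantee, "mapped": False}
        return ref

    def map(self, ref, domain):
        entry = self.entries[ref]
        if entry["grantee"] != domain:
            raise PermissionError("grant was issued to another domain")
        entry["mapped"] = True
        return entry["page"]          # the grantee now shares the same page

    def unmap(self, ref):
        self.entries[ref]["mapped"] = False

    def release(self, ref):
        if self.entries[ref]["mapped"]:
            raise RuntimeError("page is still mapped by the grantee")
        del self.entries[ref]


# Page-mapping flow: A creates, B maps/accesses/unmaps, A releases.
table_a = GrantTable()
page = bytearray(b"hello from A")
gr = table_a.create(page, grantee="B")     # Domain A: create GR (then send it)
shared = table_a.map(gr, domain="B")       # Domain B: map page
shared[0:5] = b"HELLO"                     # Domain B: access page
table_a.unmap(gr)                          # Domain B: unmap page, inform A
table_a.release(gr)                        # Domain A: release GR
```

Because the grantee maps the same page rather than a copy, the write made by Domain B is visible to Domain A.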
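Event Channel
A lightweight signal mechanism
Uses "ports" as identifiers (each with a pending bit and a mask bit)
Four major purposes
  IDC (inter-domain communication): Guest OS to Guest OS
  IPI (inter-processor interrupt): VCPU to VCPU
  vIRQ (virtual IRQ): Hypervisor to VCPU
  pIRQ (physical IRQ): Hardware to VCPU
[Figure: guest OSes with virtual CPUs and virtual memory on top of the hypervisor (scheduling), which runs on hardware: physical CPU, physical memory, Eth0, Eth1, …]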
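The "pending + mask" bookkeeping per port can be sketched with two bitmaps. This is only a toy model with hypothetical names (`EventChannels`, `send`, `set_mask`, `deliverable`, `handle`); it illustrates how a masked port stays pending without being delivered, not Xen's actual data layout.

```python
class EventChannels:
    """Pending and mask bitmaps indexed by port number (toy model)."""

    def __init__(self):
        self.pending = 0   # bit n set: an event on port n is waiting
        self.mask = 0      # bit n set: delivery on port n is suppressed

    def send(self, port):
        self.pending |= 1 << port

    def set_mask(self, port, masked):
        if masked:
            self.mask |= 1 << port
        else:
            self.mask &= ~(1 << port)

    def deliverable(self):
        """Ports that are pending and not masked, lowest first."""
        ready = self.pending & ~self.mask
        return [p for p in range(ready.bit_length()) if (ready >> p) & 1]

    def handle(self, port):
        self.pending &= ~(1 << port)   # clear the pending bit once handled


ch = EventChannels()
ch.set_mask(3, True)
ch.send(1)
ch.send(3)
first = ch.deliverable()      # port 3 is pending but masked
ch.set_mask(3, False)
second = ch.deliverable()     # unmasking makes port 3 deliverable
ch.handle(1)
third = ch.deliverable()
```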
CPU Virtualization
Architecture
[Figure: Apps run on guest OSes; the hypervisor schedules the guests' VCPUs onto PCPUs]
Two scheduling algorithms (non-work-conserving)
  Simple Earliest Deadline First (SEDF)
  Credit
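As a rough illustration of earliest-deadline-first scheduling of VCPUs on one PCPU, here is a minimal sketch. It is not Xen's SEDF implementation: the `(period, slice)` parameters, the tick granularity, and the function name `edf_schedule` are assumptions made for the example. It does show the non-work-conserving behaviour: once every VCPU has used its slice, the PCPU idles.

```python
def edf_schedule(vcpus, ticks):
    """vcpus: {name: (period, slice)}; returns the VCPU run at each tick.

    Each VCPU may run `slice` ticks per `period`; its deadline is the end
    of its current period. None means the PCPU idles.
    """
    remaining = {n: s for n, (p, s) in vcpus.items()}
    deadline = {n: p for n, (p, s) in vcpus.items()}
    timeline = []
    for t in range(ticks):
        for n, (p, s) in vcpus.items():
            if t == deadline[n]:           # new period: refill the slice
                remaining[n] = s
                deadline[n] = t + p
        runnable = [n for n in vcpus if remaining[n] > 0]
        if not runnable:
            timeline.append(None)          # non-work-conserving: idle
            continue
        # Earliest deadline wins; the name breaks ties deterministically.
        chosen = min(runnable, key=lambda n: (deadline[n], n))
        remaining[chosen] -= 1
        timeline.append(chosen)
    return timeline


timeline = edf_schedule({"vcpu0": (4, 1), "vcpu1": (2, 1)}, ticks=4)
```

With these parameters `vcpu1` (shorter period, earlier deadline) runs first in each of its periods, and the last tick is left idle because both slices are used up.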
Interrupt
Physical interrupt
  Destined for the hypervisor or for a guest OS
Virtual interrupt
  Asks a guest OS to do some work
  8 defined for now (max is 24)
[Figure: natively, Device -> IRQn -> PIC -> ISR in the OS; under Xen, Device -> IRQn -> PIC -> Hypervisor -> event -> Guest OS]
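Memory Virtualization (1/2)
Two-level memory (native OS)
  Application - Virtual Memory
  OS - Physical Memory
Three-level memory (with a hypervisor): Virtual, Pseudo-physical, Machine
  Application - Virtual Memory
  Guest OS - Pseudo-Physical Memory
  Hypervisor - Machine Memory
P2M: pseudo-physical to machine translation
M2P: machine to pseudo-physical translation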
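The P2M and M2P tables are inverse mappings between pseudo-physical frame numbers (PFN) and machine frame numbers (MFN). A minimal sketch, assuming 4 KiB pages and made-up frame numbers; the function names are hypothetical:

```python
PAGE_SHIFT = 12  # 4 KiB pages (assumption for this sketch)
OFFSET_MASK = (1 << PAGE_SHIFT) - 1

# Pseudo-physical frame number (PFN) -> machine frame number (MFN).
# The guest sees PFNs 0..3 as contiguous; the MFNs are scattered.
p2m = {0: 0x1A3, 1: 0x07F, 2: 0x2C0, 3: 0x1A4}

# M2P is the inverse mapping.
m2p = {mfn: pfn for pfn, mfn in p2m.items()}


def pseudo_to_machine(paddr):
    """Translate a pseudo-physical address to a machine address."""
    pfn, offset = paddr >> PAGE_SHIFT, paddr & OFFSET_MASK
    return (p2m[pfn] << PAGE_SHIFT) | offset


def machine_to_pseudo(maddr):
    """Translate a machine address back to a pseudo-physical address."""
    mfn, offset = maddr >> PAGE_SHIFT, maddr & OFFSET_MASK
    return (m2p[mfn] << PAGE_SHIFT) | offset
```

Only the frame number is translated; the offset within the page is carried over unchanged in both directions.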
Memory Virtualization (2/2)
168M memory for hypervisor
Area                                                Size
MPT, Machine-to-Physical Translation Table (RO)     16M
Page-Frame Information                              96M
MPT, Machine-to-Physical Translation Table (R/W)    16M
Linear Page Table                                   8M
Shadow Linear Page Table                            8M
Per Domain Mappings                                 8M
Direct Map                                          12M
I/O Remap                                           4M
Figure labels: 0xFC000000, 0xFC400000, Heap, 0xFFFFFFFF
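The listed region sizes can be checked against the 168M total with a few lines of arithmetic. The base-address calculation is an assumption added for illustration: it supposes the reserved area ends at 0xFFFFFFFF, the top of a 32-bit address space.

```python
MiB = 1 << 20

# Region sizes in MiB, as listed in the layout.
regions = {
    "MPT (RO)": 16,
    "Page-Frame Information": 96,
    "MPT (R/W)": 16,
    "Linear Page Table": 8,
    "Shadow Linear Page Table": 8,
    "Per Domain Mappings": 8,
    "Direct Map": 12,
    "I/O Remap": 4,
}

total_mib = sum(regions.values())

# Assumption: the reserved area ends at the top of a 32-bit address space.
TOP = 0xFFFFFFFF
base = TOP + 1 - total_mib * MiB
```

The sizes add up to 168 MiB, and under the stated assumption the area would start at 0xF5800000.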
Memory Virtualization - Translation
4 mechanisms to manipulate page tables
  Paravirtualized page tables
  Writable page tables (only level 1 is writable)
  Shadow page tables
  Hardware-assisted paging (HAP)
[Figure: translation path from Virtual Memory through the Page Table and MMU (VM->PFN) to Pseudo-Physical Memory, then through P2M to Machine Memory; also shown: Page Fault, Shadow Page Table (VM->MFN or VM->P2M), and Second Level Paging (HAP)]
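A shadow page table can be thought of as the guest's page table (virtual page to PFN) composed with the hypervisor's P2M (PFN to MFN), filled in lazily when a lookup misses. The sketch below is a toy model under that assumption; the dictionaries, frame numbers, and `shadow_lookup` are hypothetical and ignore permissions, multi-level tables, and invalidation.

```python
# Guest page table: virtual page number -> pseudo-physical frame (PFN).
guest_pt = {0x10: 0, 0x11: 2}
# Hypervisor's P2M: PFN -> machine frame (MFN). Values are made up.
p2m = {0: 0x1A3, 1: 0x07F, 2: 0x2C0}

shadow_pt = {}   # virtual page number -> MFN, the table the MMU would walk


def shadow_lookup(vpn):
    """Return the MFN for vpn, filling the shadow entry on first use."""
    if vpn not in shadow_pt:              # models the page-fault path
        if vpn not in guest_pt:
            raise KeyError("guest page fault: no mapping for this page")
        shadow_pt[vpn] = p2m[guest_pt[vpn]]
    return shadow_pt[vpn]
```

The first `shadow_lookup(0x11)` takes the miss path and caches the composed entry; later lookups of the same page hit the shadow table directly.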
Memory Virtualization - Shared Info Page
Structure
  MAX: 32 VCPUs
  event channel
  TSC
  memory
  wall clock
Compared with the Start Info Page
               Start Info Page    Shared Info Page
Mapped by      Domain Builder     Guest OS
Information    Static             Dynamically Updated
I/O Device Virtualization
The hypervisor also provides three mechanisms for using devices:
  Emulated Devices
  Paravirtualized Driver
  Pass-through
I/O Device Virtualization - Emulated Devices
Implemented by QEMU (QEMU-DM)
e.g. sound cards (AC97, SB16), etc.
I/O Device Virtualization - Paravirtualized Driver
Split Device Driver Model
An example of sending packets
[Figure: Front-End Driver -> Back-End Driver -> Native Driver]
I/O Device Virtualization - I/O Ring
The ring carries only requests and replies, not the data itself
An example with GR
[Figure: Dom U and Dom 0 exchange GRs over the I/O Channel; also shown: Grant Table, Active Grant Table in the hypervisor, and the Device]
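The point that the ring transfers only requests and replies, while the data travels by grant reference, can be sketched as follows. This is a simplified, hypothetical model: `IORing`, its field names, and the `granted_pages` dictionary are illustrative, and only the request direction is shown.

```python
class IORing:
    """Fixed-size ring with free-running producer/consumer indices."""

    def __init__(self, size=4):
        self.size = size
        self.slots = [None] * size
        self.req_prod = 0    # advanced by the front end
        self.req_cons = 0    # advanced by the back end

    def push_request(self, grant_ref, op):
        if self.req_prod - self.req_cons == self.size:
            return False                       # ring is full
        self.slots[self.req_prod % self.size] = (grant_ref, op)
        self.req_prod += 1
        return True

    def pop_request(self):
        if self.req_cons == self.req_prod:
            return None                        # ring is empty
        req = self.slots[self.req_cons % self.size]
        self.req_cons += 1
        return req


# Data pages live outside the ring and are named by grant reference.
granted_pages = {7: b"packet payload"}

ring = IORing(size=2)
ring.push_request(grant_ref=7, op="send")      # Dom U front end
gr, op = ring.pop_request()                    # Dom 0 back end
payload = granted_pages[gr]                    # back end maps the granted page
```

Each ring slot holds only a small descriptor (here a grant reference and an operation), so the ring stays compact regardless of how much data a request covers.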
I/O Device Virtualization - Pass-Through
The device is passed to a Dom U, which uses it directly
[Figure: Dom 0 and Dom U guests with Native Drivers, virtual CPUs, and virtual memory on top of the hypervisor (scheduling), which runs on hardware: physical CPU, physical memory, Eth0, Eth1, …]
Hardware Virtual Machine
Intel Virtualization Technology
Technology  Description                         Virtualization    Implementation
VT-x        Root/NonRoot, Extended Page Tables  CPU, Memory       Instruction Set
VT-i        As VT-x, for Itanium
VT-d        DMA, Interrupt                      Devices           IOMMU (Chipset)
VT-c        Classify Packets                    Network Devices   VMDq, VMDc
CPU Benchmark (1/2)
8.3%
Average over 100 tests, Deviation: 0.066~0.128%
CPU Benchmark (2/2)
5%
Calculating 32M digits of π.
Hard Disk Drive Benchmark
Network Benchmark (1/2)
59%
Testing Time: 180 seconds, Deviation: 0.12~0.26%.
Network Benchmark (2/2)
Average: 9.82%
Sample Period: 2 seconds
Answers for Big Questions
How fast can virtualization be?
  95+% -> 99.9%
What kinds of applications?
  Well …
What problems might it incur?
  Technical
  Data
  Security
  Business
  Politics
  Globalization (G11N) = Internationalization (I18N) + Localization (L10N)
  …
Summary
Stay hungry to be full [of passion].
Stay foolish to be smart [on absorption].
"When the false is taken for true, the true becomes false." (假若真時真亦假)
Virtualized reality.
Real virtualization.
Virtualized to go anywhere.
Key is the system.
System is the key.
E.g. Virtual Tape Library