Trends and innovations in Storage Fabrics

THREE SCENARIOS THAT DRIVE
ADMINS CRAZY…
AND HOW TO HANDLE THEM
Ing. Thomas Mitrovits, MSc
Sr. Systems Engineer
covered topics
• contaminated infrastructure
• two challenges for long distance considerations (aka frame handling)
• slow drainer
contaminated infrastructure
ClearLink Diagnostics Functional Details
• Test Initiator (switch port) and Test Responder (device port)
• D_Port test consists of the following four steps:
• Electrical loopback test (E-WRAP)
• Optical loopback test (O-WRAP)
• Link traffic test
• Link latency and distance measurement
Validating Configurations
• Use ClearLink diagnostic port (D_Port) mode to test all 16 Gbps-capable ISLs, ICLs, and Brocade HBA connections
• Complete optical, electrical and link saturation testing to ensure reliable connections
• Pre-test and validate the entire SAN fabric at full line rate and with full FOS features enabled using the integrated flow generator
• Emulate a 16 Gbps SAN without having any 16 Gbps hosts, targets or SAN testers
[Figure: emulated hosts H1, H2 and targets T1, T2]
D-Port Test Results via CLI
sw0:root> portdporttest --show 10/39
D-Port Information:
===================
Slot:                           10
Port:                           39
Remote WWNN:                    10:00:00:05:33:7e:69:c4
Remote port:                    24
Mode:                           Manual
No. of test frames:             12 Million
Duration of test (HH:MM):       00:01
Test frame size:                1024 Bytes
Payload Pattern:                JTSPAT
FEC (enabled/option/active):    Yes/No/No
CR (enabled/option/active):     No/No/No
Start time:                     Mon Jan 16 05:57:51 2012
End time:                       Mon Jan 16 05:58:56 2012
Status:                         FAILED
================================================================================
Test                  Start time  Result  EST(HH:MM:SS)  Comments
================================================================================
Electrical loopback   05:57:52    PASSED  --------       ----------
Optical loopback      05:58:06    PASSED  --------       ----------
Link traffic test     05:58:13    FAILED  --------       See failure report
================================================================================
Roundtrip link latency:         934 nano-seconds
Estimated cable distance:       1 meters
Buffers required:               1 (for 1024 byte frames at 16Gbps speed)
Failure report:
Errors detected (local):        CRC, Bad_EOF, Enc_out
Errors detected (remote):       CRC, Bad_EOF
Please use portstatsshow and porterrshow for more details on the above errors.
D-Port test results show pass/fail as well as the reason for failure to accelerate troubleshooting
Long Distance (aka Buffer-Credit-Handling)
Important Numbers … Numbers … Numbers
• 5 µs latency per km of fiber
• 25 km maximum distance with 16 Gbit FC SFPs
• 125 m is the maximum distance with 16 Gbit/s and OM4 cabling
• 250 m is the length of an FC frame @ 16 Gbit/s
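To make the first number concrete, here is a minimal sketch that turns the 5 µs per km figure from this slide into one-way and round-trip delays. The function names are illustrative only, and the constant is the rounded value quoted above, not a measured one.

```python
# Propagation delay in fiber, using the rounded slide value of 5 µs per km.
LATENCY_US_PER_KM = 5  # taken from the slide; real fiber varies slightly


def one_way_latency_us(distance_km: float) -> float:
    """Time for a signal to travel the given fiber distance, in µs."""
    return distance_km * LATENCY_US_PER_KM


def round_trip_latency_us(distance_km: float) -> float:
    """Time for a frame to arrive and its acknowledgement to return, in µs."""
    return 2 * one_way_latency_us(distance_km)


for km in (10, 25, 100):
    print(f"{km:>3} km: one way {one_way_latency_us(km):.0f} µs, "
          f"round trip {round_trip_latency_us(km):.0f} µs")
```

The round-trip value is the one that matters for flow control, because a sender only gets a credit back after the acknowledgement has made the return journey.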
1st challenge: the physics
[Figure: Fiber Optics Transmission Window; attenuation (dB/km) vs. wavelength (nm), with windows at 850, 1300 and 1550 nm]
• 1300 nm: 0.5 dB/km
• 1550 nm: 0.2 dB/km
Available SFP+
Optical Small Form-factor Pluggable (SFP+) transceivers are available in short- and long-wavelength types:
• 16G SWL: Brocade 57-0000088-01
• 16G LWL, 10 km: Brocade 57-0000089-01
• 16G ELWL, 25 km: Brocade 57-1000262-01
Optical cable length for Multimode fiber
Optical cable length for Fibre Channel (maximum length in m)

Protocol  Encoding  Line Rate  OM1 62.5µ     OM2 50µ       OM3 50µ        OM4 50µ
(FC)                (Gb/sec)   (200 MHz·km)  (500 MHz·km)  (2000 MHz·km)  (4700 MHz·km)
                               Multimode     Multimode     Multimode      Multimode
1G        8b10b     1.0625     300           500           860            -
2G        8b10b     2.125      150           300           500            -
4G        8b10b     4.25       70            150           380            400
8G        8b10b     8.5        21            50            150            200
10G       64b66b    10.53      33            82            300            300
16G       64b66b    14.025     10.5          25            100            125
SFP specifications
[Figure: SFP optical power scale comparing Possible Budget and Real Budget; marked levels: -24 dBm, -20.5 dBm, -15 dBm, -9.5 dBm, -5 dBm, -3 dBm]
Power Budget = (Worst Case Launch Power) – (Worst Case Receiver Sensitivity) + (Connector Attenuation)
FCIP - extension without limits ?
• use of existing IP wide area network (WAN) infrastructure
to connect Fibre Channel SANs.
• No implicit distance limit.
• The TCP connections ensure in-order delivery of FC frames and lossless
transmission.
• All Fibre Channel targets and initiators are unaware of the presence of the
IP WAN.
2nd challenge: Flow Control
Flow Control
Credit exchange at Fabric Login
• Host says, “I can receive 40 frames.”
• Storage says, “I can receive 16 frames.”
• Switch says, “I can receive 8 frames.”
Buffer Credits
Credit accounting after Fabric Login
• Switch thinks, “OK, I can send 40 frames that way and 16 frames this way, but I have to think about it.” (Credit Count: 40 toward the host, 16 toward the storage)
• Host thinks, “Good, I can send 8 frames without thinking about it.” (Credit Count: 8)
• Storage thinks, “Good, I can send 8 frames without thinking about it.” (Credit Count: 8)
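The credit accounting described on this slide can be sketched as a simple counter on the sending side: it starts at the value the receiver advertised at login, drops by one per frame sent, and rises by one for each R_RDY that comes back. The class and method names below are hypothetical and only illustrate the bookkeeping, not any switch or HBA implementation.

```python
class CreditCounter:
    """Sender-side view of buffer-to-buffer credits for one link direction."""

    def __init__(self, granted: int) -> None:
        # Credits advertised by the receiving port at login.
        self.available = granted

    def can_send(self) -> bool:
        return self.available > 0

    def send_frame(self) -> None:
        # Each frame sent consumes one receive buffer at the far end.
        if not self.can_send():
            raise RuntimeError("no credit left: wait for an R_RDY")
        self.available -= 1

    def receive_r_rdy(self) -> None:
        # The receiver has freed a buffer and returned one credit.
        self.available += 1


# Host side of the slide's example: the switch advertised 8 buffers.
host_to_switch = CreditCounter(granted=8)
sent = 0
while host_to_switch.can_send():
    host_to_switch.send_frame()
    sent += 1
print(f"frames sent before stalling: {sent}")

host_to_switch.receive_r_rdy()
print(f"can send again after one R_RDY: {host_to_switch.can_send()}")
```

Once the counter reaches zero the sender has to stop, however much bandwidth the link has, which is why long links need more credits.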
Buffer Credits
[Animation, 17 slides: a link of eight 1 km segments, each holding one frame. The sender puts Frame 1 to Frame 9 and then Frame A to Frame H onto the link back to back, while ACK 1 to ACK 9 travel back from the receiver. The final slide notes:]
The buffer credit for Frame 1 can be released now!
FCP frame control (i.e. SCSI-FCP Write Command)
[Figure: sequences exchanged between Initiator, Switch and Target]
lost ack_frames (aka performance)
What happens if ack_frames get lost?
• If BB = 0 (i.e. R_RDYs were lost), the link is reset by sending Link Reset (LR) and Link Reset Response (LRR).
• Buffer Credit Recovery automatically recovers lost flow-control buffer credits at the VC level, improving availability.
FCP frame control cont.
• Data Droop
Bandwidth – distance Extension
• Remove Data droop – adding Buffer-to-Buffer Credits
FCP frame control cont.
Bandwidth – distance Extension cont.
• Remove Data Droop => Terminating Buffer-to-Buffer Credits
How “long” is a frame?
Traveling at the speed of light (300,000 km/s in vacuum, approx. 65% of that in fiber), a frame can be very short…
• @ 1G a frame is about 4 km in length
• @ 2G a frame is about 2 km in length
• @ 4G a frame is about 1 km in length
• @ 8G a frame is about 0.5 km in length
• @ 16G a frame is about 250 m in length
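The frame lengths above can be reproduced with a short calculation. This is a sketch under stated assumptions: a full-size frame of 2148 bytes on the wire (2112 bytes of payload plus header, CRC and delimiters), the line rates and encodings from the cable length table earlier in this deck, and light travelling at 65% of 300,000 km/s in fiber. The function name is illustrative.

```python
C_VACUUM_KM_PER_S = 300_000   # speed of light in vacuum, as on the slide
FIBER_FACTOR = 0.65           # approximate fraction of c in fiber, as on the slide
FRAME_BYTES = 2148            # assumption: full-size FC frame including overhead

# Speed name -> (line rate in Gbaud, encoded bits per data bit)
LINE_RATES = {
    "1G": (1.0625, 10 / 8),    # 8b10b
    "2G": (2.125, 10 / 8),
    "4G": (4.25, 10 / 8),
    "8G": (8.5, 10 / 8),
    "16G": (14.025, 66 / 64),  # 64b66b
}


def frame_length_km(line_rate_gbaud: float, encoding_overhead: float,
                    frame_bytes: int = FRAME_BYTES) -> float:
    """Length of fiber occupied by one frame while it is in flight."""
    bits_on_wire = frame_bytes * 8 * encoding_overhead
    seconds = bits_on_wire / (line_rate_gbaud * 1e9)
    return seconds * C_VACUUM_KM_PER_S * FIBER_FACTOR


for name, (rate, overhead) in LINE_RATES.items():
    print(f"@ {name}: about {frame_length_km(rate, overhead):.2f} km")
```

Under these assumptions the results land close to the rounded slide values (roughly 3.9 km at 1G down to roughly 0.25 km at 16G). Smaller frames are proportionally shorter, so more of them fit on the same link.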
How much credit do I need?
Good “Rule of thumb”
Number of credits needed = 1 + (Link speed in Gb/s × Distance in km) / (Frame size in KB)
• Example: 10 km at 4 Gb/s: 1 + (4 × 10) / 2 = 21
• Example: 20 km at 1 Gb/s: 1 + (1 × 20) / 2 = 11
[Figure: percent data rate (10 to 110) vs. distance (4 to 148 km) for a 4 Gb link with 32 credits]
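The rule of thumb above, and the data droop that the chart illustrates, can be sketched in a few lines. The droop model is a simplification introduced here as an assumption: once a link needs more credits than it has, the achievable data rate is taken to fall in proportion to credits available over credits needed. Function names are illustrative.

```python
import math


def credits_needed(speed_gbps: float, distance_km: float,
                   frame_size_kb: float = 2.0) -> int:
    """Rule of thumb: 1 + link speed (Gb/s) * distance (km) / frame size (KB)."""
    return 1 + math.ceil(speed_gbps * distance_km / frame_size_kb)


def percent_data_rate(credits: int, speed_gbps: float, distance_km: float,
                      frame_size_kb: float = 2.0) -> float:
    """Simplified droop model (assumption): rate scales with available credits."""
    needed = credits_needed(speed_gbps, distance_km, frame_size_kb)
    return min(100.0, 100.0 * credits / needed)


# The two examples from the slide.
print(credits_needed(4, 10))   # 10 km at 4 Gb/s
print(credits_needed(1, 20))   # 20 km at 1 Gb/s

# A 4 Gb/s link with 32 credits over increasing distance.
for km in (4, 20, 52, 100, 148):
    print(f"{km:>3} km: {percent_data_rate(32, 4, km):5.1f} % of line rate")
```

With 32 credits at 4 Gb/s this model stays at full rate up to roughly 15 km and then falls off steadily. Smaller average frames raise the number of credits needed for the same distance.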
Performance Optimization on FC Long Distance ISLs
Optimize Performance
• Allow end users to specify either the number of buffers or
average frame size while configuring a long distance port
• Provides more control to users to optimize performance on long distance
links based on traffic pattern
• Two new options for the portCfgLongDistance CLI: one to configure buffers, and another to configure frame size, for LD and LS modes
• Before FOS v7.1, a user could configure only the distance for long-distance static (LS) and dynamic (LD) modes. Buffers were estimated from distance and link speed, assuming full-size frames, which can lead to suboptimal buffer allocation
• With FOS v7.1, a user can directly configure the buffers required for a port of a long-distance link
• Users can also specify the average frame size for a long-distance port. With the frame size option, the number of buffers/credits required for the port is calculated automatically
Performance Optimization on FC Long Distance ISLs
Optimize Performance
• Enhancement to display the average buffer usage and average frame
size in portbuffershow
• Average buffer usage is the real time buffers used by the port while the traffic
is in progress
• Provides better insights into the traffic pattern and also lets users optimize
performance on long distance links by specifying the average frame size
• A new CLI portBufferCalc to calculate the number of buffers required per
port given the distance, speed and frame size
• If a user does not provide any of the options, then current port’s configuration
will be considered to calculate the number of buffers required
• This CLI will give the users an estimate on the number of buffers required for
given distance, speed and frame size
Portcfglongdistance Example
• -distance & -framesize
pluto_134:FID128:root> portcfglongdistance 1/3 LS 1 -distance 100 -framesize 1024
Reserved Buffers = 806
Warning: port (3) may be reserving more credits depending on port speed.
pluto_134:FID128:root> portcfgshow 1/3
Speed Level:              AUTO(HW)
Fill Word(On Active)      1(Arbff-Arbff)
Fill Word(Current)        1(Arbff-Arbff)
AL_PA Offset 13:          OFF
Trunk Port                ON
Long Distance             LS
VC Link Init              ON
Desired Distance          100 Km
Frame Size                1024 Bytes
Reserved Buffers          806
Portcfglongdistance Example
• -buffers
pluto_134:FID128:root> portcfglongdistance 1/3 LS 1 -buffers 400
Reserved Buffers = 406
Warning: port (3) may be reserving more credits depending on port speed.
pluto_134:FID128:root> portcfgshow 1/3
Area Number:              3
Speed Level:              AUTO(HW)
Fill Word(On Active)      1(Arbff-Arbff)
Fill Word(Current)        1(Arbff-Arbff)
AL_PA Offset 13:          OFF
Trunk Port                ON
Long Distance             LS
VC Link Init              ON
Desired Buffers           400
Reserved Buffers          406
Portbuffercalc – New CLI
• This CLI helps to determine the recommended number of buffers for a long-distance port.
• It returns the number of buffers based on the distance, speed and frame size specified.
• CLI
• portBufferCalc [SlotNumber/]PortNumber <-distance distance> <-speed speed> <-framesize framesize>
• Example
pluto_134:FID128:root> portbuffercalc 1/3 -distance 100
406 buffers required for 100km at 8G and framesize of 2048bytes
Buffer Credits
Switch or     Total FC ports        User port    Unreserved buffers with   Unreserved buffers without
blade model   per switch or blade   group size   QoS (per port group)      QoS (per port group)
6510 switch   48                    48           6752                      7712
FC16-32       32                    16           5188                      5408
FC16-48       48                    24           4480                      4960
FC8-64        *** Extended Fabrics are not supported on this blade ***

Maximum distances (km) that can be configured assuming 2112 Byte Frame Size

              2 Gbps   4 Gbps   8 Gbps   10 Gbps   16 Gbps
6510 switch   6752     3376     1688     1350      844
FC16-32       5188     2594     1297     1037      648
FC16-48       4484     2242     1121     896       560
FC8-64        *** Extended Fabrics are not supported on this blade ***
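The maximum distances in the table above appear to follow a simple pattern: with full-size (roughly 2 KB) frames, each buffer covers 2 km at 1 Gbps, so the distance is the buffer count times 2 divided by the link speed, rounded down. This is an observation on the numbers in the table, not a documented FOS formula, and the function name is hypothetical.

```python
def max_distance_km(buffers: int, speed_gbps: int) -> int:
    """Distance coverable with the given buffers, assuming full-size frames.

    Assumption: distance = buffers * 2 / speed, rounded down, which matches
    the table values above.
    """
    return buffers * 2 // speed_gbps


# Buffer counts as used in the distance table above.
platforms = {"6510 switch": 6752, "FC16-32": 5188, "FC16-48": 4484}
for name, buffers in platforms.items():
    row = [max_distance_km(buffers, speed) for speed in (2, 4, 8, 10, 16)]
    print(f"{name:<12} {row}")
```

This is the rule of thumb for credits turned around: instead of asking how many credits a distance needs, it asks how far a fixed pool of credits can reach at a given speed.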
Bottleneck Detection
[Figure sequence: Server 1 to Server 4 reach Storage A to Storage D through FC Switch 1 and FC Switch 2. Highlighted case: 1 Server to 1 Storage with bottleneck (Server 2, FC Switch 2, FC Switch 1, Storage B). In the final picture, bottleneck markers appear on several links of the shared fabric.]
Bottlenecks in general
• “Bottleneck” is an attribute of the transmit direction of a port
• (The transmit direction of) a port is bottlenecked when the offered load at
the port exceeds the throughput at the port
• A port can be a congestion bottleneck or a latency (aka slow-drain)
bottleneck
• Congestion bottleneck: offered load exceeds throughput and throughput is
100%
• Latency bottleneck: offered load exceeds throughput and throughput is less
than 100%
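The two definitions above reduce to a small decision rule. The sketch below expresses offered load and throughput as percentages of link capacity for the transmit direction of a port; the function name and the percentage convention are assumptions made for illustration, and this is not how the switch itself detects bottlenecks.

```python
def classify_port(offered_load_pct: float, throughput_pct: float) -> str:
    """Classify the transmit direction of a port using the slide's definitions.

    Both arguments are percentages of the link's capacity.
    """
    if offered_load_pct <= throughput_pct:
        return "no bottleneck"
    if throughput_pct >= 100.0:
        # More traffic is offered than a fully used link can carry.
        return "congestion bottleneck"
    # Traffic is backing up although the link still has spare capacity,
    # for example because a slow-drain device returns credits late.
    return "latency bottleneck"


print(classify_port(offered_load_pct=60, throughput_pct=60))
print(classify_port(offered_load_pct=130, throughput_pct=100))
print(classify_port(offered_load_pct=70, throughput_pct=40))
```

Note that the latency case says nothing about how low the throughput is, only that it is below both the offered load and the line rate.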
Latency bottlenecks and throughput
• Common misconception that a latency bottleneck (“slow-drain”) must be
doing low throughput
• A latency bottleneck can have any link utilization level from 0% to under
100%
• Not necessarily low utilization/throughput
• Looking at slow drain at high utilizations is not very useful
• Feature is not recommended above 85% link utilization
Handling of trunks
• Congestion bottlenecks
• Entire capacity and entire utilization of the trunk are considered to determine if
it is congested
• Reporting and configuration are done on the master only
• Reporting and configuration follow the master
• Latency bottlenecks
• Any bottlenecked VC on the trunk makes the trunk a bottleneck
• Reporting and configuration are done on the master only
• Reporting and configuration follow the master
RAS: Bottleneck Detection
Maintaining Application Performance
• Identifies and alerts administrators to bottlenecks that can degrade application performance
• Detects bottlenecks caused by slow drain devices
• Bottleneck detection for E_, EX_, and F_Ports
• Accelerates problem detection and diagnosis to minimize performance degradation
[Figure: ISL congestion with Congestion Bottleneck Monitors on the E_Ports and a Latency Bottleneck Monitor on the F_Port of a slow-drain device; legend: Normal Traffic, Congested Traffic]
Supported Configurations
• Condor/Condor2/GoldenEye/GoldenEye2/Condor3 ports
• Latency bottleneck detection on Condor/GoldenEye is an approximation of the
more exact mechanism available on Condor2/GoldenEye2
• Does not catch all latency bottlenecks on Condor/GoldenEye
• Runs on all platforms
• Works the same on switch or Access Gateway
• Feature is allowed on switch F_Port attached to Access Gateway
License, Conflicts
• No license requirement
• No conflicts with other features
Bottleneck detection
• Users can configure bottleneck detection parameters at switch and port level
• Users can view bottleneck statistics for selected ports (up to 32 ports)
• A bottlenecked port is highlighted in the connectivity map and product tree within 10 seconds of the switch detecting the bottleneck
• Users can see which hosts are affected by the bottlenecked port
Bottleneck configuration
• Configure bottleneck parameters
Bottleneck statistics
Topology indications
Show affected hosts
THANK YOU