Building Reliable, Secure and Manageable Substation Communications
Dragan Dokic | CCIE, CISSP, MCSE
Introduction - Experience
• Dragan Dokic | President, Summit Energy Tech
• Focus on utility sector
– Infrastructure systems management
– Custom business systems software development
• 16 years of experience in IT industry
• 10 years in utility sector
– Managed network operations for PNGC Power
[Portland, OR] from September 2002 to October 2011
– Presentation focuses on lessons learned in field
network reliability, security and manageability from
this experience
• PNGC’s 2001 – 2011 field network
– 92 office, substation and repeater sites at 11
distribution utilities in Oregon, Idaho
• System mission
– Gather real-time load data 24/7 for power
scheduling operation in Portland
– Support local utility SCADA/AMI/Site Security
PNGC Power WAN – July 2011
Toledo, OR
Boardman, Oregon
Junction City, Oregon
Lewiston, ID
Malta, ID
The Moon
Areas of Focus
Presentation available for download at
in the Events section
Reliability – Network Design
• Keys to success
– Diversity in media
• Combine land lines, fixed wireless [private/public], mobile wireless and satellite
– Diversity in providers
• Local and national
– Dynamic Routing [OSPF]
• Routers exchange knowledge of local network with neighboring routers
• Enterprise grade routers / switches a requirement
• Perfect world configuration
– Private wired/wireless ‘island’ with two Internet gateways
using distinct media and distinct providers
Link cost
• Link cost calculation, Sub A -> Main Office via satellite tunnel: 4
• Link cost calculation, Sub A -> Main Office via 900 MHz + DSL tunnel: 3
• Open Shortest Path: link cost via the satellite tunnel [4] is higher than via the DSL tunnel [3]; therefore, packets will traverse the 900 MHz/DSL tunnel in normal operation
Normal Operation
• Open Shortest Path from Substation A to Main Office
• Open Shortest Path from Substation B to Main Office
Link Down Operation
• If the DSL tunnel is down, packets will traverse the satellite tunnel
– Sub A -> Main Office
– Sub B -> Main Office
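The path selection described above can be sketched as a shortest-path computation over link costs. This is a minimal illustration and not OSPF itself. The node names, the topology (Sub A and Sub B joined by the 900 MHz radio, DSL at Sub B, satellite at Sub A) and the per-link costs are assumptions, chosen so that the totals match the slide: 3 via 900 MHz + DSL and 4 via satellite.

```python
import heapq

# Assumed topology and costs (hypothetical): radio = 1, DSL tunnel = 2, satellite tunnel = 4.
GRAPH = {
    "SubA": {"SubB": 1, "MainOffice": 4},
    "SubB": {"SubA": 1, "MainOffice": 2},
    "MainOffice": {"SubA": 4, "SubB": 2},
}

def shortest_path(graph, src, dst):
    """Dijkstra over {node: {neighbor: cost}}; returns (total_cost, path) or (None, [])."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, link_cost in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + link_cost, nxt, path + [nxt]))
    return None, []

def without_link(graph, a, b):
    """Return a copy of the graph with the link between a and b removed (link down)."""
    return {
        node: {nbr: c for nbr, c in nbrs.items() if {node, nbr} != {a, b}}
        for node, nbrs in graph.items()
    }

if __name__ == "__main__":
    print(shortest_path(GRAPH, "SubA", "MainOffice"))
    dsl_down = without_link(GRAPH, "SubB", "MainOffice")
    print(shortest_path(dsl_down, "SubA", "MainOffice"))
    print(shortest_path(dsl_down, "SubB", "MainOffice"))
```

With these assumed costs, removing the DSL link makes both substations fall back to the satellite tunnel, which is the behavior the link-down slides describe.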
Security – Overview
• Wireless link encryption
• Function specific VLANs
• No default routes!
Wireless Link Encryption
• Media device level [e.g. Radio, Modem]
• Routing device level [e.g. Cisco 891 router]
• End device level [e.g. DIGI TS4 port server]
At what level to secure data?
Security - Wireless Link Encryption
• Most secure option?
– Use all three if management overhead is not an issue
• Most efficient but secure enough option?
– Use routing device site-to-site VPN capabilities
– Advantages:
• Support for best commercially available security
technologies [e.g., AES-256]
• Comprehensive change logging capabilities
• Standardized configuration throughout the system [less
management overhead]
Security – Function Specific VLANs
• Define VLANs per business function
– SCADA, AMI, Security System, Wireless, VOIP, Network Mgmt.
• Firewall traffic between VLANs on need-to-access basis
– E.g., personnel attached to the substation wireless VLAN can access
documentation stored on a server at the main office, but are prevented
from accessing recloser controls in the SCADA VLAN
• Reliability advantages
– Non-critical VLANs [e.g. AMI, security] can be shut down
automatically/remotely if link quality is too poor to carry all
traffic, but good enough to carry SCADA
[Figures: one VLAN per business function; high-speed link outage]
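The two ideas on this slide, need-to-access firewalling between VLANs and shedding non-critical VLANs on a poor link, can be sketched as follows. Everything specific here is an assumption for illustration: the permitted flow pairs, the "OfficeDocs" zone name, the shedding order and the bandwidth figures are hypothetical and do not come from the PNGC configuration.

```python
# Hypothetical need-to-access policy: only these (source, destination) pairs may cross VLANs.
ALLOWED_FLOWS = {
    ("NetworkMgmt", "SCADA"),
    ("NetworkMgmt", "AMI"),
    ("Wireless", "OfficeDocs"),
}

def is_permitted(src_vlan, dst_vlan):
    """Default deny between VLANs; traffic inside a single VLAN is not firewalled here."""
    return src_vlan == dst_vlan or (src_vlan, dst_vlan) in ALLOWED_FLOWS

# Assumed shedding order (most critical first) and assumed bandwidth needs in kbps.
PRIORITY = ["SCADA", "NetworkMgmt", "AMI", "Security"]
DEMAND_KBPS = {"SCADA": 20, "NetworkMgmt": 10, "AMI": 200, "Security": 500}

def vlans_to_keep(available_kbps, demand_kbps=DEMAND_KBPS, priority=PRIORITY):
    """Keep VLANs in priority order until the next one no longer fits the link."""
    kept, used = [], 0
    for vlan in priority:
        if used + demand_kbps[vlan] > available_kbps:
            break  # everything from here down is shut off
        kept.append(vlan)
        used += demand_kbps[vlan]
    return kept

if __name__ == "__main__":
    print(is_permitted("Wireless", "SCADA"))  # wireless users never reach SCADA
    print(vlans_to_keep(50))                  # degraded link: only the critical VLANs remain
```

Stopping at the first VLAN that does not fit keeps the ordering strict, so a small low-priority VLAN never takes capacity while a higher-priority one is shut down.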
Security – No Default Route!
• Do not use default routes through service provider-supplied gateways
• Define a single host route back to the main office, then
establish default route through VPN tunnel
• This is the most effective method to prevent attacks
sourced from the Internet
• Always use in conjunction with regular firewall configuration
lists [not a substitute!]
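A rough way to see why this works is to model the substation router's table with longest-prefix matching. This is a sketch under stated assumptions: the address 198.51.100.10 (from a documentation range) stands in for the main office VPN endpoint, and the next-hop labels are hypothetical names, not real interface identifiers.

```python
import ipaddress

# Assumed routing table for a substation router (all values hypothetical).
ROUTES = [
    # Single host route: only the main office VPN endpoint is reachable via the provider.
    (ipaddress.ip_network("198.51.100.10/32"), "provider-gateway"),
    # Default route points into the VPN tunnel, not at the provider gateway.
    (ipaddress.ip_network("0.0.0.0/0"), "vpn-tunnel"),
]

def next_hop(routes, destination):
    """Pick the most specific (longest-prefix) route that contains the destination."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]

if __name__ == "__main__":
    print(next_hop(ROUTES, "198.51.100.10"))  # tunnel endpoint goes out the provider link
    print(next_hop(ROUTES, "203.0.113.77"))   # any other Internet host goes into the tunnel
```

In this model, a reply to an arbitrary Internet host is never sent straight out through the provider gateway, so a session opened from the Internet against the substation router cannot complete over that path.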
Less secure
More secure
Manageability - Overview
• Tools – network management systems
• Addressing – developing a scheme
• Watchdog system – preventing lockout
Manageability – Tools
• Network Management Systems [NMS]
• Protocols used
• SNMP, Syslog, ICMP, HTTP
• Applications
• Paessler PRTG, Solarwinds Kiwi Syslog
Manageability – Tools
• How to collect data? Push vs. Pull
– Pull: Poll devices using SNMP/HTTP/ICMP at regular
intervals [e.g., every
– Push: Devices send data per defined event triggers
– SNMP traps
– Syslog messages
• What data to collect?
– Availability [ping]
– Network utilization
– Input voltages
– RSSI [radio link quality]
Manageability – Tools
• Pull example:
– 5 minute SNMP poll of UPS for input voltage
– If voltage stays below the 108 VAC threshold for longer than
5 minutes, the NMS triggers an alert [e-mail, text message,
event log]
– But what if voltage drops for 2 minutes only in between
polls? You may not know it even happened.
• Push comes to the rescue:
– UPS sends SNMP trap to NMS as soon as voltage drops
below 108VAC
– Alert is triggered by NMS when trap is received
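The difference between the two approaches can be sketched with a small simulation over one voltage reading per minute. It simplifies the slide's example: the pull side alerts on any polled sample below the threshold, and the push side stands in for an SNMP trap by recording the minute at which the voltage first crosses below 108 VAC. The sample data is made up.

```python
THRESHOLD_VAC = 108

def pull_alerts(voltages, interval_min=5):
    """Pull: the NMS only sees the readings at the minutes it polls."""
    return [
        minute
        for minute, volts in enumerate(voltages)
        if minute % interval_min == 0 and volts < THRESHOLD_VAC
    ]

def push_alerts(voltages):
    """Push: the device reports the minute the voltage first drops below the threshold."""
    alerts = []
    below = False
    for minute, volts in enumerate(voltages):
        if volts < THRESHOLD_VAC and not below:
            alerts.append(minute)  # models one trap per dip
        below = volts < THRESHOLD_VAC
    return alerts

# Hypothetical readings: steady 120 VAC with a 2-minute sag at minutes 6 and 7.
readings = [120] * 15
readings[6] = readings[7] = 100

if __name__ == "__main__":
    print(pull_alerts(readings))  # polls at minutes 0, 5, 10 all read 120 VAC
    print(push_alerts(readings))  # the sag is reported when it starts
```

The sag falls entirely between the polls at minutes 5 and 10, so polling alone never observes it, while the event-driven report does.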
Paessler PRTG – Screen shot
Solarwinds Kiwi Syslog – Screen shot
Manageability – Addressing
• Develop consistent scheme to use system wide
• Recommended private range:
First octet: same for entire system
Second octet: site ID [e.g. 8=Springfield Sub]
Third octet: business function ID [e.g., 4=AMI]
Fourth octet: device itself [e.g., Collector #1]
[Figure: address layout and subnet mask. 1st octet 'fixed'; 2nd octet = site ID; 3rd octet = business function ID; 4th octet = device]
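The scheme above can be sketched with Python's standard `ipaddress` module. Two values are assumptions because they are not stated in the slide text: a first octet of 10 (suggested by the 10.x examples on the next slide) and a /24 mask for each site/function subnet. The device number 1 for Collector #1 is also illustrative.

```python
import ipaddress

def device_address(site_id, function_id, device_id, first_octet=10):
    """Build first_octet.site.function.device; first_octet=10 is an assumption."""
    for name, value in (("site_id", site_id), ("function_id", function_id),
                        ("device_id", device_id)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must fit in one octet, got {value}")
    return ipaddress.IPv4Address(f"{first_octet}.{site_id}.{function_id}.{device_id}")

def function_subnet(site_id, function_id, first_octet=10):
    """Subnet for one business function at one site; the /24 mask is an assumption."""
    return ipaddress.IPv4Network(f"{first_octet}.{site_id}.{function_id}.0/24")

if __name__ == "__main__":
    # Site 8 = Springfield Sub, function 4 = AMI, device 1 = Collector #1
    collector = device_address(8, 4, 1)
    print(collector, collector in function_subnet(8, 4))
```

Because site and function are readable directly from the address, an operator can tell what a device is and where it sits from a log line or an alert without consulting a lookup table.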
Manageability – Addressing
• Large network?
– Group sites by region using second octet
– Allows for address summarization if needed.
• Example:
– Eastern division region:
• 10.64-127.0.0
• Summary address:
– Western division region:
• 10.128-191.0.0
• Summary address:
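The summary prefix for each region can be computed instead of worked out by hand. The sketch below uses the standard library's `ipaddress.summarize_address_range` and reads "10.64-127.0.0" as every address whose second octet is 64 through 127; that reading is an assumption about the slide's notation.

```python
import ipaddress

def summarize_region(first_octet, low_site, high_site):
    """Return the smallest list of prefixes covering second octets low_site..high_site."""
    first = ipaddress.IPv4Address(f"{first_octet}.{low_site}.0.0")
    last = ipaddress.IPv4Address(f"{first_octet}.{high_site}.255.255")
    return [str(net) for net in ipaddress.summarize_address_range(first, last)]

if __name__ == "__main__":
    print("Eastern division:", summarize_region(10, 64, 127))
    print("Western division:", summarize_region(10, 128, 191))
```

Both regions collapse to a single prefix because each block of 64 site IDs starts on a multiple of 64. A region that does not start and end on such a boundary would return several prefixes instead of one.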
Manageability – Watchdog System
• General concept
– Reboot key remote communications devices if
connectivity to central site is interrupted
• Benefit
– Prevent unnecessary site visits due to
• Operator error
• Device lock-up [e.g., buggy firmware, heat issues]
Manageability – Watchdog System
• Hardware requirements:
– SNMP-capable switched PDU with task scheduling and
delayed power cycling command capabilities
– Example: APC AP7900 8-port 15A PDU
• Software capability requirements:
– Centralized command override mechanism using NMS
– Send SNMP ‘Set’ to cancel pending power cycling
Manageability – Watchdog System
• ‘Delayed’ power cycle schedule is defined on PDU:
– Outlets to power cycle: 1, 2 [e.g., radio, router]
– Frequency: 60 minutes
– Command execute delay: 30 minutes
• Network management system running at main office sends an
SNMP message that cancels the pending delayed power cycle
– Frequency: every 5 minutes
• Process
– If delayed power cycle cancel command cannot reach the PDU at least
one time during the 30 minute reboot delay period, outlets 1 and 2
will be power cycled and communication will (hopefully!) be restored
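The timing logic of this process can be sketched as a minute-by-minute simulation. It is a simplified model and does not represent the PDU's actual firmware or SNMP interface: the function name, the `link_up` callback and the rule that a cancel counts only after the power cycle has been armed are all assumptions made for illustration.

```python
def watchdog_reboots(total_minutes, link_up, schedule_every=60, delay=30, cancel_every=5):
    """Return the minutes at which the PDU power cycles its outlets.

    link_up(minute) -> bool says whether a cancel sent by the NMS at that
    minute reaches the PDU.
    """
    reboots = []
    armed_at = None  # minute at which a delayed power cycle was scheduled
    for minute in range(total_minutes + 1):
        if minute % schedule_every == 0:
            armed_at = minute  # PDU arms a delayed power cycle
        if armed_at is None:
            continue
        cancel_arrives = (
            minute > armed_at and minute % cancel_every == 0 and link_up(minute)
        )
        if cancel_arrives:
            armed_at = None  # NMS cancel received: no reboot this cycle
        elif minute - armed_at >= delay:
            reboots.append(minute)  # delay expired with no cancel: power cycle
            armed_at = None
    return reboots

if __name__ == "__main__":
    print(watchdog_reboots(180, lambda m: True))                  # healthy link
    print(watchdog_reboots(180, lambda m: not 58 <= m <= 70))     # short outage
    print(watchdog_reboots(180, lambda m: m < 50))                # link lost at minute 50
```

A short outage is ridden out because a single cancel within the 30-minute delay is enough. Only a sustained loss of connectivity lets the delay expire and triggers the power cycle.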
Thank you!
