After a century of relative stability, the current decade is seeing a dramatic pace of change in the generation, distribution, and consumption of electricity. Driven by technology and ambitious policy objectives, this rapid evolution has begun to stretch the electricity system in fundamental ways. In order to support the new capabilities (e.g. distributed generation, demand participation, etc.) being introduced, the electricity system needs a ubiquitous layer of information for situational awareness, coordination, and control. Developments in information technology have enabled the weaving of a “packet grid” that supports information flows independently of the “electron grid”—this combination is generally referred to as the smart grid.
As of today, most utilities still rely on legacy communication networks that were purpose-built for the support of individual applications. An integrated network architecture will be required to meet the evolving needs of the electricity system while maintaining reliability, security, and performance. In order to make effective decisions, power sector leaders will need a basic understanding of smart grid network technologies. This paper outlines some of the fundamentals of communication networks for the smart grid. The content comes from extensive ad hoc reading of Google queries, Wikipedia, and a book written by Budka, Deshpande, and Thottan from Alcatel-Lucent Bell Labs. For a more in-depth understanding, reading this book is strongly recommended.
Linking networked devices to grid applications
An integrated communication network for the electricity system is an essential backbone for many different grid applications. With the roll-out of a new generation of devices such as smart meters, sensors, and controllers, the amount of data that needs to be collected, transmitted, analyzed, and acted upon is growing exponentially. Some devices such as smart meters only need to communicate a few times per hour. Other devices, however, need to communicate more than 100 times per second and require nearly-instantaneous responses to their measurements.
Source: Alcatel-Lucent Bell Labs (modified)
Although modern technologies can easily handle smart grid requirements reliably and securely, most of the network communication systems in place today were deployed by utilities decades ago. Furthermore, new infrastructure deployments are expensive and in many cases the economic benefits will take some time to be realized. Nonetheless, smart grid applications promise to revolutionize the electricity system through better efficiency, reliability, resilience, and the integration of distributed and/or intermittent resources.
How do networks transfer data?
Communication networks transfer information (data) from a source endpoint to a destination endpoint through an interconnected grouping of paths (links) and intersections (nodes). For the most part, data flows between nodes in both directions. Each node is capable of receiving and forwarding data over links with other nodes or endpoints. Each link has a maximum data rate (bits per second), and nodes often hold data temporarily in buffers of limited capacity.
Generic communication network schematic
Source: Alcatel-Lucent Bell Labs
At the most basic level, data is encoded in a stream of “0”s and “1”s (bits)—a string of 8 bits is a byte. Bytes are assembled into “units” of data transfer called protocol data units (PDU). Once sent by an endpoint into a network, a PDU is commonly referred to as a packet. Generally, packets are composed of a header containing overhead information such as the destination address, a payload with the actual data being transferred, and sometimes a trailer containing additional overhead. The format and maximum size of each packet are specified by the network.
Protocol data unit (PDU)
Source: Alcatel-Lucent Bell Labs
Communication networks provide data transfer service between endpoints to support applications across locations. For example, an email server communicates with a computer to send and receive emails. If the amount of data transferred from the source to the destination endpoints exceeds the maximum packet size, the data message is divided (or packetized) into parts that the network can accommodate. The packets are then transferred to the intended destination endpoint and reassembled by following a set of rules.
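The packetizing and reassembly steps described above can be sketched in a few lines of Python. This is only an illustration: the `Packet` fields, the `MAX_PAYLOAD` value of 8 bytes, and the "head-end" destination name are assumptions made for the example, not part of any particular protocol.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 8  # hypothetical maximum payload size, in bytes


@dataclass
class Packet:
    destination: str  # header: address of the destination endpoint
    sequence: int     # header: position of this fragment in the message
    total: int        # header: number of packets that make up the message
    payload: bytes    # the actual data being transferred


def packetize(message: bytes, destination: str,
              max_payload: int = MAX_PAYLOAD) -> list[Packet]:
    """Divide a message into packets the network can accommodate."""
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)] or [b""]
    return [Packet(destination, seq, len(chunks), chunk)
            for seq, chunk in enumerate(chunks)]


def reassemble(packets: list[Packet]) -> bytes:
    """Rebuild the message, even if packets arrived out of order."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    if len(ordered) != ordered[0].total:
        raise ValueError("some packets are missing")
    return b"".join(p.payload for p in ordered)


message = b"meter 42: 1.37 kWh at 14:00"
packets = packetize(message, destination="head-end")
# Simulate out-of-order arrival by reversing the packet list.
restored = reassemble(list(reversed(packets)))
```

The sequence number in each header is what lets the destination put the payloads back in order regardless of arrival order.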
Network protocols establish the rules of communication between endpoints and nodes in the network. These rules define the way endpoints are addressed, how packets are routed through the network, and data processes such as packetizing. Protocols are organized as a hierarchy of layers, each serving a distinct function and existing to serve the layer above it. Data is packetized from one layer to the next, with the payloads from each layer representing fragments that are reassembled into packets of the layer above.
Hierarchy of network layers
Source: Alcatel-Lucent Bell Labs
As per the Open Systems Interconnection (OSI) reference model, there are seven layers of network communication protocols. For each link between two nodes, communication happens at each respective layer. Multiple disparate media can be used for communication on successive links, in which case separate L1 and L2 modules are required to support link connections at each node.
| Layer | Function |
|---|---|
| L7: Application layer | High-level APIs, including resource sharing, remote file access, directory services and virtual terminals |
| L6: Presentation layer | Translation of data between a networking service and an application, including character encoding, data compression and encryption/decryption |
| L5: Session layer | Managing communication sessions, i.e. continuous exchange of information in the form of multiple back-and-forth transmissions between two nodes |
| L4: Transport layer | Reliable transmission of data segments between points on a network, including segmentation, acknowledgement, and multiplexing |
| L3: Network layer | Structuring and managing a multi-node network, including addressing, routing, and traffic control |
| L2: (Data) Link layer | Reliable transmission of data frames between two nodes connected by a physical layer |
| L1: Physical layer | Transmission and reception of raw bit streams over a physical medium |
The L1 physical (PHY) layer is the physical communication medium used for the transfer of data as a stream of “0”s and “1”s. This medium can be a twisted pair of wires (e.g. phone line, CAT 5), coaxial cables (e.g. TV cable), optical fiber, power lines, or air (i.e. wireless). Bits are typically encoded using a modulated sinusoidal signal that is decoded by the receiving device. Transmission of data over optical fiber is done via lasers or light-emitting diodes (LED) intermittently injecting photons that travel through glass fibers over long distances at roughly two-thirds of the speed of light in a vacuum. Power line communication (PLC) uses power lines themselves to transmit information. PLC has traditionally been limited to low data rates, although new standards are being developed to support better performance. Serial communication refers to a set of early legacy standards that are still used in many utility deployments for low data rate applications; these standards transfer bits asynchronously, a few at a time, rather than maintaining a synchronized clock for the sending and receiving of data. For wireless signals (e.g. 3G, LTE), the signal is usually a combination of many frequencies that occupies a specified bandwidth within the wireless communication spectrum.
The L2 link layer is used to establish a link between two nodes and reliably deliver a frame of data between the L2 entities at each node. Each L2 frame is a packet with a header, a payload, and a trailer that includes a checksum, typically a cyclic redundancy check (CRC), for error detection. An acknowledgement procedure allows for frames to be retransmitted if the destination fails to signal successful delivery within a specified time interval. The L2 layer also supports traffic control and sequencing for the transmission of data frames between nodes. Network service providers (NSP) have traditionally offered Frame Relay services allowing endpoints connected to Frame Relay service nodes to establish L2 virtual connections (VC) with each other. The NSP then manages the Frame Relay network and transmits data frames between service nodes on behalf of customers.
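As a rough illustration of how a checksum in the trailer lets the receiver detect a damaged frame, the sketch below appends a CRC-32 value to a frame and verifies it on receipt. The two-byte header, the payload text, and the frame layout are assumptions made for this example; CRC-32 from Python's standard `zlib` module stands in for whatever checksum a given L2 protocol specifies.

```python
import struct
import zlib


def build_frame(header: bytes, payload: bytes) -> bytes:
    """Return header + payload + 4-byte CRC-32 trailer."""
    body = header + payload
    trailer = struct.pack(">I", zlib.crc32(body))
    return body + trailer


def frame_is_valid(frame: bytes) -> bool:
    """Recompute the CRC over the body and compare it with the trailer."""
    body, trailer = frame[:-4], frame[-4:]
    return struct.pack(">I", zlib.crc32(body)) == trailer


frame = build_frame(header=b"\x01\x02", payload=b"voltage=7.2kV")

# Flip a single bit in transit to simulate a transmission error.
corrupted = bytearray(frame)
corrupted[5] ^= 0x01
corrupted = bytes(corrupted)

print(frame_is_valid(frame), frame_is_valid(corrupted))
```

A frame that fails the check would simply not be acknowledged, which triggers retransmission by the sender.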
The L3 network layer controls the operation of packet transmission by assigning addresses to nodes and routing packets along physical paths. The L3 layer also groups clusters of nodes and endpoints into subnetworks (subnets) that share a common address prefix and reach other networks through a router referred to as the subnet’s gateway. L3 traffic is exchanged between subnets and across networks through routers at each node; these routers maintain routing tables that map destination addresses to the next router along the path. To prevent packet non-deliveries due to router or link failures, L3 entities manage router buffers, periodically detect failures, and modify routing tables according to routing protocols (e.g. Open Shortest Path First, or OSPF) that recalculate optimal paths using metrics such as the minimum number of nodes traversed (hops) or the lowest delay. Another important function of the network layer is to provide quality of service (QoS), which we describe later in its own section.
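To make the idea of recalculating an optimal path after a link failure concrete, here is a minimal sketch using Dijkstra's shortest-path algorithm, the algorithm on which OSPF's path calculation is based. The four-router topology and the link delays are invented for illustration, and the sketch leaves out everything else a real routing protocol does (neighbor discovery, flooding of link state, timers).

```python
import heapq


def shortest_path(links, source, destination):
    """Return (total_cost, path) for the cheapest route, or None if unreachable.

    links maps each node to a dict of {neighbor: link_cost}.
    """
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in links.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


def without_link(links, a, b):
    """Copy of the topology with the link between a and b removed."""
    return {n: {m: c for m, c in nbrs.items() if {n, m} != {a, b}}
            for n, nbrs in links.items()}


# Hypothetical topology; costs are link delays in milliseconds.
links = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 1},
    "C": {"A": 2, "D": 2},
    "D": {"B": 1, "C": 2},
}

normal = shortest_path(links, "A", "D")
after_failure = shortest_path(without_link(links, "B", "D"), "A", "D")
print(normal, after_failure)
```

Using hop count instead of delay as the metric only requires setting every link cost to 1.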
The L4 transport layer ensures that messages are delivered without errors, in the correct sequence, and with no duplications or losses. It relieves the higher layer protocols from any concern over the transfer of data across the network. Unlike lower-level connectionless protocols that only require a secure link between immediately adjacent nodes, the transport layer is often supported by a connection-oriented protocol that provides end-to-end connections between endpoints across the network. The L4 layer provides message segmentation for the higher layers by splitting large messages into smaller units that the network layer can handle and reassembling them at the destination. The transport layer also inserts control information in its packet headers for traffic sequencing, provides acknowledgements for reliable end-to-end message delivery, and enables the multiplexing of several message streams. The role of the L4 transport protocol depends on the type of service that is provided by the lower layers (e.g. VC, checksum). If the network layer is unreliable, for example, the transport layer will need to include extensive error detection and recovery.
The L5 session layer supports the establishment, maintenance, and termination of sessions (i.e. connections) between two application processes running on different machines connected to endpoints across the network. The L5 layer functions allow application processes to communicate over the network through security clearances, name recognition, logging, etc.
The L6 presentation layer acts as a translator between the application layer and the network. It formats the data to be presented to the application layer and provides data conversion, data compression, and data encryption.
The L7 application layer serves as the network interface for users and applications. It contains a variety of common functions such as resource sharing, remote file access, directory services, electronic messaging (e.g. email), etc.
For communication between two endpoints, the L1 physical and the L7 application layer are always present. The combination of layers between L1 and L7, which is referred to as the protocol stack, can either span all layers or none at all—in which case the endpoints would be directly connected to each other over a single link. In many network configurations, the L5 session layer and L6 presentation layer are not present. Some protocols also span several layers or include certain features of a given layer while excluding others.
Common smart grid networking protocols
RS-232-C is a L1 standard for serial communication used in data transmission for legacy applications (e.g. power substation). It defines the signals connecting between data terminal equipment (e.g. computer terminal) and data communication equipment (e.g. modem). The standard defines the electrical characteristics, timing, and meaning of signals. Due to the large size of connectors and low data rates, RS-232-C has been replaced in computers by newer standards such as universal serial bus (USB). Nonetheless, the standard is still widely used in industrial and scientific applications.
Time division multiplexing (TDM) is a L1 protocol that allows a single physical medium to carry data on several channels at once by rapidly alternating between different streams of data and recomposing the data for each channel on the receiving end. The most basic unit of the TDM data rate (64 kbps, kilobits per second) is called digital signal zero (DS0). A collection of 24 DS0s is a transmission system one (T1), with a data rate of 1.544 Mbps including framing overhead, and a collection of 28 T1s is a T3, which has a data rate of approximately 45 Mbps. In fiber optic networks, wavelength division multiplexing (WDM) is a L1 protocol that makes use of different colors (wavelengths) of light to create simultaneous transmission channels, thus dramatically increasing the data transmission capacity of a single optical fiber.
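The alternation between channels can be illustrated with a byte-interleaving sketch, together with the arithmetic behind the DS0 and T1 rates. The three sample channels are made up for the example, and the 8 kbps T1 framing overhead is a well-known figure that the text above does not state explicitly.

```python
def multiplex(channels):
    """Interleave equal-length channels, one byte per channel per time slot."""
    return bytes(b for time_slot in zip(*channels) for b in time_slot)


def demultiplex(stream, n_channels):
    """Recover each channel by taking every n-th byte of the shared stream."""
    return [stream[i::n_channels] for i in range(n_channels)]


channels = [b"AAAA", b"BBBB", b"CCCC"]
shared = multiplex(channels)
recovered = demultiplex(shared, len(channels))

# Rate arithmetic in bits per second.
DS0 = 64_000
T1_PAYLOAD = 24 * DS0          # 24 voice channels
T1 = T1_PAYLOAD + 8_000        # plus 8 kbps of framing overhead
T3_PAYLOAD = 28 * T1           # 28 T1s, before T3 framing overhead

print(shared, T1, T3_PAYLOAD)
```

The 28 T1s account for about 43.2 Mbps; the remainder of the roughly 45 Mbps T3 rate is framing and stuffing overhead.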
Synchronous optical network (SONET) is a very high data rate (up to 40 Gbps) L1 TDM protocol originally developed for voice communication over optical fiber networks. Outside of North America, the equivalent standard is Synchronous Digital Hierarchy (SDH). SONET signaling and frame formats have also been implemented for communication over other media (e.g. wireless microwaves). The basic unit of communication for SONET is called synchronous transport signal level 1 (STS-1), with a data rate of 51.84 Mbps. The L2 protocol for SONET is typically packet over SONET/SDH (POS). POS employs the point-to-point protocol (PPP), a common L2 protocol that can simultaneously support multiple L3 network protocols but requires a full-duplex circuit (i.e. one that carries traffic in both directions at the same time) to operate.
Ethernet is a L2 protocol that is very common in local area networks (LAN) and has evolved to support higher data rates over longer distances such as metropolitan area networks (MAN) (e.g. through optical fiber). Typically, the Ethernet protocol is coupled with the media access control (MAC) protocol, a sublayer of the L2 link layer. The MAC sublayer acts as an interface with the L1 physical layer by emulating full-duplex (simultaneous two-way) communication on half-duplex (one direction at a time) channels. In addition to receiving and transmitting frames of data, the MAC sublayer assigns addresses to each device connected on the network. MAC addresses, which are assigned to each device at the time of manufacture and are thus globally unique, are only addressable within a subnetwork by the local router; a network layer is required to transmit data beyond the LAN. Ethernet networks are composed of Ethernet switches connected together in a spanning tree structure that ensures a loop-free topology. In order to transmit data, Ethernet uses a carrier sensing process in which each switch (station) continuously monitors each of its connections for frames being transmitted through the broadcast domain. All stations in the domain receive all transmissions simultaneously (multiple access), but only the station whose MAC address matches that of a transmission retains the frame for processing or forwarding. When more than one station transmits a frame at the same time, the collision is detected and each station reattempts its transmission after a random delay. Ethernet switches use MAC address learning to maintain a table mapping each known MAC address to the port through which it is reached, thus eliminating the need for a broadcast when frames can be directly forwarded.
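MAC address learning is easy to sketch. In the toy switch below, the class name, port numbers, and shortened MAC addresses are all hypothetical; the point is only to show how the table is filled from source addresses and how it replaces flooding once the destination is known.

```python
class LearningSwitch:
    """Toy Ethernet switch that learns which port each MAC address is on."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out on."""
        # Learn: the source address is reachable through the arrival port.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the arrival port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the same segment; nothing to forward.
            return []
        return [out_port]


switch = LearningSwitch(ports=[1, 2, 3])
first = switch.receive(in_port=1, src_mac="aa", dst_mac="bb")   # bb unknown
second = switch.receive(in_port=2, src_mac="bb", dst_mac="aa")  # aa learned
third = switch.receive(in_port=1, src_mac="aa", dst_mac="bb")   # bb learned
print(first, second, third)
```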
Internet Protocol (IP) is the most widely implemented L3 network layer protocol. Although the Internet is based on IP, the protocol is also used on networks unrelated to the Internet. IP was developed to connect any two endpoints that have one or more networks providing a data path between them. IP can operate independently of the underlying physical media, L1 layers, and L2 layers that make up the network interconnections. An IPv4 address is a combination of four 8-bit numbers separated by periods (a.b.c.d where 0 ≤ a, b, c, d ≤ 255), for a total of 2^32 (4.3 billion) addresses globally coordinated by the Internet Assigned Numbers Authority (IANA). Every entity (node or endpoint) that is addressable in an IP network must have a unique IP address. IANA has reserved a subset of approx. 18 million addresses that can be used redundantly by devices within different subnetworks, as long as those devices do not connect directly to the Internet (i.e. without going through a router). Due to the rapid growth of networking over the past two decades, it became evident long ago that IPv4’s 32-bit address space would rapidly become inadequate to support the growing demand for IP addresses, driven for example by the Internet of Things (IoT). To address this concern, IPv6 was developed in the mid-1990s with 128-bit IP addresses that can support up to 2^128 (3.4 × 10^38) unique addresses. Although new network products have begun to support IPv6, deployed networks are taking a long time to migrate. It should be noted that IPv6 is not directly interoperable with IPv4, so the two typically run side by side (dual stack) or are bridged by tunneling and translation mechanisms during the transition.
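The address-space figures can be checked with Python's standard `ipaddress` module. The three private blocks listed are the ones reserved by RFC 1918; the sample meter address is made up for illustration.

```python
import ipaddress

ipv4_total = 2 ** 32    # four 8-bit numbers = 32 bits
ipv6_total = 2 ** 128   # 128-bit addresses

# Private (reusable) IPv4 blocks reserved by RFC 1918.
private_blocks = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
private_total = sum(block.num_addresses for block in private_blocks)

# Hypothetical address of a device inside a utility subnetwork.
meter = ipaddress.ip_address("192.168.10.25")

print(f"IPv4 addresses:    {ipv4_total:,}")
print(f"IPv6 addresses:    {ipv6_total:.1e}")
print(f"Private addresses: {private_total:,}")
print(f"{meter} is private: {meter.is_private}")
```

The private blocks add up to about 17.9 million addresses, a tiny fraction of the IPv4 space, which is why they are reused behind routers rather than assigned globally.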
Transmission Control Protocol (TCP) is a L4 transport layer protocol that is defined over IP and often referred to as TCP over IP (TCP/IP). TCP is a connection-oriented protocol that provides reliable, sequenced, and error-checked delivery of a stream of packets between application endpoints. TCP also handles the segmentation of messages into packets on behalf of the application layer. Due to network traffic and congestion, IP packets can be lost, duplicated, or delivered out of order. Under TCP, the destination endpoint provides acknowledgement for each packet successfully delivered by the source endpoint. The source endpoint keeps track of each packet sent; if a positive acknowledgement is not received within a set time, the packet is retransmitted. TCP is optimized for accurate and reliable delivery of packets rather than timely delivery, thus incurring relatively long delays. Due to its high overhead, TCP is referred to as a heavyweight protocol. Like TCP, User Datagram Protocol (UDP) is a L4 transport layer protocol that is defined over IP. However, UDP has little overhead (lightweight), is connectionless, and does not guarantee the delivery of packets. UDP is often used for applications such as voice data that are highly time-sensitive and would not benefit from the retransmission of lost packets.
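The acknowledge-or-retransmit behaviour can be sketched with a simple stop-and-wait loop. This is a deliberate simplification: real TCP keeps many packets in flight using a sliding window and cumulative acknowledgements. The `lossy_channel` function and its loss pattern are invented so that the example is deterministic.

```python
def send_reliably(packets, channel, max_attempts=5):
    """Send packets one at a time, retransmitting until each is acknowledged.

    channel(seq, attempt) returns True when an acknowledgement arrives
    before the timeout, and False when the packet or its ACK is lost.
    """
    delivered, attempts_used = [], []
    for seq, payload in enumerate(packets):
        for attempt in range(1, max_attempts + 1):
            if channel(seq, attempt):
                delivered.append(payload)
                attempts_used.append(attempt)
                break
        else:
            raise TimeoutError(f"packet {seq} was never acknowledged")
    return delivered, attempts_used


def lossy_channel(seq, attempt):
    """Hypothetical channel: the first attempt of every odd packet is lost."""
    return not (seq % 2 == 1 and attempt == 1)


packets = [b"p0", b"p1", b"p2", b"p3"]
delivered, attempts_used = send_reliably(packets, lossy_channel)
print(delivered, attempts_used)
```

A UDP-style sender would make a single attempt per packet and move on, which is why it suits voice traffic where a late packet is as useless as a lost one.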
MPLS network schematic
Source: Alcatel-Lucent Bell Labs
Multiprotocol Label Switching (MPLS) is often considered a “layer 2.5” protocol, as it performs the functions of L1 and L2 while also including features that are typical of L3. MPLS services allow a single network to satisfy different types of traffic by emulating many L1 and L2 protocols including T1, PPP, Frame Relay, and Ethernet. The MPLS protocol does not have any sublayers of its own (i.e. L1 or L2) and is agnostic to the L1 and L2 layer protocols used for connectivity. An MPLS network is composed of three types of nodes: customer edge routers (CE) are outside endpoints connecting into the network, provider edge routers (PE) are MPLS network endpoints, and provider routers (P) are intermediate nodes for data transmission within the MPLS network. The MPLS packet header, which is inserted between the L2 header and the payload, designates an MPLS label for each packet. MPLS routers maintain forwarding tables for each label and forward packets along predetermined label-switched paths (LSP) according to their label mappings. All packets entering a given path follow the same sequence of routers to their next destination. There can be several different LSPs defined between two endpoints and each LSP is unidirectional: for two-way communication, two LSPs must be defined in opposite directions, and each can go through a different set of routers. LSPs can be defined or removed from the network at any time according to the network’s label distribution protocol (LDP), which also determines network changes due to router or link failures. MPLS provides extended QoS functionality and supports guaranteed minimum data rates for given LSPs through the resource reservation protocol (RSVP). It is expected that MPLS will replace legacy L2 protocols such as Frame Relay.
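Label switching along an LSP can be illustrated with a few lookup tables. Everything in this sketch is hypothetical (router names, label values, the "substation-B" destination); it only shows the general pattern of a label being pushed at the ingress PE, swapped at each P router, and removed at the egress PE.

```python
# Ingress PE: destination -> (label to push, next hop)
INGRESS = {"PE1": {"substation-B": (100, "P1")}}

# P routers: incoming label -> (outgoing label, next hop)
SWAP = {
    "P1": {100: (200, "P2")},
    "P2": {200: (300, "PE2")},
}

# Egress PE: incoming label -> customer edge router that receives the packet
EGRESS = {"PE2": {300: "CE2"}}


def forward(ingress_router, destination):
    """Follow one unidirectional LSP and record the label used on each hop."""
    label, node = INGRESS[ingress_router][destination]
    trace = [(ingress_router, label)]        # label pushed at the ingress
    while node in SWAP:
        out_label, next_hop = SWAP[node][label]
        trace.append((node, out_label))      # label swapped at a P router
        label, node = out_label, next_hop
    customer = EGRESS[node][label]
    trace.append((node, None))               # label removed at the egress
    return trace, customer


trace, customer = forward("PE1", "substation-B")
print(trace, customer)
```

Because each router only looks up a short label rather than a full destination address, every packet placed on this LSP follows the same PE1, P1, P2, PE2 sequence. Return traffic would need its own set of tables for an LSP in the opposite direction.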
Long-Term Evolution (LTE) is a fourth-generation (4G) wireless communication standard developed by 3GPP and commonly used in smartphones. LTE has been widely deployed and can transmit data at high data rates (up to 100 Mbps) with low latency (less than 100 milliseconds). The wireless antenna base station for an LTE network is called the evolved Node B (eNB) and establishes the connection with user equipment through the radio access network (RAN). At L1, LTE uses a modulation scheme called orthogonal frequency-division multiplexing (OFDM). Simultaneous two-way communication is achieved through frequency-division duplexing (FDD), in which each mobile device’s uplink (UL) transmitter and downlink (DL) receiver operate at different carrier frequencies. LTE also supports time-division duplexing (TDD), in which the UL and DL are separated by the allocation of different time slots within the same frequency band. LTE also employs multiple input and multiple output (MIMO) to multiply the capacity of individual radio links by using multiple transmit and receive antennas to exploit multi-path propagation of signals. At the L2 link layer, LTE has a MAC sublayer that performs channel mapping, data handling, and ciphering to prevent the unauthorized acquisition of data. In addition, the LTE L2 includes a radio link control (RLC) sublayer that performs packet segmentation and reassembly, transfer acknowledgement, and flow control between entities. At the L3 layer, LTE supports an end-to-end IP connection and the eNB base station itself is an IP node. The eNB is the bridge between user devices and the evolved packet core (EPC), which serves as the gateway between the wireless network and the broader IP network. The EPC handles packet routing and forwarding, IP address allocation, access equipment authentication, QoS, and more.
Radio frequency mesh (RF-mesh) networks are ad hoc communication networks composed of radio nodes (i.e. wireless routers) organized in a mesh topology. For smart grid applications (e.g. smart meters), the predominant RF-mesh standard is Zigbee, which was developed and specified by an alliance of smart grid companies. RF-mesh networks are supported at L1 and L2 by the IEEE 802.15.4 standard for low data rate wireless networks. Zigbee enhances the 802.15.4 standard by adding L3 networking and security functions required for smart grid applications. RF-mesh works by building a multihop network that dynamically establishes connections between neighboring nodes. When a node connects to the mesh network, it begins to exchange data frames with the other nodes in the network over an air interface and the mesh protocol routes each message to its destination node. Due to constraints in transmission power or physical obstacles, the RF range between nodes may be limited. To support a larger and more reliable network, stand-alone data forwarders (DF) can be deployed to extend RF range and reduce the number of “hops” required. DFs are more effective when mounted at a proper height (e.g. on a power pole) to maintain a clearer line of sight. As with Ethernet communication, a logical mesh must be created so that each node receives data frame broadcasts according to the spanning tree. Each node either retains the frame (if it is addressed to itself) or forwards it toward its destination. Because RF-mesh networks communicate wirelessly over the unlicensed spectrum, other wireless traffic may cause interference with the RF-mesh. Such interference can be mitigated by using frequency hopping spread spectrum technology that spans multiple channels. The usable data rate for RF-mesh depends on the data rate supported by the radio, the number of hops between a node and the destination, the protocols used over the radio broadcast, and packet overheads.
Substation: SCADA, DNP3, and teleprotection
Supervisory Control and Data Acquisition (SCADA) systems have been used by utilities since the 1960s to proactively monitor and control grid operations. The SCADA master control is typically located in the utility’s centralized data and control center (DCC) and connected through the communication network to remote terminal units (RTU) deployed within each of the transmission and distribution substations. At each substation, one or more human–machine interface (HMI) and engineering workstations may be installed for local access to substation functions such as manual control and device configuration.
Primary equipment inside the substation enables SCADA to control substation operations in real time. To protect the circuit from short circuits, ground faults, or other anomalies, switchgear is in place to trip the circuit in case of failures. Switches are operated manually, circuit breakers (CB) trip automatically when there is a failure, and reclosers have the capability to switch back on without manual intervention. Other primary equipment includes current transformers (CT), voltage transformers (VT), and voltage regulators.
Secondary equipment supports control functions and reports the measurement of voltages (V), currents (A), power (W), and reactive power (VAR), as well as the status of various substation systems. Bay controllers and relays (low-power devices) receive control signals from the DCC through the RTU and actuate substation elements. The RTU also collects measurements, alarms, and other information from secondary equipment and forwards it to the DCC. Because of the large number of devices connected through individual serial connections, the result is a complex pair-wise copper wiring mesh.
Legacy substation schematic
Source: Alcatel-Lucent Bell Labs
Recently, utilities have begun modernizing their SCADA systems with substation automation by deploying microprocessor-based intelligent electronic devices (IED) to replace conventional equipment such as CTs, VTs, RTUs, bay controllers, and relays. A single IED may support functions formerly supported by multiple conventional devices in the substation, thus reducing the required number of devices and interconnections. These modern devices introduce new substation functions, simplify operations, improve performance, support newer communication protocols, and reduce costs. Utilities are also implementing new transmission management system (TMS) and distribution management system (DMS) applications that take advantage of the improved monitoring and control functions of the IEDs.
IEC 61850 is a comprehensive set of standards for utility substation systems specifying object models that characterize substation equipment and communication. As a basis for multivendor interoperability, a configuration language is defined to allow standards-based tools for SCADA operations and maintenance functions. With IEC 61850, all substation devices are based on IEDs that support one or more functions including switchgear, CT, VT, bay controller, and relay. According to the number and location of switchgear and transformers in the substation, one or more Ethernet LAN-based process busses are deployed to support local interconnections of IEDs. The station bus provides connections between IEDs and other systems in the substation (e.g. HMI)—process busses connect into the station bus, which is also an Ethernet LAN. The station bus connects to the SCADA master controller in the DCC through the IP router connected to the station LAN. An important aspect of IEC 61850 is the definition of the generic substation events (GSE), which provide a fast (within 4 ms) and reliable mechanism for generating event notifications within a substation.
Modern substation schematic
Source: Alcatel-Lucent Bell Labs
Distributed Network Protocol 3 (DNP3) is becoming the most prevalent SCADA communication protocol and replacing many traditional protocols. DNP3, which is still evolving, was developed by the DNP3 user group, an organization with members representing utilities and SCADA product vendors. Despite the word “distributed”, DNP3 is used by utility SCADA systems for both transmission and distribution, as well as in other industries such as water and gas supply. With the evolution of IEC 61850, rather than going through the RTU, each IED can communicate directly with the SCADA master over DNP3 through the IP network. During the transition to IEDs, serial connections are tunneled using IP. Once IEDs are fully deployed and all devices are communicating directly with the DCC over DNP3, the RTUs can be removed. Connectivity to conventional equipment may be maintained for a period of time by connecting it to IEDs that support those devices. DNP3 provides for periodic polling of substation devices (typically every 2-5 seconds) by the SCADA master control for measurement data (e.g. voltage, frequency, accumulated kWh), status information (e.g. switch on/off), and special-purpose data (e.g. temperature, wind speed). In addition to responding to data requests, IEDs also asynchronously send information on substation events (e.g. circuit failure) as they occur. Based on data received from the substation and other available information, the SCADA master sends control signals to IEDs/RTUs (e.g. disconnect a switch, change taps on a voltage regulator). To ensure that time stamps are accurate, DNP3 supports synchronization of IED/RTU clocks.
The DNP3 protocol contains its own application, transport, and data link layers. From the perspective of the network, these layers together form an application layer riding over TCP or UDP. DNP3’s application layer breaks messages into fragments, the transport layer breaks fragments into packets, and the link layer adds its header on each packet to form a frame. To be compatible with L1 layers such as RS-232-C, DNP3 is defined over serial physical layer connections that can be emulated (e.g. with MPLS) and has an end-to-end data link layer. DNP3 uses the substation’s existing IP network (defined over the station bus’ Ethernet LAN) and features a L4 data connection management layer that allows it to run over TCP/IP or UDP/IP. The substation’s IP router communicates with the DCC through the IP network.
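The fragment, packet, and frame nesting described above can be sketched as follows. This is a loose model and not the real DNP3 wire format: the 249-byte packet size, the one-byte transport header with first/final flags and a 6-bit sequence number, and the four-byte link header holding only destination and source addresses are all assumptions made for the example. Control bytes and CRCs are left out entirely.

```python
PACKET_DATA = 249        # assumed bytes of fragment data carried per packet
FIN, FIR = 0x80, 0x40    # assumed flags: final packet / first packet


def to_packets(fragment: bytes) -> list[bytes]:
    """Transport layer: split one application fragment into packets."""
    chunks = [fragment[i:i + PACKET_DATA]
              for i in range(0, len(fragment), PACKET_DATA)] or [b""]
    packets = []
    for index, chunk in enumerate(chunks):
        header = index & 0x3F                 # 6-bit sequence number
        if index == 0:
            header |= FIR
        if index == len(chunks) - 1:
            header |= FIN
        packets.append(bytes([header]) + chunk)
    return packets


def to_frame(packet: bytes, destination: int, source: int) -> bytes:
    """Link layer: prepend a (hypothetical) address header to form a frame."""
    link_header = destination.to_bytes(2, "little") + source.to_bytes(2, "little")
    return link_header + packet


def from_frames(frames: list[bytes]) -> bytes:
    """Receiver: strip link and transport headers and rebuild the fragment."""
    packets = [frame[4:] for frame in frames]
    if not (packets[0][0] & FIR and packets[-1][0] & FIN):
        raise ValueError("fragment is incomplete")
    return b"".join(packet[1:] for packet in packets)


fragment = bytes(range(256)) * 2 + bytes(88)      # a 600-byte fragment
packets = to_packets(fragment)
frames = [to_frame(p, destination=1, source=1024) for p in packets]
rebuilt = from_frames(frames)
print(len(packets), [hex(p[0]) for p in packets])
```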
Teleprotection occurs when protection relays at different substations need to communicate with each other to locate faults in the circuit and disconnect faulty transmission lines by tripping a circuit breaker in either or both substations. For example, the distance relay at substation A may detect that there is a fault based on CT/VT instrumentation and send a permissive signal through the communication line to the relay in substation B. If the relay at substation A also receives a reciprocal signal from substation B, the transmission line is tripped at substation A. Within each substation, the relay that reports the fault may be different from the relay that sends the trip signal to the circuit breaker. Because faults in high-voltage lines can lead to severe power outages and danger, they must be cleared quickly and reliably. Teleprotection has very stringent delay requirements of less than 8 ms for relay communication between substations and less than 4 ms within each substation. The required bit error rate of less than 10^-6 (one in one million) often necessitates multiple communication paths between substations. Given the stringent delay requirements, protection equipment for two adjoining substations is usually directly connected over SONET/SDH, Ethernet, wireless microwave, or PLC.
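The permissive scheme in the example boils down to a two-condition check, and the delay requirement to a simple budget. The sketch below is a minimal illustration under stated assumptions: the function names and the per-hop delay figures are invented, and real relays apply many additional checks.

```python
INTER_SUBSTATION_BUDGET_MS = 8.0   # delay requirement quoted in the text


def trip_decision(local_fault_detected: bool, remote_permissive_received: bool) -> bool:
    """Trip the local breaker only when both ends agree there is a fault."""
    return local_fault_detected and remote_permissive_received


def meets_delay_budget(hop_delays_ms, budget_ms=INTER_SUBSTATION_BUDGET_MS) -> bool:
    """Check that the summed delays along a communication path fit the budget."""
    return sum(hop_delays_ms) <= budget_ms


# Substation A sees a fault and has received the permissive signal from B.
trip_a = trip_decision(local_fault_detected=True, remote_permissive_received=True)

# Substation A sees a fault but B has not confirmed it: no trip.
no_trip = trip_decision(local_fault_detected=True, remote_permissive_received=False)

# Hypothetical per-hop delays (ms) for a direct path and a longer backup path.
direct_ok = meets_delay_budget([1.5, 3.0, 2.0])
backup_ok = meets_delay_budget([4.0, 5.0])
print(trip_a, no_trip, direct_ok, backup_ok)
```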
Other substation applications include closed-circuit television (CCTV) and mobile workforce communication (MWC).
Transmission: WASA&C, FACTS, and DLR
Wide area situational awareness and control (WASA&C) refers to the near real-time monitoring and control of transmission system operations across interconnected utility grids and over large geographical areas. Until recently, utilities have mostly relied on their respective transmission management systems (TMS) and SCADA to monitor and control their respective power grids. In comparison with TMS and SCADA, WASA&C is dramatically more granular (60x per second versus once every 2-5 seconds), gathers more measurements (phase angle rather than just voltage or current), monitors across many utilities, and has synchronized time stamps. In order to measure phasor values (amplitude and phase) for voltage and current across the network at extremely high frequencies (up to 120x per second for 60Hz lines), WASA&C employs sophisticated IEDs called phasor measurement units (PMU). All PMU measurements are time-stamped using a clock that is synchronized to the global positioning system (GPS). With synchronized time-stamping, PMUs are called synchrophasors.
Following the blackout of 2003, the Department of Energy (DOE) came together with North American utilities and regulators to define a specification for the North American Synchrophasor Initiative network (NASPInet). Still today, instead of using PMU data for wide area grid control, utilities are simply forwarding it to their respective independent system operators (ISO) and regional transmission organizations (RTO). NASPInet aims to create a network infrastructure for secure, reliable, and high-performance communication between synchrophasors and WASA&C applications deployed at utility DCCs, ISOs, and RTOs. The WASA&C applications at a utility DCC will eventually need to process data received from many thousands of PMUs, including its own and those deployed by other utilities. Phasor data concentrators (PDC) are deployed at each substation containing PMUs and consolidate PMU data as needed; historical PMU data is stored in archives. At its core, NASPInet is a data bus that provides the secure, reliable, high-performance communication network and centralized services necessary for PMU traffic across utilities and monitoring centers. Integral to NASPInet are phasor gateways, which connect utility and monitoring center networks to the data bus. For secure data communication, a decentralized information-sharing network architecture called SeDAX is being considered for the data bus implementation. Performance requirements for NASPInet are categorized from class A through E, in descending order of stringency. Class A, which is used for wide area voltage and reactive power control, requires very high data rates (up to 120x per second for 60Hz lines), very high reliability (99.9999%, or unavailable less than 32 seconds per year), and very low latencies (less than 50 ms delay). Class C, in contrast, can tolerate fewer than 30 measurements per second with 99.99% (53 minutes per year) reliability and network delay of 1 second.
Source: Alcatel-Lucent Bell Labs
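The availability percentages quoted for the NASPInet classes translate directly into a yearly downtime allowance. The short calculation below reproduces the two figures given in the text, assuming a 365-day year; the function name is chosen for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year


def max_downtime_seconds(availability_percent: float) -> float:
    """Yearly downtime allowed by a given availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


class_a = max_downtime_seconds(99.9999)    # about 31.5 seconds per year
class_c = max_downtime_seconds(99.99) / 60  # about 52.6 minutes per year
```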
Flexible AC transmission systems (FACTS) are used to regulate reactive power (VAR) by controlling the reactance of capacitors along transmission lines with thyristors, which are semiconductor devices that act as “valves” for capacitors. FACTS allows capacitors to dampen voltage and power transients by dynamically changing their reactance. IEDs deployed in transmission substations to support the FACTS also provide monitoring and control functions that improve transmission power flow.
Dynamic line rating (DLR) allows transmission lines to increase their capacity by monitoring environmental conditions using IEDs deployed at transmission towers. Because heat directly affects line resistance and transmission losses, the actual capacity of transmission lines at any given moment is sensitive to environmental conditions. Generally, transmission line capacity ratings (i.e. maximum current) are based on the worst possible conditions. By using IEDs to measure environmental factors in real-time (e.g. ambient temperature, wind, solar radiation, ice accumulation, etc.), DLR has the potential to provide an additional 10–15 % transmission capacity 95 % of the time and 20–25 % more transmission capacity 85 % of the time—without costly investments in transmission upgrades. The parameters monitored by DLR IEDs also help compute power line sag to improve reliability and safety.
Given the significant number of IEDs that need to be monitored over large geographies, wireless networks with wide area coverage are the most appropriate communication medium for IEDs deployed at transmission towers. Intermediate data concentrators may be used at substations to connect with the centralized network.
Distribution: DMS, DA, VVWC, and AMI
Distribution management systems (DMS) are collections of applications that act as a decision support system to assist the DCC and field personnel with the monitoring and control of the distribution system. The main purpose of a DMS is to improve reliability and quality of service by regulating voltage frequencies and magnitudes, reducing outages, etc. By accessing various data sources such as SCADA and OMS, DMSs integrate real-time information on a single console at the DCC. Outage management systems (OMS) complement manual customer reports of power outages with automated applications such as customer information systems (CIS) and geographical information systems (GIS) to rapidly and accurately detect outages due to extreme weather, technical failures, human error, or intrusion. Distribution operation modeling and analysis (DOMA) monitors real-time distribution system flows and simulates future scenarios (i.e. look-ahead and “what if”) to support grid operators with analysis for decision-making.
Distribution automation (DA) refers to the acquisition of measurement data and control of feeder devices through IEDs connected to those devices. DA extends IEC 61850 substation automation to the automation of feeder devices. For DA, each feeder device must be connected to an IED to support measurement, monitoring, and control functions. Depending on the available communication technology, a DA data concentrator can be deployed at the substation to collect IED data for the DA master and relay commands and polls back to feeder device IEDs. Reclosers are circuit breakers that monitor the feeder and are triggered when the current exceeds a certain threshold. Shortly after being triggered, reclosers automatically attempt to reconnect the circuit several times before concluding that the fault is permanent, after which they have to be operated manually. Switches are used to manually sectionalize (on-site or via executed commands) faulty feeder sections to divert power until repairs are completed. Capacitor banks are used to dynamically control reactive power (VAR) and maintain the power factor (PF) as close to 1 as possible. For dynamic VAR control, real-time electric measurements are used to rapidly connect or disconnect portions of capacitor banks on the feeder. Deploying capacitors close to inductive loads is more effective for reducing VAR than centralizing them in substations. Distribution transformer loads and other measurements (e.g. internal temperature), although typically not monitored but rather estimated from customer meter data, are becoming increasingly important to the efficiency (e.g. proper asset sizing) and reliability (e.g. failure predictions) of the distribution system.
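The recloser behavior described above (trip on overcurrent, retry a few times, then lock out until operated manually) can be modeled as a small state machine. The sketch below is a toy model: the class name, the state labels, and the default of three attempts are assumptions for illustration, not values from any particular device.

```python
class Recloser:
    """Toy recloser: trips on overcurrent, retries, then locks out."""

    def __init__(self, trip_current_a: float, max_attempts: int = 3):
        self.trip_current_a = trip_current_a
        self.max_attempts = max_attempts  # assumed number of reclose attempts
        self.attempts = 0
        self.state = "closed"  # one of "closed", "open", "locked_out"

    def observe(self, current_a: float) -> str:
        """Feed one feeder current measurement; trip if it exceeds the threshold."""
        if self.state == "closed" and current_a > self.trip_current_a:
            self.state = "open"
        return self.state

    def reclose(self, fault_still_present: bool) -> str:
        """Attempt to reconnect the circuit after a trip."""
        if self.state != "open":
            return self.state
        self.attempts += 1
        if not fault_still_present:
            # Temporary fault has cleared: restore service and reset the counter.
            self.state = "closed"
            self.attempts = 0
        elif self.attempts >= self.max_attempts:
            # Fault is treated as permanent: manual operation is now required.
            self.state = "locked_out"
        return self.state
```

A temporary fault (for example, a branch brushing the line) clears on an early attempt and the recloser returns to `"closed"`; a persistent fault exhausts the attempts and leaves it in `"locked_out"`.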
Synchrophasors deployed along distribution feeders are becoming increasingly valuable for better state estimation and accurate monitoring of power quality (consistent sinusoid of voltage frequency and elimination of harmonics) in distribution systems, particularly where there are large-scale installations of distributed generation (DG) such as solar photovoltaic.
Distribution automation schematic
Source: Alcatel-Lucent Bell Labs
The Volt, VAR, Watt Control (VVWC) function ensures that various electric quantities remain within acceptable operational ranges by regulating voltage (V), adjusting reactive power (VAR), and controlling the power (W) delivered through the grid. VVWC functions are required for both transmission and distribution systems and may be integrated between the two. At the distribution level, VVWC functions coordinate with demand response (DR) to control power (e.g. reducing demand through voltage control). The function also controls capacitor banks in substations and distributed energy resources (DER) for generation and storage. VVWC may collect data from various sources including SCADA, DA, DER, and AMI. Upon processing, the VVWC function sends control messages to IEDs at distribution substations, feeder systems, and DERs.
Advanced metering infrastructure (AMI) refers to the network infrastructure that connects smart meters deployed on customer property, including intermediate network elements supporting connection with the meter data management system (MDMS) located at the DCC. According to July 2014 estimates from EPRI, there were 50 million smart meters deployed across the United States, covering 46% of U.S. households. A large number of utility functions are supported by AMI measurements, including operational functions such as automated meter reading (AMR), demand forecasting, DA, VVWC, and outage management, as well as business functions such as customer billing and revenue protection.
Smart meters provide periodic interval measurements, typically once every 15 or 60 minutes, as well as threshold alarms when a measurement (e.g. voltage) exceeds or falls below a pre-set value. Measurements for 3-phase meters are provided per phase, per line, and for the entire 3-phase connection. For high-voltage lines, meter connections may require CTs and VTs. Customers with on-site DG can either report the net energy flow with a single meter or use two separate meters for power consumption and production. In addition to measurements, AMI supports remote meter maintenance functions such as disconnection and reconnection, registration with the MDMS (e.g. after an outage), and firmware updates.
Smart meter measurements
|Average reactive power||VAR||Average|
|Reactive power consumption||VARh||Cumulative|
|Power factor (VAR)||cos φ||Instantaneous|
|Phase angle (φ degrees)||sin(φ)||Instantaneous|
|Pricing mode||TOU/CPP/RTP||Selection set|
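The power factor and phase angle entries in the table are tied to real and reactive power through the standard power triangle: apparent power is S = √(P² + Q²), the phase angle is φ = atan(Q/P), and the power factor is cos φ. The sketch below computes these quantities from a pair of readings; the function name and dictionary keys are chosen for illustration.

```python
import math


def power_triangle(real_power_w: float, reactive_power_var: float) -> dict:
    """Derive apparent power, phase angle, and power factor from P and Q."""
    apparent_va = math.hypot(real_power_w, reactive_power_var)
    phi = math.atan2(reactive_power_var, real_power_w)
    return {
        "apparent_va": apparent_va,
        "phase_angle_deg": math.degrees(phi),
        "power_factor": math.cos(phi),
    }


# Example: 800 W of real power and 600 VAR of reactive power.
reading = power_triangle(800.0, 600.0)
```

For this example the apparent power is 1,000 VA and the power factor is 0.8, which is why a utility would want capacitor banks to bring the reactive component down.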
Meter data management systems (MDMS) import, validate, cleanse, and process the large quantities of data delivered by smart meters, provide long-term data storage, and support customer billing and analysis. AMI data is transmitted from smart meters through their communication interfaces (either integrated or attached to meters) and carried through the neighborhood area network (NAN), which is most commonly based on RF-mesh over the unlicensed wireless spectrum or PLC through the distribution transformer (see above sections on layers and standards). Meter data concentrators are used to support communication between clusters of meters and the MDMS in the DCC, either over RF-mesh (up to several thousand meters) or PLC (limited within the secondary of the distribution transformer). Meters and the MDMS communicate over an end-to-end IP connection and each meter is IP-addressable by the MDMS.
AMI network schematic
Source: Alcatel-Lucent Bell Labs (modified)
When AMI solutions are vendor-proprietary, a product-specific meter management system called the head end communicates with meter data concentrators over the IP network using a standard protocol such as extensible markup language (XML). In such cases, the MDMS cannot communicate directly with meters and must send AMI commands through the head end. Although most AMI deployments to date have been based on proprietary solutions, interoperability standards are being developed that support direct meter-MDMS connections and are expected to make such connections more prevalent. For example, AMI solution vendors are already adopting the ANSI C12.22 standard, which defines the L7 application layer for meter-MDMS communication and supports different network configurations for L1 through L4. In addition, the ANSI C12.19 standard provides common tables (data structures) for transferring data between meters and the MDMS.
Behind-the-meter: HAN, HEMS, and DER
Home area networks (HAN) are LANs that facilitate communication among digital devices present inside or within the close vicinity of a home. The ability for smart devices to interact enables functions for home automation that improve quality of life, enhance home security, and increase energy efficiency. Within the home, HANs use a combination of twisted wires (e.g. phone line, CAT 5), coaxial cables, Wi-Fi LAN, fiber optics, RF-mesh (e.g. Zigbee), and PLC through home electrical wiring (e.g. HomePlug). Outside the home, HANs connect to the utility communication network through the Internet via the home Wi-Fi network and PLC or RF-mesh via the smart meter.
Home area network schematic
Source: Alcatel-Lucent Bell Labs (modified)
Home energy management systems (HEMS) connect to the HAN and manage the home’s energy consumption (e.g. lights, appliances, HVAC), storage (e.g. EVs, storage batteries), and generation (e.g. solar, fossil-powered backup). Although newer appliances may already be equipped with integrated control and communication functions, existing appliances within a home (e.g. thermostats, lights, washer/dryer) may need external monitoring and control devices for measurements, on/off control, and other functions such as price-responsiveness. The HEMS monitors energy consumption/supply and may control the operation of appliances based on user settings or energy management services from the utility company or other third parties via the utility EMS (UEMS), either through the smart meter or via the Internet. The HEMS may also provide smartphone access to devices connected to the HAN for user monitoring and control from outside the home.
Distributed energy resources (DER) are sources of electricity that are connected to distribution feeders and located close to consumption loads. DERs include demand-side management (DSM) as well as electricity generation and storage. Although individually small, DERs can be aggregated into virtual power plants (VPP) to provide meaningful amounts of power necessary to meet regular demand. Distributed energy resource management systems (DERMS) are software application platforms that can be used to manage and coordinate DERs.
Automated demand response (ADR) uses the HEMS to automate demand response (DR) via real-time communication with the UEMS. When the system operator needs to dispatch DR in response to a scarcity condition, the UEMS sends a control signal to the HEMS—either in the form of a higher real-time price or a quantity of curtailment. The HEMS then controls net consumption by shutting down appliances (or reducing their consumption) and using available electricity generation/storage resources. If possible, the HEMS may also reschedule consumption by some appliances to future periods (i.e. off-peak). OpenADR standards have been developed for communication based on IP connectivity (i.e. via the Internet) between customers’ HEMS and the UEMS of utilities, ISOs, or third party energy service providers. Alternatively, ADR signals may be sent through the smart meter via the NAN (i.e. RF-mesh or PLC). Due to security concerns, utilities tend to favor IP connections augmented with a security apparatus that limits HEMS access to only the required UEMS systems.
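To make the curtailment case concrete, the sketch below shows one way a HEMS might respond to a quantity-based ADR signal: shed the least important appliances first until the requested reduction is met. The `Appliance` type, the priority convention, and the selection rule are all hypothetical; they are not part of OpenADR or of any specific HEMS product.

```python
from dataclasses import dataclass


@dataclass
class Appliance:
    name: str
    load_kw: float
    shed_priority: int  # hypothetical convention: higher value = shed first


def plan_curtailment(appliances: list[Appliance], target_kw: float):
    """Pick appliances to shed, most sheddable first, until the target is met."""
    shed, total_kw = [], 0.0
    ordered = sorted(appliances, key=lambda a: a.shed_priority, reverse=True)
    for appliance in ordered:
        if total_kw >= target_kw:
            break
        shed.append(appliance.name)
        total_kw += appliance.load_kw
    return shed, total_kw


home = [
    Appliance("hvac", 3.0, 2),
    Appliance("dryer", 2.5, 3),
    Appliance("fridge", 0.2, 1),
]
plan = plan_curtailment(home, target_kw=4.0)
```

Here the plan sheds the dryer and then the HVAC, for 5.5 kW in total, and leaves the refrigerator running. A price-based signal could be handled the same way by first converting the price into a target reduction according to user settings.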
Distributed generation (DG) is customer-sited electricity generation equipment (e.g. solar PV, fuel cells) that injects its excess power into the distribution grid. Although some DG may directly produce AC power (e.g. wind, fossil-based), sources that produce DC output (e.g. solar PV, batteries) need DC-to-AC inverters to convert their production into an AC sinusoid that is matched and synchronized (amplitude, frequency, and phase) with the distribution system. For safety and grid stability, DG sources that are more than a few hundred kW need remote monitoring and control to ensure that they stay within their operational limits for voltage (e.g. within 15 %), frequency (e.g. within 0.5 Hz for a 60 Hz system), voltage flicker, power factor (e.g. within 0.85 lagging and 0.85 leading), and harmonics. In the event of a short circuit or ground fault, circuit breakers are needed to trip the connection between the DG and the grid. Due to their intermittent nature, DG sources impose system balancing constraints on the grid that require ancillary services such as reactive power supply, voltage regulation, and frequency regulation. Although DG sources can provide such services (i.e. through smart inverters), their intermittency makes them unreliable. When DG is unintentionally disconnected from the grid, safety and synchronization concerns require that unintentional islanding be prevented by discontinuing power production, even locally. In order to maintain the functions required for the safety and reliability of DG, feeder devices deployed at DG connection points (i.e. relays, circuit breakers, reclosers, CTs, VTs) need to be connected to IEDs that can communicate with the DMS.
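The operational limits listed above lend themselves to a simple range check of the kind a monitoring IED might run. The sketch below uses the example tolerances from the text as defaults and assumes that the voltage limit means a deviation from nominal expressed in per-unit terms; the function name and return format are illustrative only.

```python
def dg_violations(
    voltage_pu: float,
    frequency_hz: float,
    power_factor: float,
    nominal_hz: float = 60.0,
    voltage_tol: float = 0.15,       # example limit from the text, assumed +/- of nominal
    frequency_tol_hz: float = 0.5,   # example limit from the text
    min_power_factor: float = 0.85,  # applies to both lagging and leading
) -> list[str]:
    """Return the names of the operating limits that a DG source is violating."""
    violations = []
    if abs(voltage_pu - 1.0) > voltage_tol:
        violations.append("voltage")
    if abs(frequency_hz - nominal_hz) > frequency_tol_hz:
        violations.append("frequency")
    if abs(power_factor) < min_power_factor:
        violations.append("power_factor")
    return violations
```

A source operating at nominal voltage and frequency with a power factor of 0.95 returns an empty list; any non-empty result would be reported to the DMS and could trigger disconnection.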
Distributed storage (DS) refers to devices such as chemical batteries, flywheels, supercapacitors, and pumped hydro that can store electricity received from the grid (charge) and deliver stored electricity to the grid (discharge) when called upon. The performance of DS devices is based on AC-to-AC efficiency (ratio of energy discharged to energy received from the grid), real-time response (speed of adjustment to changes in load), power rating (power discharge capacity in Watts), and discharge time (maximum length of time the DS can discharge at its power rating until empty). Safety, performance, and reliability concerns for DS are similar to those of DG. Electric vehicles (EV) that plug into the grid are a form of DS that can use excess charge to manage peak demand through an electric vehicle service element (EVSE) that allows energy service providers to track EV battery charge/discharge from any power outlet and manage financial transactions accordingly.
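Two of the performance measures defined above are simple ratios, shown here as a short calculation. The function names and the sample figures are hypothetical and only illustrate the definitions.

```python
def ac_to_ac_efficiency(energy_discharged_kwh: float, energy_received_kwh: float) -> float:
    """Ratio of energy delivered back to the grid to energy drawn from it."""
    return energy_discharged_kwh / energy_received_kwh


def discharge_time_hours(usable_energy_kwh: float, power_rating_kw: float) -> float:
    """Time the device can discharge at its full power rating until empty."""
    return usable_energy_kwh / power_rating_kw


# Hypothetical battery: draws 100 kWh to return 85 kWh; 13.5 kWh usable at 5 kW.
efficiency = ac_to_ac_efficiency(85.0, 100.0)
duration_h = discharge_time_hours(13.5, 5.0)
```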
Microgrids refer to collections of individual consumers within a building, campus, or community that are interconnected with at least one shared DG source. A microgrid forms an autonomous power system that is capable of voluntary islanding to independently provide a minimum level of service (i.e. lighting, security, elevators) during a utility grid power outage. Microgrid energy management systems (MEMS) are used to manage electricity operations within the microgrid as well as energy transactions and interconnection with the utility grid. All microgrid devices need to be able to communicate with the MEMS, generally through PLC or RF-mesh. The utility may want to deploy IEDs at the microgrid feeder interconnection for monitoring and control through the DMS.
Integration: core-edge architecture
The core-edge architecture is a smart grid communication network design developed by Alcatel-Lucent Bell Labs. The network design is application-centric (i.e. begins with the end in mind) and integrated end-to-end, which contrasts with the traditional utility practice of deploying purpose-built disparate networks. The architecture’s core network, which may interconnect hundreds of routers, is generally deployed in parts of the utility service area where DCCs, company headquarters, and business offices are located. Remote endpoints at the edge of the grid connect to the core network over separate connections. The core-edge network handles all communication between application endpoints and connects with entities from external networks such as wholesale energy markets, bulk generation, and third-party service providers.
Source: Alcatel-Lucent Bell Labs (modified)
Wide area network (WAN) refers to the core network that is the backbone for communication across the utility service area. The WAN is composed of interconnected WAN routers (WR), and all network endpoints connect to WRs (whether directly or through intermediate nodes). Traffic between pairs of endpoints is routed through the end-to-end IP connection by the respective WRs in the WAN. In cases where the network implementation cannot support application requirements (e.g. teleprotection cannot tolerate delays), direct connections between application endpoints may be allowed. In addition to providing connectivity with application endpoints, WRs provide traffic aggregation and route data over the WAN towards the destination endpoint. To ensure network reliability, there must be at least two separate physical paths between every pair of WRs. Interior routers (IR) help the implementation of path redundancy and shorter paths between pairs of WRs. While WRs and IRs are often conveniently located at existing facilities such as DCCs and substations, additional locations may be necessary depending on the overall network design. WANs can be deployed over fiber infrastructure (point-to-point Ethernet over SONET/SDH) and microwave infrastructure (usually in combination with fiber for WAN extension). Utilities can use existing fiber and microwave asset deployments for their WAN and, if necessary, either build additional assets or contract with NSPs for leased TDM lines or shared services such as Frame Relay, Metro Ethernet, Virtual Private LAN Service (VPLS), and Virtual Private Routed Network (VPRN).
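The two-path requirement can be checked on a proposed topology before deployment. The sketch below tests a slightly weaker, link-level version of the rule: it confirms that the WAN stays connected after the loss of any single link, which is equivalent to every pair of routers having two paths that share no link. It does not check that paths avoid shared intermediate routers or shared conduits. The router names and helper functions are hypothetical.

```python
from collections import deque


def is_connected(nodes, links) -> bool:
    """Breadth-first search: can every node be reached from an arbitrary start?"""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {node: set() for node in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == nodes


def survives_any_single_link_failure(nodes, links) -> bool:
    """True if the network stays connected when any one link is removed."""
    links = list(links)
    if not is_connected(nodes, links):
        return False
    return all(
        is_connected(nodes, links[:i] + links[i + 1:])
        for i in range(len(links))
    )


routers = ["WR1", "WR2", "WR3", "WR4"]
ring = [("WR1", "WR2"), ("WR2", "WR3"), ("WR3", "WR4"), ("WR4", "WR1")]
chain = [("WR1", "WR2"), ("WR2", "WR3"), ("WR3", "WR4")]
```

The ring passes the check, while the chain fails because losing any one link isolates part of the network; adding an interior router or a closing link is the kind of fix the text describes.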
Field area network (FAN) refers to the wireless and wireline connections that support communication between the WAN and remote endpoints or CRs. Cluster routers (CR) aggregate data locally for collocated endpoints, thus enabling one single FAN connection to connect multiple endpoints with the WAN. Endpoints in a FAN are intelligent devices that can be located at substations (e.g. CR, SCADA IEDs, phasor data concentrators, meter data concentrators, DA concentrators, CCTV), on distribution feeders (e.g. DA IEDs, DER IEDs, meter data concentrators), and at customer locations (e.g. AMI, HEMS). There are numerous FAN networking technologies available, including optical fiber (WDM), leased TDM lines, wireless LTE, PLC, Frame Relay, Metro Ethernet, and MPLS services, among others. As with WANs, FAN connections can either be utility-owned or NSP-provided.
Performance: latency and QoS
Because grid operations need control actions to be taken in a timely manner, each utility application has absolute requirements for the overall latency (delay) it can tolerate from the communication network. In teleprotection, for example, any delay longer than a few milliseconds between the moment when a fault is detected and the tripping of the circuit breaker is unacceptable. Other applications (e.g. AMI) have much more tolerance for higher latencies.
Delay and priority requirements for smart grid applications
|Application function||Delay allowance (ms, one-way)||QoS priority (max=0)|
|PMU measurements (class A)||20||12|
|SCADA and DA measurements||100||25|
|Critical AMI (e.g. VVWC)||250||40|
|DMS and TMS applications||250||65|
|Priority AMI (e.g. ADR, black start)||300||70|
|PMU (other than class A)||500||80|
|Normal AMI (meter readings)||1,000||85|
|Outage management system||1,000||90|
|Best effort (default)||2,000||100|
Quality of service (QoS) refers to the level of preferential treatment that the network gives to packets of certain priority applications over those of others. Smart grid applications that have stringent delay requirements (e.g. teleprotection) need to share network resources that were sized to support average data rates with many other applications. As a result, QoS is especially important during times when network links and routers face congestion due to heavy data traffic. When the network exceeds its capacity to transfer, store, and buffer data, low-QoS applications experience delivery delays and packet losses. QoS information is stored within the packet header, usually on the L3 network layer (e.g. IP, MPLS) in the form of a type of service (TOS) byte. According to their pre-configured per-hop behavior (PHB), network routers may provide preferential treatment for higher-QoS packets (based on the TOS byte) by forwarding them before other packets. Packets without a TOS byte are assigned to the best effort (default) class of service, which is lowest-priority and has no guarantee of network resources. MPLS provides extended QoS functionality and supports guaranteed minimum data rates for given LSPs through the resource reservation protocol (RSVP).
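One simple per-hop behavior is strict priority forwarding, in which a router always sends the highest-priority waiting packet first. The sketch below models this with a heap, using the convention from the table above that a lower number means higher priority. It is a toy model of a single output queue, not a description of how any particular router or MPLS implementation schedules traffic; the class and method names are hypothetical.

```python
import heapq
import itertools


class PriorityScheduler:
    """Toy strict-priority output queue (lower number = higher priority)."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: keep arrival order

    def enqueue(self, priority: int, packet: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._arrival), packet))

    def dequeue(self) -> str:
        """Forward the highest-priority packet currently waiting."""
        _, _, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self) -> int:
        return len(self._heap)


queue = PriorityScheduler()
queue.enqueue(85, "normal AMI meter reading")
queue.enqueue(12, "PMU class A measurement")
queue.enqueue(100, "best effort")
queue.enqueue(25, "SCADA measurement")
forwarding_order = [queue.dequeue() for _ in range(len(queue))]
```

Under congestion the PMU and SCADA packets leave first, while the best effort packet waits the longest (and in a real router with a bounded buffer would be the first candidate to be dropped).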
Reliability: network availability and ownership
Network availability (reliability) is dependent on the frequency of failures of the nodes and the links in the network, the redundancy of paths between endpoints on the network, and the self-healing capabilities of routing protocols. High end-to-end network reliability requires utility-grade equipment that is hardened to withstand extreme conditions. For wireless networks, communication over licensed spectrum is more reliable than over unlicensed spectrum, which is subject to high interference from other users. Underground cabling is often more reliable than above-ground cabling, although underground faults take longer to repair. Ensuring that there are multiple physical paths (either parallel or separate) between endpoints improves reliability. For example, each endpoint or CR should have multiple FAN connections with the WAN. Routing protocols such as MPLS fast reroute (FRR) provide for fast recovery after failures by configuring and maintaining physically separate backup paths in routing tables across the network.
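The benefit of path redundancy can be quantified with the usual series and parallel availability formulas, under the simplifying assumption that elements fail independently. Elements in series must all be working, so their availabilities multiply; redundant paths fail only if all of them fail at once. The function names and the 99.9% sample figure below are illustrative.

```python
from math import prod


def series_availability(availabilities) -> float:
    """All elements must work: availabilities multiply."""
    return prod(availabilities)


def parallel_availability(availabilities) -> float:
    """Redundant paths: unavailable only if every path is down at once."""
    return 1 - prod(1 - a for a in availabilities)


single_path = series_availability([0.999, 0.999])        # two links in a row
redundant_pair = parallel_availability([0.999, 0.999])   # two independent paths
```

Two independent 99.9% paths in parallel reach roughly 99.9999%, whereas chaining two 99.9% links in series drops to about 99.8%. Real paths often share conduits, power, or equipment, so the independence assumption is optimistic.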
Ownership of the communication network infrastructure is an important consideration for reliability. Utility-owned networks benefit from guaranteed performance (e.g. latency requirements for teleprotection), more stringent availability requirements (most NSP applications require less than 99.99% uptime, compared with 99.9999% for smart grid), and regulatory compliance with standards such as the North American Electric Reliability Corporation’s (NERC) critical infrastructure protection (CIP) requirements. NSP networks, on the other hand, benefit from frequent technology upgrades (new capabilities), state-of-the-art expertise (best-practice personnel), and lower cost. The final solution is often a mix of utility-owned and NSP networks (e.g. for new applications), depending upon the willingness of NSPs to provide service-level agreements with penalties to ensure the performance, reliability, and security of critical applications.
Security: zones and network elements
In order to keep the smart grid safe from security threats, it is necessary to minimize the attack surface, increase the amount of effort required to compromise the network, and decrease the detection and response time. Safety protections are needed at the device level (e.g. partitioning of systems, recovery from attacks), system level (e.g. physical security, access control), and organizational level (e.g. policies, mechanisms, procedures). In addition to separating the operational grid network from the utility business network, the security architecture for smart grid communication networks should be divided into multiple security zones (e.g. transmission, distribution SCADA, distribution non-SCADA, enterprise, external networks). Depending upon the criticality of applications within each of these zones, they are accordingly subjected to different levels of security requirement.
There are several types of network security elements in place to protect smart grid communication networks. Access control lists (ACL) in routers monitor IP headers in every packet to filter unwanted data traffic based on source endpoints, destination endpoints, and other criteria. Unified threat management (UTM) is a collection of network security products that includes functions such as deep packet inspection and behavior-based threat detection algorithms. UTM encompasses a range of stand-alone devices such as firewalls (FW), intrusion prevention systems (IPS), and intrusion detection systems (IDS). UTM devices are deployed at substations, DCCs, and other locations throughout the network. For additional security, data encryption protocols such as IP security (IPsec) and transport layer security (TLS) may be implemented on different layers between application endpoints. MPLS also facilitates network security by providing endpoints within the MPLS service with complete separation from endpoints that are not defined over the MPLS infrastructure. Due to the limitations of existing network security technologies, scalable secure transport protocol (SSTP) was developed as a secure end-to-end protocol for smart grid application security that is lightweight, agnostic to underlying protocols, and scalable.
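The ACL filtering described above amounts to matching each packet's source and destination addresses against an ordered rule list and applying the first rule that matches. The sketch below illustrates this with Python's standard `ipaddress` module. The rule format, the address ranges, and the default-deny policy are assumptions for illustration and do not correspond to any vendor's ACL syntax.

```python
import ipaddress

# Hypothetical rules: (action, source network, destination network).
# Here 10.10.0.0/16 stands in for the DCC and 10.1.0.0/24 for a substation LAN.
RULES = [
    ("permit", "10.10.0.0/16", "10.1.0.0/24"),
    ("deny", "0.0.0.0/0", "10.1.0.0/24"),
]


def acl_action(src: str, dst: str, rules=RULES, default: str = "deny") -> str:
    """Return the action of the first rule matching the packet's IP header."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for action, src_net, dst_net in rules:
        if (src_ip in ipaddress.ip_network(src_net)
                and dst_ip in ipaddress.ip_network(dst_net)):
            return action
    return default  # assumed policy: drop anything not explicitly permitted
```

With these rules, traffic from the DCC range to the substation LAN is permitted, while traffic to the substation from any other source is denied. Real ACLs typically also match on protocol and port numbers.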
Conclusion: looking towards the future
Over the past few years, smart grid deployments have received a boost through incentives and regulation from governments across the world. In the United States, there were 50 million smart meters deployed as of July 2014, covering 46% of U.S. households. In addition to AMI, smart grid programs have also led to the evolution of network technologies such as RF-mesh and PLC. Interoperability standards are increasingly the norm for solution vendors, and wireless broadband NSP services are becoming more widely adopted for FANs.
MPLS networking provides utilities a secure way to seamlessly support legacy protocols over an integrated communication network, thus easing the transition towards modern architectures without discarding existing infrastructure investments. Broader QoS implementation can improve network performance while reducing the cost of network expansions.
After a century of relative stability, the current decade is seeing a dramatic pace of change in the generation, distribution, and consumption of electricity. Driven by technology and ambitious policy objectives, this rapid evolution has begun to stretch the electricity system in fundamental ways. Modernizing the communication network infrastructure is the first of many steps in addressing the challenges of this new age of electricity operations.