Case Study: Smart Grid Vendor Boosts Network Intelligence

IT can begin to do more with less once it gets out in front of problems. Here's how one firm became proactive about its network.

by Joe Zwers

Billions of dollars are being spent on creating smart electrical grids around the world, with the promise of greater efficiency and the opportunity to delay building new power plants. Pacific Gas & Electric, a utility serving northern and central California, is spending $2.2 billion on its own. It has already installed four million smart meters, with plans for another six million over the next three years.

One of the companies that stands to benefit from the transition to smart grids is Itron Inc., a firm headquartered in Liberty Lake, Washington, that manufactures smart meters for electric, gas and water utilities and provides the hardware and software needed to remotely read and manage the client accounts. But while Itron provides utilities with their customers' hourly energy consumption, it couldn't view traffic on its own corporate network, an issue exacerbated by the 2007 acquisition of its largest competitor, Actaris, which tripled Itron's size.

"We had no visibility into anything happening on the network," says Rob Routt, Itron's network engineer. "We were spending a lot of time identifying and troubleshooting problems, but didn't have any proactive way to figure out what was going on."

But, just as utilities use Itron's software to gather usage data from customers' meters, Itron solved its dilemma by gathering NetFlow data from its Cisco routers, thereby gaining the visibility it lacked.

Managing the Merger

Itron, which has been making utility meters for thirty years, is primarily a Microsoft shop. Some of the 400+ servers in its main data center in Liberty Lake run AIX or Solaris, though the UNIX servers have been migrating to Linux. The rest run Windows. This extends to its client software. Itron Enterprise Edition Meter Data Management (IEE MDM), the software that collects data from smart meters, handheld wireless meter readers and other collection systems, and that handles customer billing, analytics and outage management, is built using Microsoft's ASP.NET, Internet Information Services, .NET Framework, Windows Communication Foundation and Windows Presentation Foundation. The IEE MDM can use either Microsoft's SQL Server or an Oracle database, both of which have been tested to scale up to around nine million smart meters with hourly pricing.

Initially, when Itron acquired Actaris, the two companies operated independently, but their activities are gradually being integrated more tightly. In April 2009, on the second anniversary of the acquisition, the company announced that Actaris would start using the Itron name. There is still considerable work to be done, however, to bring about a unified computing structure. While Itron had its own main data center, with a backup at its facility in South Carolina, Actaris had been outsourcing its IT operations. So, while Itron's LANs and WANs were all monitored and managed out of Liberty Lake, the Actaris networks were managed by service providers.

"Management decided not to fold them into our WAN infrastructure," says Routt. "They are still running on a separate WAN, and we just decided to link to them via a gateway."

To obtain greater visibility into both networks, Routt deployed the WebNM enterprise network management suite, part of which he had used at a prior job.

"When you start looking at products like OpenView or Tivoli, you have to have a whole staff of guys to run it, in addition to the hundreds of thousands in software costs," he says. "It made sense to just try this and if it didn't work out move on to something else, but we found that it scales well for our environment."

Part of the WebNM suite by Plixer is a flow analysis tool called Scrutinizer, which captures NetFlow and sFlow data from existing routers (Cisco, Enterasys, Extreme, Foundry, Juniper, Riverstone, Packeteer and dozens of other vendors) to show exactly what traffic is flowing through a network interface, including the source and destination of the packets, the application and the protocol being used. This gives a finer-grained look at network traffic than SNMP data alone, which shows only the aggregate amount of traffic through the port.
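To make the contrast with SNMP concrete, here is a minimal Python sketch. The flow records, field names and the `total_bytes` and `top_talkers` helpers are hypothetical illustrations, not Scrutinizer's data model or API. An SNMP interface counter reports only the total, while flow records let that total be broken down by host and by application.

```python
from collections import Counter

# Hypothetical flow records, shaped like what a flow collector might store.
# Each record: source IP, destination IP, protocol, destination port, bytes.
FLOWS = [
    {"src": "10.1.1.20", "dst": "10.2.0.5", "proto": "tcp", "dport": 443, "bytes": 4_000_000},
    {"src": "10.1.1.31", "dst": "10.2.0.9", "proto": "tcp", "dport": 80, "bytes": 1_500_000},
    {"src": "10.1.1.20", "dst": "10.2.0.7", "proto": "tcp", "dport": 443, "bytes": 2_500_000},
    {"src": "10.1.1.44", "dst": "10.2.0.5", "proto": "udp", "dport": 53, "bytes": 20_000},
]

def total_bytes(flows):
    """What an SNMP interface counter would show: one aggregate number."""
    return sum(f["bytes"] for f in flows)

def top_talkers(flows, key, n=3):
    """Rank traffic by total bytes, grouped by an arbitrary key function."""
    totals = Counter()
    for f in flows:
        totals[key(f)] += f["bytes"]
    return totals.most_common(n)

by_host = top_talkers(FLOWS, key=lambda f: f["src"])
by_app = top_talkers(FLOWS, key=lambda f: (f["proto"], f["dport"]))

print("aggregate:", total_bytes(FLOWS))
print("by host:", by_host)
print("by application:", by_app)
```

Grouping by source address answers "who is doing what," and grouping by protocol and port approximates "what are our top applications."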

"We started using Scrutinizer because we didn't have any way of determining who was doing what or what were our top applications," says Routt. "We had a lot of problems with overrunning bandwidth, and this gave us a lot more visibility and performance information about the network."

The installation and configuration came as part of the suite purchase price. A technician came out for two days to set up the software and train the Itron staff on its use. The software was set up to gather NetFlow data from all of the company's routers, about 65 in all.

"Usually we use NetFlow data to look at the bandwidth on an interface and find out who is doing what or what application is using most of the bandwidth," he says. "If it is valid traffic, okay, but we have used it in the past to identify streaming video, and when it starts affecting the rest of the company, we have to tell people to stop doing that."

He has also used this flow analysis tool to identify problems related to valid traffic. For example, Itron uses McAfee ePolicy Orchestrator (ePO) central security management software. The software itself is not a problem, but there is a mismatch between the bandwidth at headquarters and some of the branch offices. Liberty Lake and four other offices have DS-3 connections (about 45 Mbps), but most of the other sites have only one or more T-1 lines (1.544 Mbps each), so security updates pushed from headquarters can flood those links.
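The size of that mismatch is easy to work out. The sketch below uses the nominal DS-3 and T-1 line rates. The 50 MB update size is an assumed figure chosen for illustration, not a number from Itron, and protocol overhead is ignored.

```python
DS3_MBPS = 44.736  # nominal DS-3 line rate
T1_MBPS = 1.544    # nominal T-1 line rate

def transfer_seconds(size_megabytes, link_mbps):
    """Best-case time to move a file over a link, ignoring protocol overhead."""
    megabits = size_megabytes * 8
    return megabits / link_mbps

UPDATE_MB = 50  # assumed update size, for illustration only

ratio = DS3_MBPS / T1_MBPS
t1_time = transfer_seconds(UPDATE_MB, T1_MBPS)
ds3_time = transfer_seconds(UPDATE_MB, DS3_MBPS)

print(f"A DS-3 is about {ratio:.0f}x the rate of one T-1")
print(f"{UPDATE_MB} MB over a T-1: {t1_time / 60:.1f} minutes at full utilization")
print(f"{UPDATE_MB} MB over a DS-3: {ds3_time:.0f} seconds")
```

Under these assumptions, a sender on a DS-3 can offer traffic roughly 29 times faster than a single T-1 can deliver it, so one update that takes seconds to leave headquarters occupies a branch link for several minutes.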

"Because we have a DS-3 and most other sites don't, we are able to send data a lot faster than they can receive," says Routt. "We have gotten calls saying the network is slow. We open up Scrutinizer, see the connection is pegged at 100%, we look further and the main culprit is ePO."

Unmasking ASA

For network security, Itron is standardizing on Cisco's Adaptive Security Appliances (ASA). So far the company has deployed about a dozen in locations including Australia, Brazil and France.

"Anywhere we have external Internet connections or VPN connections, we are moving to standardize with the ASA platforms," says Routt.

One problem with ASAs, however, is that they can mask network problems. Initially they provided SNMP information, which would tell when a connection was overloaded, but they did not show where all that traffic was coming from. Because the ASA appliances used Network Address Translation (NAT), packet analysis made it appear that all the traffic was coming from a single host: the firewall.
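The masking effect can be illustrated with a simplified Python model. The addresses come from documentation ranges and the helper names are hypothetical. Real NAT on a firewall also rewrites ports and tracks connection state, which this sketch leaves out.

```python
FIREWALL_PUBLIC_IP = "203.0.113.1"  # assumed address from a documentation range

# (source, destination, bytes) as seen on the inside of the firewall.
inside_flows = [
    ("10.1.1.20", "198.51.100.10", 900_000),
    ("10.1.1.31", "198.51.100.10", 50_000),
    ("10.1.1.44", "198.51.100.25", 5_000),
]

def apply_source_nat(flows, public_ip):
    """Rewrite every source address to the firewall's public address."""
    return [(public_ip, dst, nbytes) for _src, dst, nbytes in flows]

def bytes_by_source(flows):
    """Total bytes per source address."""
    totals = {}
    for src, _dst, nbytes in flows:
        totals[src] = totals.get(src, 0) + nbytes
    return totals

outside_flows = apply_source_nat(inside_flows, FIREWALL_PUBLIC_IP)

print("inside view: ", bytes_by_source(inside_flows))
print("outside view:", bytes_by_source(outside_flows))
```

The inside view shows three hosts, one of them responsible for most of the traffic. The outside view shows a single source, the firewall, which is all that analysis past the translation point can reveal.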

In early 2009, however, this changed. With the launch of the Cisco ASA 5580 products, Cisco also released a Flexible NetFlow security logging implementation called NetFlow Security Event Logging (NSEL), which is designed to scale better than syslog without sacrificing granularity. Flexible NetFlow, an extension of NetFlow v9, allows administrators to specify the fields they want to gather on the packet flows. It provides enhanced optimization, reduces costs and improves capacity planning and security detection beyond traditional flow technologies.

Using NSEL, the ASA 5580 tracks the state changes that a flow undergoes, including flow creation, denial of the flow by an ACL, and flow teardown. With NSEL, flow change events trigger the creation of a NetFlow record. The ASA can also send syslog messages, but since the NSEL records contain the same information, administrators can shut off redundant syslog messages to save resources. The flows can also be filtered so only certain types of events are reported and, using Modular Policy Framework, NSEL records detailing particular types of events or types of traffic can be sent to separate collectors.
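The idea of event-driven records routed by type can be modeled in a few lines of Python. This is a conceptual sketch only: it is not ASA configuration syntax, and the event labels, collector names and `route_events` helper are all hypothetical.

```python
from collections import defaultdict

# The three state changes named in the text; the string labels are this sketch's own.
CREATED, DENIED, TEARDOWN = "flow-created", "flow-denied", "flow-teardown"

# Hypothetical policy: which collector receives which event types.
POLICY = {
    "security-collector": {DENIED},
    "traffic-collector": {CREATED, TEARDOWN},
}

def route_events(events, policy):
    """Send each event to every collector whose policy asks for its type."""
    routed = defaultdict(list)
    for event in events:
        for collector, wanted_types in policy.items():
            if event["type"] in wanted_types:
                routed[collector].append(event)
    return dict(routed)

events = [
    {"type": CREATED, "src": "10.1.1.20", "dst": "198.51.100.10", "dport": 443},
    {"type": DENIED, "src": "10.1.1.99", "dst": "198.51.100.25", "dport": 23},
    {"type": TEARDOWN, "src": "10.1.1.20", "dst": "198.51.100.10", "dport": 443},
]

routed = route_events(events, POLICY)
for collector, batch in routed.items():
    print(collector, [e["type"] for e in batch])
```

In this model, a security team's collector sees only denied flows, while a traffic-analysis collector sees creations and teardowns.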

NSEL gives network and security administrators greater visibility into what is happening on the firewall, but only if the collector is capable of correctly reading and storing the Flexible NetFlow data. Most collectors today are only capable of reading data in a set pattern. This is not a problem with traditional NetFlow, since it uses a single template. NSEL, however, can send any of 17 different templates, depending on the type of event being reported. In addition, one may want to see other data such as MAC addresses or VLAN IDs, giving rise to even more templates.
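Why templates matter to a collector can be shown with a small decoder in the style of NetFlow v9 (RFC 3954), where a template record lists field types and lengths and data records are interpreted against it. The sketch below is deliberately reduced: it handles only a template record body and a single data record, not full export packets, and the template IDs and sample values are made up for illustration.

```python
import ipaddress
import struct

# A few field type numbers defined for NetFlow v9 in RFC 3954.
FIELD_NAMES = {1: "in_bytes", 4: "protocol", 7: "src_port",
               8: "src_ip", 11: "dst_port", 12: "dst_ip"}

templates = {}  # template ID -> list of (field type, length in bytes)

def learn_template(body):
    """Parse one template record: ID, field count, then (type, length) pairs."""
    template_id, field_count = struct.unpack_from("!HH", body, 0)
    fields = [struct.unpack_from("!HH", body, 4 + 4 * i) for i in range(field_count)]
    templates[template_id] = fields
    return template_id

def decode_record(template_id, data):
    """Decode one data record using a previously learned template."""
    fields = templates.get(template_id)
    if fields is None:
        return None  # template not seen yet, so the bytes cannot be interpreted
    record, offset = {}, 0
    for ftype, length in fields:
        raw = data[offset:offset + length]
        offset += length
        name = FIELD_NAMES.get(ftype, f"field_{ftype}")
        if ftype in (8, 12):
            record[name] = str(ipaddress.IPv4Address(raw))
        else:
            record[name] = int.from_bytes(raw, "big")
    return record

# Two templates with different layouts, as a collector might receive them.
learn_template(struct.pack("!HH" + "HH" * 3, 256, 3, 8, 4, 12, 4, 1, 4))
learn_template(struct.pack("!HH" + "HH" * 2, 257, 2, 8, 4, 7, 2))

rec_a = (ipaddress.IPv4Address("10.1.1.20").packed
         + ipaddress.IPv4Address("10.2.0.5").packed
         + (1500).to_bytes(4, "big"))
rec_b = ipaddress.IPv4Address("10.1.1.31").packed + (51234).to_bytes(2, "big")

print(decode_record(256, rec_a))
print(decode_record(257, rec_b))
print(decode_record(300, rec_a))  # unknown template
```

A collector that assumes one fixed layout would misread the second record or reject it outright. One that caches templates by ID can decode any layout the exporter announces, which is the capability NSEL depends on.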

Since Scrutinizer does support Flexible NetFlow, Routt was able to implement NSEL on his ASAs.

"Before we were blind," says Routt. "We couldn't see where connections were going or what they were doing. Now that the ASA exports NetFlow, we have enough visibility so we can track issues quickly."

Peering into the Black Holes

As the third anniversary of the acquisition approaches, the integration is still proceeding. There is an ongoing migration from an LDAP directory and Lotus Notes to Microsoft's Active Directory and Exchange Server. The original Actaris and Itron MPLS networks are tied together by a link between an Itron router and one at a managed service provider in Europe.

"Other than that, there is not a lot of network integration, so I am trying to get more visibility via SNMP and NetFlow," says Routt. "We are able to look at quite a bit of the provider network, but as far as local LANs, in most cases it is a black hole."

The visibility is improving as more devices are configured to report NetFlow data, and as IT staff at Itron and its European service providers become more adept at using flow analysis to track down problems.

Joe Zwers is a freelance writer based in Tujunga, California, specializing in business and technology.