Proactive IT Management
PG&E Keeps Network's Power Flowing
The safe, dependable delivery of electric power -- nowhere is the reliability of mission-critical IT applications more important. Unprocessed work orders mean energy restorations to hospitals may be delayed. Just as important are call center operations, back office, accounting and customer service applications.
All network managers know there's no such thing as 100 percent up-time. That's why Joe Soldevila, automation project manager at Pacific Gas and Electric Company (PG&E; San Francisco, Calif.) makes proactive IT management his top priority. To achieve high network and application availability, Soldevila and a team of automation specialists are tightly integrating Boole & Babbage's MAX/Enterprise event management console with connections to HP OpenView's network management framework, Boole & Babbage's MainView OS/390 performance and automation products, SYMON Visual Communications System and a number of other existing management tools.
Acting as the glue between them, MAX/Enterprise takes network event information from these disparate management applications, correlates alarms and in some cases inspects problems on a deeper level. Then it sends detailed alarm information to the appropriate technicians, help desk personnel or clients through phones, pagers, SYMON boards, or a customized MAX/Enterprise client workstation view. "We've engineered a solution with MAX/E that allows us to access all of our alarm systems and automation products easily through a standardized mechanism," says Soldevila.
In this unique configuration, the Boole & Babbage event management console not only acts as a liaison between HP OpenView and other network management and automation tools, it also automates help tickets in Remedy's Action Request System (ARS). Most importantly, it alerts help desk and support staff about major problems before users light up switchboards with pleas for help. "We have taken systems that spew thousands of messages a day and consolidated them into a single repository where each message is evaluated, prioritized and processed according to its impact on our business," he adds.
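The consolidate, prioritize and route flow Soldevila describes can be illustrated with a small sketch. This is not MAX/Enterprise's rule language or PG&E's configuration, neither of which the article shows; the severity levels, source labels, channel names and the `consolidate` and `route` functions are all hypothetical, chosen only to show the general idea of collapsing duplicate alarms and choosing notification channels by business impact.

```python
from dataclasses import dataclass

# Hypothetical severity levels; the article does not describe PG&E's actual scheme.
CRITICAL, MAJOR, MINOR = 3, 2, 1


@dataclass(frozen=True)
class Event:
    source: str    # management tool that raised the alarm (labels are assumed)
    resource: str  # device or application the alarm refers to
    message: str
    severity: int


def consolidate(events):
    """Keep only the most severe alarm per resource, most severe first."""
    worst = {}
    for ev in events:
        current = worst.get(ev.resource)
        if current is None or ev.severity > current.severity:
            worst[ev.resource] = ev
    return sorted(worst.values(), key=lambda e: -e.severity)


def route(event):
    """Choose notification channels by impact (an assumed mapping)."""
    if event.severity >= CRITICAL:
        return ["symon_board", "pager", "trouble_ticket"]
    if event.severity == MAJOR:
        return ["pager", "trouble_ticket"]
    return ["console"]


# Illustrative input only; these are not real PG&E alarms.
raw = [
    Event("openview", "router-7", "link down", CRITICAL),
    Event("openview", "router-7", "interface flapping", MINOR),
    Event("mainview", "billing-app", "queue depth high", MAJOR),
    Event("mainview", "print-spool", "job delayed", MINOR),
]

for ev in consolidate(raw):
    print(ev.resource, ev.message, route(ev))
```

In this sketch, four raw alarms become three consolidated events, and only the critical one reaches the wall board, which mirrors the article's point that each message is handled according to its impact rather than forwarded wholesale.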
Big-ticket events are sent directly to four-foot SYMON boards hanging on highly visible walls in several help desk offices and key business units at ten critical remote locations. The SYMON boards scroll event information in red letters when something significant occurs that affects the client. This way, support staff is ready with critical information -- cause of problem, time-to-repair and exactly who will be impacted -- before users even know an event has occurred. In many cases, this early warning system gives ample reaction time to divert traffic or call end users to inform them of the problem.
When Soldevila began searching for an event management console two years ago, he focused on several key business goals. First, he needed to consolidate a number of disparate systems and have the flexibility to deal with each alarm differently depending on its impact to the organization. The solution also needed to give operations' minimal staff the ability to empower end users, automate trouble tickets and be powerful and open enough to integrate with other solutions like HP OpenView. This was necessary to manage a variety of networking devices, integrate into other network and systems management frameworks, significantly raise the level of service and view problems in broader terms of application availability.
For example, mainframe monitoring, which previously sent out volumes of messages to operators, now requires little human intervention. When it does, it combines MAX/Enterprise with Boole's PhonePoint product to page or call the appropriate person from PG&E's technical support staff, most of whom work remotely.
Most importantly, says Soldevila, the service level has improved significantly. "We now provide customers with the level of availability and outage notification they require to do their jobs more effectively. With this infrastructure in place, we have the ability to -- and plan on -- migrating in other critical applications and any new systems, applications and network components that require this level of availability and automated notification."
Network specialist Kathy Purvis says her department appreciates the clearer, earlier notification. "The system alerted us to a problem about to impact a large portion of our users. MAX/E pointed us to a repairman working on the microwave network." With the help of a Boole professional services consultant, Purvis took a half-day to write a custom rule-set that correlates SNMP events through HP OpenView.
The automation project has moved forward in stages, with each phase bringing more sophistication and efficiency to PG&E's mainframe operations, network management and help desk departments. The next step is to take the alarm information and the automation tools and deliver them directly to those who need them most -- end users, remote support and remote management personnel.
Some of this has already been accomplished with SYMON boards placed in key business units. Soldevila hopes to expand PhonePoint's dial-in self help to give users the ability to perform some common help desk requests, like resetting passwords and IDs, checking system status and triggering automated events over the phone.
Before the integration, PG&E's help desks ran like so many others in large corporations. During system, application and network outages, they operated in crisis mode. Help desk staff was flooded with calls for assistance, but they had no way of pinpointing problems or estimating fix times. Often, they were so overworked they couldn't even return calls to users to let them know when the system would be up again.
High And Dry
"[Automating helps] to keep costs down and our critical systems up. It keeps [us] from growing our help desk staff every time our responsibility increases. So far, MAX/E has integrated smoothly into any environment we've wanted it to," says Soldevila. The implementation of MAX/Enterprise has helped PG&E accomplish all of their IT goals including user "self-help" applications. Now, working from one central office in Fairfield, second-tier support/help desk staff is no longer flooded with calls. "If we cannot provide a level of service adequate to meet our clients' needs, they can go elsewhere and unfortunately, some of them did." Efficiency is up and human intervention is down.
Soldevila's implementation, along with several other organizational and process changes, has brought some order to the help desk. No small task, considering PG&E consists of 20,700 employees working in five business units. "We are freeing up resources from daily fire fighting. Service levels are being met and client expectations are getting higher. Our systems are driving more transactions than ever before and the availability and performance are at all-time highs. Most importantly, users and senior-level management are confident in our ability to get the job done."
Company: PG&E Corporation (San Francisco, Calif.), serving nearly 14 million California energy consumers.
Size: $30 billion in assets and five business units.
Business Need: To integrate a number of disparate operations/network management applications to work together and look and feel like one while reducing costs and improving critical application availability with a minimal help desk and technical support staff.
IT Solution: Boole & Babbage MAX/Enterprise event management console integrated with HP OpenView, Bay Networks Optivity, Remedy ARS, SYMON Visual Communications System, Boole & Babbage MainView and PhonePoint, EPAGE, and Exchange.