Network Management And NNM Move To The Web

A User's View Of HP OpenView Network Node Manager 6.0.

Some useful improvements and significant administrative enhancements, especially the Web Launcher, will allow more users to efficiently access the information contained in NNM. However, there are more databases to contend with instead of fewer, which limits the scalability of the system.

HP's OpenView Network Node Manager (NNM) 6.0 has many exciting new features. Most of them make the system easier to use, with added flexibility to meet specific needs. Like most things in life, however, it's not a panacea for all possible network management ills. Therefore, it's important to balance the identified strengths and limitations in order to create an optimal managed environment.

Automated Backup is probably the most exciting new feature of NNM 6.0. Older NNM versions forced you to shut down the entire NNM suite of daemons in order to back up a database, an unacceptable procedure for many 24x7 operations centers. Now, NNM "freezes" the state of the network so that the various databases are in synchronization when they are copied.

This is done with the new ovpause command. The OpenView database directory is copied to a temporary directory specified by the administrator. Once the copy is complete, the ovresume command is executed and the "frozen" daemons function again. The temporary directory can then be copied to tape at the administrator's discretion.
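The pause, copy, resume sequence described above can be sketched in a few lines of Python. This is only an illustration of the ordering, not HP's backup script: it assumes ovpause and ovresume are on the PATH and can be run without arguments, and the function name, its parameters and the injectable `run` hook are hypothetical.

```python
import shutil
import subprocess


def backup_databases(db_dir, tmp_dir, run=subprocess.run):
    """Pause NNM, copy the database directory, then resume NNM."""
    # Assumption: ovpause and ovresume need no arguments for this purpose.
    run(["ovpause"], check=True)
    try:
        # Copy while the daemons are frozen so the databases stay in sync.
        shutil.copytree(db_dir, tmp_dir, dirs_exist_ok=True)
    finally:
        # Always unfreeze the daemons, even if the copy fails.
        run(["ovresume"], check=True)
```

Putting the resume step in a `finally` clause reflects the point of the feature: the freeze should last only as long as the copy, whatever happens to the copy itself.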

Not surprisingly, during the backup procedure, data cannot be written. In fact, the map and alarm displays are frozen, so no panning or event acknowledgements can be done. While this is limiting, the overall down time of the system drops from the 20 to 60 minutes typical of existing setups to two to three minutes. I think most network managers will accept the tradeoff.

NNM has several extremely good administrative enhancements. To date, the UNIX and Windows NT versions have had some significant differences in functionality, but not any longer. For example, the UNIX version contains command line options for map status propagation and map copy/delete. These features are now in the NT version as well.

Even more noticeable is the similarity between the pull-down menus. In NNM 5.01, the NT pull-down menus had a distinctly Microsoft Windows style. Now, in both the UNIX and NT versions, Windows pull-downs are the rule.

Unfortunately, however, a number of features continue to distinguish the UNIX version from its NT counterpart. IPX discovery will remain in the NT version, as will submap help, cascading submap display, and DMI and SMS integration. The NT Tools menu will remain NT only. But most of these functions are clearly in the PC/NT category anyway.

Remote console configurations are now interchangeable. That's good news because now either an NT or a UNIX GUI can be attached to either a UNIX or NT server. The biggest plus: an NT console can be attached to a UNIX server. This has the potential of greatly reducing the cost of network operations center equipment, a possibility that should score big with corporate IT departments.

Collection Selections
NNM 6.0 has the ability to mix and match NT and UNIX Collection Stations (CS) and Management Stations (MS). Previously, you could attach an NT or UNIX CS to a UNIX MS, but an NT machine could be used only as a standalone MS or as a CS. While you now have the ability to attach a UNIX CS to an NT MS, this probably won't happen much.

However, NT-centric shops can now apply CS-MS configurations with strictly NT boxes. There's still the question, however, of whether NT has the horsepower yet to work effectively in the large environments that typically need a Distributed Management (DIM) architecture.

Another nice new feature is the map import/export function. Internet-level maps can be saved to an ASCII file and imported from ASCII files. This is good for making many map layout changes while keeping a backup. This feature works very well; however, it works for the Internet level only and does not save user plane symbols. While I don't believe it's supported, I was able to export a map from a UNIX system and import that map into an NT system (and vice versa).

The categories of NNM filters keep growing. In the previous version, you had four different kinds of filters to apply for managing your environment. Now there are seven:

Map. This filter selectively limits what is presented to the user in a map. It's an automated way of allowing a particular user to see only the devices that are their responsibility.

Persistence. This filter forces some submaps that would normally be transient to reside in memory. It's used mainly for compatibility with third-party applications that assume needed objects are in memory. In practice, this is rarely used, since submaps that are custom drawn (which is most of the time) permanently reside in memory anyway.

Topology. Topology filters are used in a DIM environment to describe what objects from a collection station will be forwarded to a management station. While the MS-CS architecture looks logically like a master-slave relationship, in reality management stations and collection stations are autonomous peer systems. The topology filter is a collection station's way of telling a management station what it has permission to see.

Discovery. Discovery filters help minimize the number of objects in the database by admitting only objects that pass certain criteria. The problem with a discovery filter is that an object has to be demand polled before it's checked against discovery logic. This means that netmon will constantly demand poll the same objects in the network before discarding them.

A much better option is to use the noDiscover filter, the Discovery filter's complement. The noDiscover filter is checked before the demand poll is done. If the IP address to be polled is in the noDiscover file, the object is not demand polled. This is much more efficient. However, because NNM has no knowledge of what kind of device an object is before the object is polled, noDiscover filters can only filter by IP address.
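The difference in ordering between the two filters is easier to see in a small sketch. The Python below is not netmon's logic or the noDiscover file syntax; the exclusion list uses CIDR networks purely for illustration, and `demand_poll` and `passes_discovery_filter` are hypothetical callables standing in for the expensive poll and the discovery criteria.

```python
import ipaddress


def discover(candidates, excluded_networks, demand_poll, passes_discovery_filter):
    """Return the addresses admitted to the database."""
    excluded = [ipaddress.ip_network(net) for net in excluded_networks]
    admitted = []
    for ip in candidates:
        address = ipaddress.ip_address(ip)
        # noDiscover-style check: address only, done BEFORE any polling.
        if any(address in net for net in excluded):
            continue
        # Discovery-filter-style check: needs polled attributes first.
        attributes = demand_poll(ip)
        if passes_discovery_filter(attributes):
            admitted.append(ip)
    return admitted
```

An address rejected by the first check never costs a poll, while an address rejected by the second check is polled every time it is rediscovered.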

Failover. The first of three new filters in NNM 6.0, the failover filter is designed to tell a management station what objects from a collection station the management station should take over polling. This is a very big improvement over previous NNM environments. Many configurations implement failover with a collection of complicated scripts that make command line changes to netmon's polling attributes. This is now greatly simplified. Keep in mind, however, that database sizes will increase substantially when implementing any failover strategy.

DHCP. More and more enterprises are using Dynamic Host Configuration Protocol (DHCP) to dynamically allocate IP addresses for PCs. This makes address management easier, but the constantly changing IP addresses cause complications for managing objects in NNM. This filter allows you to identify the range of addresses that make up your DHCP address space. Once NNM knows this, you can choose several different ways to manage these addresses. This can all be done via a Java GUI.

Important Nodes. The Important Nodes filter is used in conjunction with Event Correlation Services (ECS) to allow netmon to override downstream event suppression for critical nodes in the network. Usually, if a router is unreachable, event suppression will tell netmon not to bother polling any devices downstream from that router because they will be unreachable.

However, there may be critical devices you still want to identify beyond the root problem; or you may have the ability to dial back into a far side device via ISDN. The Important Nodes filter gives you the ability to identify these critical devices and have netmon continue polling them.

Filters provide the ability to automate much of the network discovery and maintenance processes. The good thing about these filters is that they are ASCII files that are easy to read and edit.
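The Important Nodes override amounts to one exception to the suppression rule. A minimal Python sketch, with hypothetical node names and a hypothetical function name, and with no claim to mirror how netmon stores its polling list:

```python
def nodes_to_poll(all_nodes, suppressed, important):
    """Keep polling a node unless it is suppressed and not marked important."""
    return [
        node for node in all_nodes
        if node not in suppressed or node in important
    ]


# Hypothetical example: everything behind rtr1 is suppressed,
# but db-server is listed as an important node.
polling_list = nodes_to_poll(
    ["rtr1", "sw2", "db-server", "pc9"],
    suppressed={"sw2", "db-server", "pc9"},
    important={"db-server"},
)
```

Here `polling_list` keeps rtr1 and db-server and drops the other two downstream devices.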

Jump On The Web
The Web Launcher is the jumping-off point for all Web-based users of the system. Not only does the launcher provide the start point for Network Node Manager, but also for any other applications that are integrated into OpenView. Web Launcher Registration Files (WLRFs) allow new third-party applications to integrate into OpenView, and this is where you'll see such applications appear in the future.

This is so straightforward to use that it will make it very easy for future applications to integrate quickly. The main drawback to the Web Launcher is that it gives you a false sense of administrative security. Users can be registered to go through an access screen to get onto the system; however, if they figure out the absolute path to the application, this "front door" security is completely bypassed. Security concerns aside, the Web Launcher was a wise move by HP because users are demanding the ease of use and universal accessibility that a Web interface provides.

The alarm browser also has some key enhancements, but the most important is that alarm acknowledgements and deletes are applied throughout the entire environment. This is in contrast to the previous browser, where every user had their own copy of the alarms. This synchronization between event displays will increase staff coordination.

Draw Me A Map
The Web has significantly influenced the Network Presenter map. It's now completely Java-based. Previously, most of the pull-down menu items from the map were unavailable. Now, almost all of these items are available in Network Presenter. However, the map continues to be read-only and no administrative actions can be taken through the Java interface (except ECS configuration, which can only be done through Java). The trend here is that as new features and functions are added, they will be added as Java-capable applications. NNM reflects the fact that the future of management is Web-enabled.

The SNMP Data Presenter provides a method of viewing SNMP data via the Web. The application builder, which is a part of the Data Presenter, is an easy way to build custom SNMP queries that can be executed via the Web Launcher.

I found this tool very easy to use. It's driven by a GUI dialog box. Previously, any SNMP collections needed to be manually added via application registration files (ARFs). It's important to note that the SNMP Data Presenter is not a replacement for more all-encompassing performance management tools, such as NetMetrix.

Like the Data Presenter, the SNMP MIB browser is a Java-based application that can be launched via the Web Launcher, but the browser can also be launched from the pull-down menus. It gives you the ability to traverse MIB trees for compiled MIBs and then perform an SNMP get against a particular host that supports the given MIB.

Correlation's purpose is to minimize the number of events that come from external entities, so that the system user sees the information that focuses on the fundamental issue that needs to be addressed. To date, NNM has had filtering, invoked via trapd.conf, that allowed you to not view (or log) particular events. However, there was no way to associate related events, or to count duplicate events that came in over time. OpenView's Event Correlation Services (ECS) adds this functionality to NNM. (See Circuits Maximus sidebar below.)

The Big Event
First, all events now go through the event correlation system. This happens before both logging the events and displaying them to the user. Second, trapd.log has been replaced by a binary event store. This is yet another database created as part of the core system. It remains to be seen how this will impact performance, but I believe it may cause compatibility problems with some third-party applications.

Many people who customize NNM will tail the log file to parse out specific messages or to do their own homegrown correlation. With trapd.log disappearing, these applications and scripts may no longer function properly, if at all. There is a way to keep trapd.log, but it has to be specified during initial configuration or upgrade, since the HP default is to eliminate it. Those who do a lot of customization will need to pay close attention.
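For readers who haven't seen one, the kind of homegrown script at risk here usually looks something like the Python sketch below. It makes no assumption about the trapd.log line format; it simply returns newly appended lines containing a keyword, and the function name is hypothetical.

```python
def matching_new_lines(handle, keyword):
    """Return lines appended since the last call that contain keyword."""
    matches = []
    while True:
        line = handle.readline()
        if not line:
            # Reached the current end of the file; stop until the next call.
            break
        if keyword in line:
            matches.append(line.rstrip("\n"))
    return matches
```

A script would open the log once and call this function on a timer. If the text log is not kept, there is nothing for such a loop to read, which is why the configuration choice matters.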

Last, but not least, in an effort to pave the way for more reporting, NNM 6.0 contains a new relational, ODBC-compliant database, which allows for greater flexibility in extracting data to various applications for reporting. This new data warehouse capability is read-only, extracting information from NNM's flat file databases: ovwdb, ipmap and the topology database.

Consequently, the topology database will have to be back-ported to a flat file database if a user currently has it configured as relational. The procedure is documented. The key to remember is that a re-migration plan will need to be put in place if you are using a relational database for your previous version of NNM.

--Jerald Murphy is director, Network Management with RPM Consulting (Columbia, Md.) and the Network Management Track Chair for the OpenView Forum.

CIRCUITS MAXIMUS

ECS out-of-the-box provides the user with four circuits to help correlate network events. A circuit is a set of logic that reads information from the environment and takes some defined action depending on the information received. Those circuits are Connector Down, Scheduled Maintenance, Repeated Event, and PairWise Correlation. Each of them helps reduce unneeded data and identify the root cause.

Connector Down. This is the main circuit for downstream event suppression. When netmon attempts to poll an unreachable device, NNM generates a critical "Node Down" event. The problem is, every device downstream from the first unreachable network element will also be unreachable. Previously, netmon would continue polling all downstream elements, causing excessive polling traffic and showing numerous devices on the map and in the alarm display as down.

This makes it difficult for operators to determine the real point of failure. The connector down circuit tells netmon to not bother polling those devices on the far side of the unreachable device. Additionally, it doesn't show alarms from those downstream devices and colors them on the map in blue for "unknown," which is really the case.
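The effect of Connector Down on the map can be sketched as a walk over the topology. The Python below is an illustration under simple assumptions, not the circuit's actual logic: the topology is a hypothetical adjacency dictionary, and a single failed connector is already known.

```python
from collections import deque


def classify_nodes(links, station, failed_connector):
    """Label each node 'up', 'down' or 'unknown' as seen from the station."""
    # Start pessimistic: anything we cannot reach stays "unknown" (blue).
    status = {node: "unknown" for node in links}
    status[failed_connector] = "down"
    queue = deque([station])
    seen = {station}
    while queue:
        node = queue.popleft()
        status[node] = "up"
        for neighbor in links[node]:
            # Never walk through the failed connector.
            if neighbor not in seen and neighbor != failed_connector:
                seen.add(neighbor)
                queue.append(neighbor)
    return status
```

Only the failed connector is reported as down; everything that can be reached solely through it is left unknown and need not be polled.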

Scheduled Maintenance. The scheduled maintenance circuit allows users to tell NNM when routine maintenance is being performed on given devices, so that any alarms that would normally have come in from netmon or from traps will not get displayed. This is handy, because there's no reason to take action on an unavailable device if we know ahead of time about its unavailability.
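In outline, the scheduled maintenance check is a time-window test per device. A minimal Python sketch follows; the tuple layouts, node names and dates are hypothetical, and the real circuit is configured through the ECS GUI rather than in code.

```python
from datetime import datetime


def in_maintenance(node, when, windows):
    """True if `when` falls inside a maintenance window declared for `node`."""
    return any(
        name == node and start <= when < end
        for name, start, end in windows
    )


def alarms_to_display(alarms, windows):
    """Drop alarms raised by a node during its own maintenance window."""
    return [
        (node, when, text) for node, when, text in alarms
        if not in_maintenance(node, when, windows)
    ]


# Hypothetical window: rtr1 is under maintenance from 02:00 to 04:00.
windows = [("rtr1", datetime(1999, 3, 1, 2, 0), datetime(1999, 3, 1, 4, 0))]
alarms = [
    ("rtr1", datetime(1999, 3, 1, 2, 30), "Node Down"),
    ("rtr2", datetime(1999, 3, 1, 2, 30), "Node Down"),
    ("rtr1", datetime(1999, 3, 1, 5, 0), "Node Down"),
]
shown = alarms_to_display(alarms, windows)
```

Only the rtr1 alarm at 02:30 is hidden; the same device going down after the window closes is still shown.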

Repeated Event. This circuit prevents the same alarm from being repeated numerous times on the alarm display. Often, when a device sends a trap, it's sent repeatedly until some action is taken. The Repeated Event circuit doesn't display these duplicate events, helping to reduce screen clutter.
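The idea behind Repeated Event can be shown with a simple collapse of duplicates. This Python sketch assumes an event is identified by its source and name and ignores any time limit the real circuit may apply; the names are hypothetical.

```python
def collapse_repeats(events):
    """Map each distinct (source, name) event to how many times it arrived."""
    counts = {}
    for source, name in events:
        key = (source, name)
        counts[key] = counts.get(key, 0) + 1
    # Dictionaries keep insertion order, so first-seen order is preserved.
    return counts


# Hypothetical stream: one device repeating the same trap.
display = collapse_repeats([
    ("sw2", "fanFailure"),
    ("sw2", "fanFailure"),
    ("rtr1", "linkDown"),
    ("sw2", "fanFailure"),
])
```

The display would then carry one line per distinct event, and the count is available if the operator wants to know how noisy the device has been.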

PairWise Correlation. This circuit allows the system to pair two events together and take some action. It is often the case that a "linkDown" alarm will be followed shortly by a "linkUp" alarm. This can happen when circuits are in transient states. These frequently have no perceived impact on the network user, but each state transition gets recorded in the NNM event display. PairWise Correlation allows you to tie the "linkUp" with a previous "linkDown" (from the same device!) and suppress showing any alarms.
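A rough Python sketch of the pairing logic follows. It works over an already collected, time-ordered list rather than a live stream, and the time window parameter is an assumption of this example, as are the function name and the tuple layout.

```python
def suppress_pairs(events, window_seconds):
    """Remove linkDown/linkUp pairs from the same device within the window.

    events: list of (timestamp, device, name) tuples sorted by timestamp.
    """
    keep = [True] * len(events)
    pending_down = {}  # device -> index of its latest unmatched linkDown
    for index, (timestamp, device, name) in enumerate(events):
        if name == "linkDown":
            pending_down[device] = index
        elif name == "linkUp":
            down_index = pending_down.pop(device, None)
            if down_index is not None:
                down_time = events[down_index][0]
                if timestamp - down_time <= window_seconds:
                    # Transient flap: hide both halves of the pair.
                    keep[down_index] = False
                    keep[index] = False
    return [event for event, kept in zip(events, keep) if kept]
```

A link that stays down longer than the window keeps both of its alarms, so a genuine outage is still visible to the operator.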

Parameters can be set for each of these circuits (affected IP addresses, enabling or disabling circuits, etc.) and all of the circuits can be configured with the ECS configuration GUI. However, you can only change parameter values, not the logic of the circuits. In order to change circuit logic, or to create your own circuits, you need to purchase the ECS Designer. The good news here is that any circuit made with the designer can be loaded into any existing NNM 6.0 system without additional license issues. This is particularly good news for consultants, who often have to build similar types of logic for different clients.

--J.M.
