The Net Net: A Peek Into History
Enterprise network reporting is a historical performance review of your IT enterprise network. It should cover network performance, system performance and key application performance; with all three in place you have true service-level reporting. Most IT managers would agree that an enterprise reporting solution should combine automated data collection, analysis and report generation in one simple-to-use package. A number of products meet those requirements, such as HP OpenView NetMetrix, Concord Network Health and INS Enterprise Pro.
All these products are worthwhile and can be very valuable, but in some cases they are overkill. They consider themselves the "master" or "authority" on your network: they assume nobody else is monitoring it and poll everything. They are also notorious for short polling intervals and large volumes of data. Simply put, they generate a lot of network management overhead traffic. As a network manager a few years ago, I was always puzzled about why we needed another system polling my network. I already had a management system, HP OpenView, banging away on my network, polling and collecting SNMP data. Why can't my reporting system use what I already have and augment it if needed?
The exception is an organization that needs to collect RMON performance data from network devices or network probes. For long-term trending and analysis, an enterprise reporting system as described above is a necessity. For everybody else, mainly those responsible for fault management, these so-called enterprise reporting packages can be quite expensive to buy and implement. So where does that leave us?
There are a number of ways to implement a reporting system that draws its information from your fault management system. If you are using HP OpenView Network Node Manager, one of the easiest ways to do reporting is through the trapd.log file.
It's a little more complex with version 6, since the trapd.log file isn't created by default, but once you turn it on it works great. The idea is that if your "raw" events are in the log file, all you need to do is find or develop a program that extracts the relevant fault data, compiles it and then reports on it. We have developed a number of these kinds of programs over the last few years, some more sophisticated than others.
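To make the "extract, compile, report" idea concrete, here is a minimal sketch of the counting step. It is not one of the programs mentioned above; it is an illustrative Python outline that assumes each non-blank line of trapd.log holds one event and that an event's type can be recognized from a keyword in the line. The category names follow the report in this article, but the keyword patterns are hypothetical and would need to match the wording of the events in your own log.

```python
import re
from collections import Counter

# Hypothetical keyword patterns: the actual wording of each event message
# depends on your event configuration, so adjust these to match your log.
CATEGORY_PATTERNS = [
    ("Authentication failure events", re.compile(r"authentication failure", re.I)),
    ("Duplicate IP address events", re.compile(r"duplicate ip address", re.I)),
    ("Interface events", re.compile(r"\binterface\b", re.I)),
]


def classify(line):
    """Return the first category whose pattern matches, else 'Other events'."""
    for name, pattern in CATEGORY_PATTERNS:
        if pattern.search(line):
            return name
    return "Other events"


def count_events(lines):
    """Count events per category, assuming one event per non-blank line."""
    counts = Counter()
    for line in lines:
        if line.strip():
            counts[classify(line)] += 1
    return counts
```

Because `count_events` accepts any iterable of strings, it can be handed an open file object directly, and the resulting `Counter` can then be printed or passed on to whatever formats the report.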
Here is an example of a text report showing the number and types of events from trapd.log for a one-day period:
Total number of events: 9983
Total number of events not in Event Browser: 9573
Event counts for events not sent to Event Browser:
Authentication failure events: 2252
Duplicate IP address events: 80
Other events: 7241
Total number of events in Event Browser: 410
Average events per hour in Event Browser: 17
Event counts for events sent to Event Browser:
Interface events: 198
ISDN events: 25
Frame relay events: 16
Router down events: 2
Router unreachable events: 2
Duplicate IP address events: 5
IP.Check events: 0
Firewall failover events: 0
Messaging events: 1
Node unmanaged events: 0
Node added events: 4
Node deleted events: 2
Other events: 28
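The headline figures in a report like this are simple arithmetic on the counts. The Python sketch below reproduces them from the numbers shown above; the function name `summarize` and its parameters are illustrative, and rounding the hourly average to the nearest whole event is an assumption about how the 17 was obtained.

```python
def summarize(suppressed_counts, browser_total, hours=24):
    """Return the headline figures for one reporting period."""
    suppressed_total = sum(suppressed_counts.values())
    return {
        "Total number of events": suppressed_total + browser_total,
        "Total number of events not in Event Browser": suppressed_total,
        "Total number of events in Event Browser": browser_total,
        # Assumption: the average is rounded to the nearest whole event.
        "Average events per hour in Event Browser": round(browser_total / hours),
    }


# Counts for events not sent to the Event Browser, taken from the report above.
suppressed = {
    "Authentication failure events": 2252,
    "Duplicate IP address events": 80,
    "Other events": 7241,
}

for label, value in summarize(suppressed, browser_total=410).items():
    print(f"{label}: {value}")
# Total number of events: 9983
# Total number of events not in Event Browser: 9573
# Total number of events in Event Browser: 410
# Average events per hour in Event Browser: 17
```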
This kind of information is great, and most of you already have the raw data to produce these kinds of reports. The next two reports use some of the same data but present it graphically.
All of these reports and graphs were written in Perl using information in the trapd.log file from HP OpenView Network Node Manager. Most of the time these reports are e-mailed daily to key administrators. The final step is to archive and compress all old trapd.log files.
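The archive step can be as small as the following Python sketch. The function name, the archive directory and the date-stamp naming scheme are all hypothetical choices for illustration, not a description of our scripts; it copies an old log into a gzip file and leaves the original untouched, so removing it is a separate decision.

```python
import gzip
import shutil
from pathlib import Path


def archive_log(log_path, archive_dir, stamp):
    """Gzip log_path into archive_dir as <name>.<stamp>.gz; return the new path."""
    log_path = Path(log_path)
    archive_dir = Path(archive_dir)
    # Create the archive directory on first use.
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{log_path.name}.{stamp}.gz"
    # Stream the copy so large logs are never read into memory at once.
    with open(log_path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

Calling it once a day with the date as `stamp` yields one compressed file per day, which keeps later lookups by date straightforward.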
The archive gives you the data to do "historical" reporting. Once the data is archived, you can put together a simple Web page to access the historical reports. All the information is there. Give it a try and let me know how it turns out.
- Charles Hebert is President of Southernview Technologies, Inc.