
May 5, 2006
Vol.28 Issue 18 Page(s) 1 in print issue
Monitoring The Data Center
Tools & Tips For Effectively Keeping An Eye On Things
In a perfect world, servers and applications would be deployed in a data center and simply work as intended, flawlessly, until they were no longer needed. The reality, however, is that servers and applications frequently have issues of one kind or another that affect their ability to function. It is the job of the data center or server administrator to keep a watchful eye on things and make sure that the systems are operational. For smaller companies with only a handful of servers, an administrator may be able to check the status manually from time to time. But for larger companies or data centers where there are hundreds of servers and applications to monitor at any given time, the process of monitoring and alerting has to be automated in some way.

Andre Muscat, product manager for GFI, says, "A crucial aspect to protecting a data center is ensuring that all required services are running and performing as expected. . . Most monitoring tools limit their monitoring to a high level. High-level monitoring usually tests just that a server replies to a connection request. In some cases this is not enough because although the server is running, it is possible that subcomponents of it might be malfunctioning. This is why it is essential that a test replicates actual operations of the server to ensure that every stage of the server is running as desired."
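The difference Muscat describes between a high-level check and a test that replicates an actual operation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how GFI Network Server Monitor or any other product works: the function names `port_is_open` and `page_is_served` are hypothetical, and the monitored service is assumed to be a plain HTTP web server.

```python
import http.client
import socket


def port_is_open(host, port, timeout=3.0):
    """High-level check: does the server accept a TCP connection at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def page_is_served(host, port, path, expected_text, timeout=5.0):
    """Functional check: request a real page and verify what comes back.

    Passes only if the server answers with HTTP status 200 and the
    response body contains expected_text (a hypothetical marker string
    that the administrator knows should appear on the page).
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        body = response.read().decode("utf-8", errors="replace")
        return response.status == 200 and expected_text in body
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

A web server whose application layer has failed may still accept connections, so `port_is_open` would report it as healthy while `page_is_served` would not. The same idea carries over to other services, for example sending a test message through a mail server instead of only connecting to its port.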
What Should You Monitor?

If you monitored every process, function, and event that occurred in a data center, you could quickly be overwhelmed with too much information and too many alerts. Effective monitoring requires planning to identify the processes or functions that need to be monitored. According to Ben Rothke, author of "Computer Security: 20 Things Every Employee Should Know," "If you don't care how many errors your print server generates, don't waste time logging such errors. But [effective monitoring] requires that an organization truly understands their infrastructure. They need to know how each component operates and which messages it generates."

Rothke continues, "The reason Wal-Mart is so successful with managing their inventories is that they know what they want, where it should be, and when it should be there. By applying the same level of diligence to logging and monitoring, organizations can create a logging and monitoring infrastructure that can add a lot of value."
Automatic Monitoring

There are plenty of tools available to help administrators keep an eye on their networks. Tools such as GFI Network Server Monitor (www.gfi.com) or managed services such as DataVelocity Erudition (www.data-velocity.com) are designed to monitor the availability and functionality of network systems and alert the necessary personnel when there is a problem.

GFI Network Server Monitor monitors all machines in the data center to ensure critical servers and services are running. It performs various checks, mimicking actual administrator procedures to ensure services are really running rather than just seeming to be functional. When a service is discovered to have stopped running, corrective measures can be attempted and alerts sent to technicians or administrators via email, pager, or SMS.

Erudition from DataVelocity is a managed monitoring service. Erudition communicates with systems on your network via a site-to-site VPN to ensure that all systems are functioning properly. As with the GFI product, there is no need to install any sort of agent software. If issues are identified, alerts are generated and can be delivered via email, pager, pop-up notification windows, and similar channels. For additional product coverage, please see the Network Monitoring Tools & Services chart.

Of course, there is no substitute for quality employees. Rothke says, "Hands down, the best tool is a talented staff. Far too many organizations expect software and hardware to do everything, but the success of anything is ultimately dependent on the quality of the staff using that hardware or software. Really good staff can make do with mediocre tools. Lousy staff can take great tools and create meaningless reports for clueless management."
Avoid False Alarms

It is important to watch over the systems in the data center and make sure that everything is functioning properly. One of the most important keys to an effective monitoring system is to make sure the proper events and processes are being watched and to minimize or eliminate false positives. False alarms can waste precious IT department resources as technicians are dispatched on wild goose chases trying to find problems that don't really exist. Not only is this a waste of time and effort, but the resources dedicated to hunting false positives also cut into the resources available to handle legitimate concerns.

But Dr. Anton Chuvakin, director of product management at LogLogic, says, "False positives are much less of an issue than, say, three years ago. In addition, they only apply to NIDS [Network Intrusion Detection Systems] and NIPS [Network Intrusion Prevention Systems], and those represent a small portion of the entire monitoring spectrum. They still matter, but much less than three to five years ago. NIPS/NIDS vendors also got a lot better and have different technologies built in their tools to deal with them."

According to Adam Warshaw of DataVelocity, "We are able to rule out false positives by providing the ability to represent the environment, i.e., multiple systems, schedules, criteria persisting for a minimum duration, etc."

by Tony Bradley
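One of the criteria Warshaw mentions, a condition persisting for a minimum duration, can be illustrated with a short Python sketch. It is not DataVelocity's implementation. The class name `PersistentAlert` and its parameters are hypothetical, and it assumes the monitoring system feeds it the pass/fail result of each periodic check.

```python
import time


class PersistentAlert:
    """Fire an alert only after a check has failed continuously
    for at least min_duration seconds."""

    def __init__(self, min_duration, clock=time.monotonic):
        self.min_duration = min_duration
        self.clock = clock          # injectable so the logic can be tested
        self.failing_since = None   # time of the first failure in the current run

    def update(self, check_passed):
        """Record one check result; return True if an alert should fire."""
        if check_passed:
            # Any successful check ends the current run of failures.
            self.failing_since = None
            return False
        now = self.clock()
        if self.failing_since is None:
            self.failing_since = now
        return now - self.failing_since >= self.min_duration
```

With `min_duration=60`, a single failed check followed by a successful one never raises an alert, while a service that keeps failing for a full minute does. The trade-off is that real outages are reported up to `min_duration` seconds later.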
View the chart that accompanies this article.