Holistic Infrastructure Monitoring and Management
Feb. 20, 2002 12:00 AM
WebLogic Server, like most applications, provides robust and detailed monitoring tools bundled with the basic application. The embedded monitoring and management provided by the WebLogic Console is extremely useful when diagnosing and repairing a problem once it has been isolated in the WebLogic Server. But this embedded point solution is of limited use in most real-world situations where the application server is just a single component in a system of components that are all vital to providing the end-user application. When trying to quickly diagnose a general problem with the end-user application, it is much more powerful and effective to view data from each component holistically as a part of the system rather than evaluating each component by itself. This holistic view becomes even more powerful when correlated with end-user application and business transaction performance. If business priorities and processes can be included in this view of the infrastructure, operations management becomes more valuable to the enterprise.
Point Solutions
The command-line utilities available on operating systems in the Unix family are typical point solutions. Point solutions provide power and flexibility to the experienced administrator. At the application level, vendors package tools for the monitoring and management of their applications. The WebLogic Server Console provides detailed information about many aspects of WebLogic (current connections, JVM memory usage, database connection pool load, etc.). The common benefit of these point solutions is that they enable real-time system monitoring, troubleshooting, and resolution by the experienced administrator. Moreover, these are the tools ultimately used to solve problems once they have been identified.
One difficulty with point solutions, however, lies in their heterogeneity. It is not always self-evident to an experienced Unix administrator how to identify an errant process on a Windows platform, although the operation is nearly identical to that used on Unix. Similarly, an administrator familiar with the WebLogic Server Console might have difficulty using the point solution of a Web server or of another application server. Point monitoring solutions demand a high level of specialization and domain expertise from administrators.
A second difficulty is their dispersion. Point solutions offer no visibility across components, which can be of various types. Unfortunately, an operational problem with WebLogic Server may be related to an underlying problem with the host, the database server it's connected to, or the network layer. Because specialists are hired to fix particular problems, each using their own tools, problem isolation is slow and labor intensive, especially in large and highly specialized IT organizations.
A third drawback to point solutions is that they tend to be reactive. They're used when the system is already down or performing poorly, and the administrator is responding in real time to the problem as the business suffers the consequences.
Lacking an aggregate tool, inventive teams have crafted scripts and other tools to facilitate these activities, but the maintenance of these tools requires even more expertise.
Infrastructure as a Whole
Suddenly, infrastructure managers could do more to support the enterprise because they were spending less time in the isolation phase of problem resolution. Infrastructure management was still principally reactive, but it was faster. For the most part, though, these goals were only achieved at the network layer. State models of IP-based networks are much easier to understand and manipulate than what is happening at the application level (see Figure 1).
As network management solutions attempted to integrate infrastructure management and monitoring above OSI Layer 3, the problem of resolving a common view of the health of the infrastructure from disparate data sources became even more pronounced. Application vendors extended SNMP support to their products (WebLogic, for example, can be monitored via SNMP) in an attempt to enable a common infrastructure view. But SNMP can suffer from reliability and complexity issues, and some components in the infrastructure support it only incompletely. Access to command-line utilities and Windows tools like Performance Monitor also remains critical to the complete management of infrastructure resources. In addition, databases typically make most statistics available only via SQL.
Agent-based systems are often proposed as an alternative comprehensive monitoring technique, but they are difficult to manage and tax system resources unnecessarily. They also often lack access to some of the components you might need. Common access to system information from heterogeneous sources is vital.
Most existing approaches to consolidation involve a Manager of Managers (MoM) system, which receives monitoring data from multiple sources. These systems tend to be prohibitively expensive to purchase, customize, and support. They also take considerable time to implement and require significant training to be used effectively.
NOCpulse Command Center uses a plug-in framework to separate monitoring data from the access methods used to gather that data, providing the benefits of agent-based and MoM systems without suffering their disadvantages. No cross-industry standard reconciles the very different issues encountered when monitoring an application like WebLogic versus an operating system, and such a standard seems unlikely even in the distant future. In its absence, a flexible, extensible product-based standard (like NOCpulse Command Center's plug-in framework) is the next best thing. Also key is a common data repository and interface. Command Center plug-ins access required metrics via multiple protocols, but the results are presented in a common format via a single Web-based user interface. Performance metrics collected from the infrastructure are gathered in a common data store, allowing easy data mining, historical event correlation, and root cause analysis through a shared report engine (see Figure 2). The result is a holistic view of all the components that make up an end-user application: up the stack from the operating system through the application layer to the network and horizontally from server to server.
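The idea of separating the data from the access method can be illustrated with a minimal sketch. This is not NOCpulse Command Center's actual plug-in API, which the article does not describe; the `Metric` record, the `plugin` decorator, the probe names, and the canned values are all hypothetical. Each probe would use its own protocol, yet every result comes back in the same format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Metric:
    """Common record every plug-in returns, whatever protocol it used."""
    component: str   # e.g. "weblogic", "database", "host"
    name: str        # metric name, e.g. "jvm_heap_free_mb"
    value: float
    protocol: str    # how the value was obtained (SNMP, SQL, CLI, ...)


# Registry mapping a (hypothetical) plug-in name to a zero-argument probe.
PLUGINS: Dict[str, Callable[[], Metric]] = {}


def plugin(name: str):
    """Decorator that registers a probe function under a plug-in name."""
    def register(probe: Callable[[], Metric]) -> Callable[[], Metric]:
        PLUGINS[name] = probe
        return probe
    return register


# The probes below return canned values for illustration only; a real probe
# would issue the SNMP query, SQL statement, or shell command itself.
@plugin("weblogic_heap")
def weblogic_heap() -> Metric:
    return Metric("weblogic", "jvm_heap_free_mb", 212.0, "SNMP")


@plugin("db_sessions")
def db_sessions() -> Metric:
    return Metric("database", "active_sessions", 37.0, "SQL")


@plugin("host_load")
def host_load() -> Metric:
    return Metric("host", "load_average_1m", 0.8, "CLI")


def collect() -> List[Metric]:
    """Run every registered plug-in and return results in the common format."""
    return [probe() for _, probe in sorted(PLUGINS.items())]


if __name__ == "__main__":
    for m in collect():
        print(f"{m.component:10} {m.name:20} {m.value:8.1f} via {m.protocol}")
```

Because the collector only ever sees `Metric` records, a new data source needs just one more registered probe; storage and reporting code would not have to change.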
Infrastructure and the End User
Unfortunately, the limited end-user monitoring approach ignores the infrastructure. Is the Web site slow? Administrators must turn to another solution to find out why. Web infrastructures grow quickly in complexity; an administrator might not be able to quickly and effectively correlate end-user performance issues with a particular component of the infrastructure. Holistic monitoring requires a common interface to both infrastructure health and end-user application performance. NOCpulse Command Center provides this common interface and allows a user to model a multi-step browser-based transaction through a point-and-click configuration tool for both remote and local monitoring. It may be of interest to see the performance of my e-commerce site at a sample of locations on the public network. But this needs to be triangulated against local performance. Suppose customers were unable to place orders for a time. Do I need to complain to my network provider, or do I need to scale up my Web server farm? Point end-user monitoring solutions cannot answer this question; a holistic approach that correlates user experience with infrastructure health can.
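The triangulation of remote against local measurements can be sketched roughly as follows. This is an illustrative assumption, not Command Center's logic: the `diagnose` function, the probe location names, and the 2000 ms threshold are all invented for the example.

```python
from statistics import median
from typing import Dict, List


def diagnose(remote_ms: Dict[str, List[float]],
             local_ms: List[float],
             slow_ms: float = 2000.0) -> str:
    """Classify a slowdown by comparing remote and local transaction times.

    remote_ms maps a (hypothetical) probe location on the public network to
    response times in milliseconds; local_ms holds times for the same
    transaction measured next to the Web servers. slow_ms is an assumed
    threshold above which a transaction counts as slow.
    """
    local = median(local_ms)
    slow_sites = sorted(site for site, times in remote_ms.items()
                        if median(times) > slow_ms)

    if local > slow_ms:
        # Slow even without the public network in the path.
        return "infrastructure: investigate the server farm"
    if slow_sites:
        # Fast locally but slow from outside: the network path is suspect.
        return "network: slow from " + ", ".join(slow_sites)
    return "healthy"
```

With slow remote timings and fast local ones, the sketch points at the network provider; when the local timings are slow as well, it points at the server farm instead.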
Infrastructure and the Enterprise
The final stage in the development of infrastructure management involves connecting business management to the infrastructure. It involves making the priorities and concerns of the business transparent within the view of the infrastructure. Now precious operations resources are not only able to resolve problems quickly, but can also be applied efficiently where they matter most: to the most urgent problem, where urgency is determined by business priorities. Fundamentally, it amounts to quickly getting the right people, with the right tools and information, to the most important problem (as defined by the priorities of the business). For example, we might have problems with two applications: our CRM system is down and our credit card approval process is running slowly. From an operational perspective, the first problem may seem more severe, but when tied to business priorities, the risk of lost revenue mandates that effort be applied to the second problem first. A truly holistic management solution enables these types of decisions automatically. NOCpulse Command Center allows users to build arbitrary groups of components that correspond to business processes, transactions, customers, or end-user applications. The behaviors of each of these groups (thresholds, notification destinations, escalation procedures, etc.) can be set in accordance with the importance of each group to the business. Critical customer issues get raised and resolved while lower-priority or tolerable problems wait until people are free to deal with them.
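The CRM versus credit card example can be expressed as a small sketch of priority-driven triage. The group names, the numeric priorities, and the `triage` function are hypothetical and only illustrate the ordering rule; they do not represent how Command Center stores or evaluates its groups.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Problem:
    group: str      # business-oriented group the failing component belongs to
    severity: int   # operational severity: 3 = down, 2 = degraded, 1 = warning


# Hypothetical business priorities; a higher number matters more to revenue.
BUSINESS_PRIORITY: Dict[str, int] = {
    "credit_card_approval": 10,
    "crm": 4,
}


def triage(problems: List[Problem]) -> List[Problem]:
    """Order problems by business priority first, operational severity second."""
    return sorted(
        problems,
        key=lambda p: (BUSINESS_PRIORITY.get(p.group, 0), p.severity),
        reverse=True,
    )


if __name__ == "__main__":
    queue = triage([Problem("crm", 3), Problem("credit_card_approval", 2)])
    for p in queue:
        print(p.group, p.severity)
```

Although the CRM outage has the higher operational severity, the slow credit card approval process sorts first because its group carries the higher business priority. Groups without an assigned priority default to zero and fall to the end of the queue.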
Conclusion