Holistic Infrastructure Monitoring and Management

WebLogic Server, like most applications, provides robust and detailed monitoring tools bundled with the basic application. The embedded monitoring and management provided by the WebLogic Console is extremely useful when diagnosing and repairing a problem once it has been isolated in the WebLogic Server. But this embedded point solution is of limited use in most real-world situations where the application server is just a single component in a system of components that are all vital to providing the end-user application. When trying to quickly diagnose a general problem with the end-user application, it is much more powerful and effective to view data from each component holistically as a part of the system rather than evaluating each component by itself.

This holistic view becomes even more powerful when correlated with end-user application and business transaction performance. If business priorities and processes can be included in this view of the infrastructure, operations management becomes more valuable to the enterprise.

Point Solutions
A typical Web infrastructure is composed of a multitude of hardware and software components: hosts running various operating systems, Web servers, application servers, databases, legacy systems, and network devices. Each element of the infrastructure usually includes a point monitoring and management solution.

The command line utilities available to operating systems in the Unix family are typical point solutions. Point solutions provide power and flexibility to the experienced administrator. At the application level, vendors package tools for the monitoring and management of their applications. The WebLogic Server Console provides detailed information about many aspects of WebLogic (current connections, JVM memory usage, database connection pool load, etc.).

The common benefit of these point solutions is that they enable real-time system monitoring, troubleshooting, and resolution by the experienced administrator. Moreover, these tools are ultimately used to solve problems once they have been identified.

One difficulty with point solutions, however, lies in their heterogeneity. It is not always self-evident to an experienced Unix administrator how to identify an errant process on a Windows platform, although the operation is nearly identical to that used on Unix. Similarly, an administrator familiar with WebLogic Server Console might have difficulty using a Web server or another application server's point solution. Point monitoring solutions demand a high level of specialization and domain expertise from administrators.

A second difficulty is in their dispersion. With point solutions there is no visibility across the components, which can be of various types. Unfortunately, an operational problem with WebLogic Server may be related to an underlying problem with the host, the database server it's connected to, or the network layer. Because specialists are hired to fix particular problems, each using their own tools, problem isolation is slow and labor intensive, especially in large and highly specialized IT organizations.

A third drawback to point solutions is that they tend to be reactive. They're used when the system is already down or performing poorly and the administrator is responding in real time to the problem as the business suffers the consequences. Lacking an aggregate tool, inventive teams have crafted scripts and other tools to facilitate these activities, but the maintenance of these tools requires even more expertise.

Infrastructure as a Whole
Infrastructure managers quickly realized that the information they were getting from their various point solutions was much more valuable when viewed together as part of a larger whole. They realized that fault isolation and correlation became much easier once they had this larger view. In the network world, SNMP became the de facto standard for aggregating all of the element data and enabled master management consoles.

Suddenly, infrastructure managers could do more to support the enterprise because they were spending less time in the isolation phase of problem resolution. Infrastructure management was still principally reactive, but it was faster. For the most part, though, these goals were only achieved at the network layer. State models of IP-based networks are much easier to understand and manipulate than what is happening at the application level (see Figure 1).

As network management solutions attempted to integrate infrastructure management and monitoring above OSI Layer 3, the problem of resolving a common view of the health of the infrastructure from disparate data sources became even more pronounced. Application vendors extended SNMP support to their products (WebLogic, for example, can be monitored via SNMP) in an attempt to enable a common infrastructure view.

But SNMP can suffer from reliability and complexity issues, and some components in the infrastructure support it only incompletely. Access to command-line utilities and Windows tools like Performance Monitor also remains critical to the complete management of infrastructure resources. In addition, databases typically make most statistics available only via SQL.

Agent-based systems are often proposed as an alternative comprehensive monitoring technique, but are difficult to manage and tax system resources unnecessarily. They also often lack access to some of the components you might need.

Common access to system information from heterogeneous sources is vital. Most existing approaches to consolidation involve a Manager of Managers (MoM) system, which receives monitoring data from multiple sources. These systems tend to be prohibitively expensive to purchase, customize, and support. They also take considerable time to implement and require significant training to be used effectively.

NOCpulse Command Center uses a plug-in framework to separate monitoring data from the access methods used to gather that data, providing the benefits of the agent-based and MoM systems without suffering the disadvantages. Lacking a cross-industry standard to reconcile the very different issues encountered when monitoring an application like WebLogic versus an operating system (and such a standard seems unlikely, even considering the distant future), a flexible, extendable product-based standard (like NOCpulse Command Center's plug-in framework) is the next best thing.
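The separation described above, between the monitoring data and the access method used to gather it, can be illustrated with a minimal sketch. This is not NOCpulse Command Center's actual plug-in interface, which the article does not document; the names Metric, PluginRegistry, and the two stand-in collectors are hypothetical, and the values are hard-coded rather than gathered over SNMP or SQL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Metric:
    """One measurement in a common format, whatever its source (hypothetical type)."""
    component: str   # e.g. "weblogic-1" (illustrative name)
    name: str        # e.g. "jvm_heap_used_pct"
    value: float
    source: str      # access method that produced it: "snmp", "sql", "cli", ...


# A plug-in is a callable that knows one access method and returns Metrics.
Plugin = Callable[[], List[Metric]]


class PluginRegistry:
    """Maps plug-in names to collectors; callers never see the access method."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Plugin] = {}

    def register(self, name: str, plugin: Plugin) -> None:
        self._plugins[name] = plugin

    def collect_all(self) -> List[Metric]:
        # Run plug-ins in name order so the output is deterministic.
        metrics: List[Metric] = []
        for name in sorted(self._plugins):
            metrics.extend(self._plugins[name]())
        return metrics


# Stand-in collectors. Real ones would issue an SNMP GET, run a SQL query,
# or parse command-line output; these return fixed values for illustration.
def fake_snmp_weblogic() -> List[Metric]:
    return [Metric("weblogic-1", "jvm_heap_used_pct", 71.0, "snmp")]


def fake_sql_database() -> List[Metric]:
    return [Metric("orders-db", "active_sessions", 42.0, "sql")]


registry = PluginRegistry()
registry.register("weblogic", fake_snmp_weblogic)
registry.register("database", fake_sql_database)
all_metrics = registry.collect_all()
```

Adding support for a new component in this sketch means writing one more collector; nothing that consumes Metric objects has to change.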

Also key is a common data repository and interface. Command Center plug-ins access required metrics via multiple protocols, but the results are presented in a common format via a single Web-based user interface. Performance metrics collected from the infrastructure are gathered in a common data store, allowing easy data mining, historical event correlation, and root cause analysis through a shared report engine (see Figure 2).
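To make the value of a common data store concrete, here is a small sketch that uses an in-memory SQLite table as a stand-in repository. The schema, component names, sample values, and limits are all assumptions made for illustration; the article does not describe Command Center's actual storage or report engine.

```python
import sqlite3

# In-memory stand-in for the common data store (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (ts INTEGER, component TEXT, metric TEXT, value REAL)"
)

# Illustrative samples from two different components, gathered by different
# access methods but stored in the same shape.
rows = [
    (100, "web-1", "cpu_pct", 35.0),
    (100, "orders-db", "active_sessions", 40.0),
    (160, "web-1", "cpu_pct", 38.0),
    (160, "orders-db", "active_sessions", 195.0),
    (220, "web-1", "cpu_pct", 36.0),
    (220, "orders-db", "active_sessions", 198.0),
]
conn.executemany("INSERT INTO samples VALUES (?, ?, ?, ?)", rows)


def breaches_between(conn, start_ts, end_ts, limits):
    """Return (ts, component, metric, value) rows in the window that exceed limits[metric]."""
    found = []
    cursor = conn.execute(
        "SELECT ts, component, metric, value FROM samples "
        "WHERE ts BETWEEN ? AND ? ORDER BY ts, component",
        (start_ts, end_ts),
    )
    for ts, component, metric, value in cursor:
        limit = limits.get(metric)
        if limit is not None and value > limit:
            found.append((ts, component, metric, value))
    return found


# Which components were out of bounds while an incident was under way?
limits = {"cpu_pct": 90.0, "active_sessions": 150.0}
breaches = breaches_between(conn, 150, 230, limits)
```

Because every sample sits in one table, a single query over the incident window points at the database rather than the Web server, without consulting either component's own tool.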

The result is a holistic view of all the components that make up an end-user application: vertically up the stack from the network and operating system through the application layer, and horizontally from server to server.

Infrastructure and the End User
Too often the focus of monitoring is attention to system problems without regard to the real end-user impact. What is ultimately important is not the specific health of all of the individual components of an Internet infrastructure, but the performance of the application at the other end. Can our customers currently purchase a CD from our site? Is our billing system too slow? Does the customer service section of our site offer the help our customers need? The era of Web applications has given rise to point solutions for end-user monitoring. These products either quantify end-user experience or monitor site accessibility.

Unfortunately, the limited end-user monitoring approach ignores the infrastructure. Web site slow? Administrators must go to another solution to find the cause. Web infrastructures grow quickly in complexity; an administrator might not be able to expeditiously and effectively correlate end-user performance issues with a particular component of the infrastructure.

Holistic monitoring requires a common interface to both infrastructure health and end-user application performance. NOCpulse Command Center provides this common interface and allows a user to model a multi-step browser-based transaction through a point-and-click configuration tool for both remote and local monitoring.

It may be of interest to see the performance of my e-commerce site from a sample of locations on the public network, but this needs to be triangulated against local performance. Suppose customers were unable to place orders for a time. Do I need to complain to my network provider, or do I need to scale up my Web server farm? Point end-user monitoring solutions cannot answer this question; a holistic approach that correlates user experience with infrastructure health can.
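The triangulation described above can be reduced to a simple rule of thumb, sketched here. The function name, the 2000 ms threshold, and the three labels are assumptions chosen for illustration, not part of any product; a real diagnosis would weigh many more signals.

```python
from statistics import median


def triangulate(local_ms, remote_ms, slow_ms=2000.0):
    """Classify a slowdown from one local and several remote transaction timings.

    local_ms  -- transaction time measured inside the data center
    remote_ms -- transaction times measured from points on the public network
    slow_ms   -- assumed threshold above which a transaction counts as slow
    """
    local_slow = local_ms > slow_ms
    remote_slow = median(remote_ms) > slow_ms
    if local_slow:
        # Slow even without the public network in the path:
        # look at our own servers first.
        return "infrastructure"
    if remote_slow:
        # Fast locally but slow from outside: the network path is the suspect.
        return "network"
    return "healthy"


# Fast locally, slow from three remote vantage points.
verdict = triangulate(300.0, [2500.0, 3100.0, 2800.0])
```

In this sketch a verdict of "network" suggests a call to the network provider, while "infrastructure" suggests looking at the Web server farm and the tiers behind it.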

Infrastructure and the Enterprise
Once we have a single view of the infrastructure that can be correlated to the performance of our end-user applications and business transactions, resources can be efficiently applied to quickly solve problems. The infrastructure can be tuned to provide better performance with a strong feedback loop of the metrics that matter, ensuring that changes actually have the intended effect. Service Level Agreements can be managed proactively; problems become defined by what the end user is experiencing or by metrics associated with the business transaction rather than by an arbitrary definition of "problem" at the infrastructure level.

The final stage in the development of infrastructure management involves connecting business management to the infrastructure. It involves making the priorities and concerns of the business transparent within the view of the infrastructure. Now precious operations resources are not only able to resolve problems quickly, but they can be applied efficiently to where they matter most: to the most urgent problem, where urgency is determined by business priorities. Fundamentally, it amounts to quickly getting the right people, with the right tools and information, to the most important problem (as defined by the priorities of the business).

For example, we might have problems with two applications: our CRM system is down and our credit card approval process is running slowly. From an operational perspective, the first problem may seem more severe, but when tied to business priorities, the risk of lost revenue mandates that effort be applied to the second problem first. A truly holistic management solution enables these types of decisions automatically.

NOCpulse Command Center allows users to build arbitrary groups of components that correspond to business processes, transactions, customers, or end-user applications. The behaviors of each of these groups (thresholds, notification destinations, escalation procedures, etc.) can be set in accordance with the importance of each group to the business. Critical customer issues get raised and resolved while lower priority or tolerable problems wait until people are free to deal with them.
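The CRM and credit card example above can be sketched as a small triage routine in which each group carries a business priority and its own notification destination. The Group and Problem types, the priority scale, and the destination names are hypothetical; the article does not show how Command Center represents groups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """A set of components tied to one business process (hypothetical type)."""
    name: str
    business_priority: int  # assumed scale: 1 = most important to the business
    notify: str             # illustrative notification destination


@dataclass(frozen=True)
class Problem:
    group: Group
    severity: int           # assumed operational scale: 1 = outage, 2 = degraded
    summary: str


def triage(problems):
    """Order problems by business priority first, operational severity second."""
    return sorted(problems, key=lambda p: (p.group.business_priority, p.severity))


crm = Group("CRM", business_priority=2, notify="helpdesk-queue")
payments = Group("credit card approval", business_priority=1, notify="on-call-pager")

open_problems = [
    Problem(crm, severity=1, summary="CRM system is down"),
    Problem(payments, severity=2, summary="approvals are running slowly"),
]
queue = triage(open_problems)
```

Although the CRM outage is the more severe problem operationally, the slow approval process sorts to the front of the queue because its group has the higher business priority.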

Conclusion
Fundamentally, infrastructure management adds more and more value to the enterprise as it evolves from an inefficient, slow, reactive, element-focused approach to an efficient, responsive, proactive approach that is able to see infrastructure holistically and understand the relative importance of each end-user application to the business. When that level is achieved, infrastructure management becomes a true business enabler, allowing service level management, efficient customer problem reporting and resolution, and prioritization of fault response in accordance with business priorities. These benefits require a holistic operational view that provides the ability to correlate what is happening at the infrastructure level with what is happening at the business level.

More Stories By Greg Peters

Greg Peters, VP of Engineering at NOCpulse,
has over 10 years of experience in software development,
system engineering, and program management. He leads
the company's core product development strategy and
engineering teams.

More Stories By Lance Peterson

Lance Peterson, program manager at NOCpulse, is responsible
for interface design and engineering program management.
Previously, he worked as a programmer
analyst at Emory University and in an earlier life
taught literature, film, and [email protected]
