The Art of Capacity Planning

WebLogic Server runs on hardware ranging from PCs to high-end mainframes. Therefore, it's essential to carefully choose the level of hardware necessary to provide optimal performance for your WebLogic Server deployment.

This article focuses on the various steps involved in analyzing the constraints on your system components and enumerates several measures that can be taken to ensure that sufficient computing resources are available to support current and future usage levels.

What Is Capacity Planning?
Capacity planning is the process of determining what hardware and software configuration is required to adequately meet application needs. It helps define the number of concurrent users the system can support, the acceptable response times for the system, and the hardware and network infrastructure needed to handle those numbers.

It's important to realize that we cannot generalize this process because each application is different; instead we offer general guidelines that help estimate system capacity.

An iterative process, capacity planning is achieved by measuring the number of requests the server currently processes and how much demand each request places on the server resources, then using this data to calculate the computing resources (CPU, RAM, disk space, and network bandwidth) necessary to support current and future usage levels.
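As a rough illustration of that calculation, the sketch below multiplies the measured request rate by the measured CPU time per request to estimate how many processors are kept busy, then divides by a target utilization to leave headroom. The function name, the sample figures, and the 75% default target are assumptions made for this example, not values taken from WebLogic documentation.

```python
import math


def cpus_needed(requests_per_sec, cpu_sec_per_request, target_utilization=0.75):
    """Estimate the number of CPUs required for a given request load.

    busy CPUs = request rate * CPU seconds consumed per request.
    target_utilization is an assumed planning headroom, not a WebLogic setting.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    busy_cpus = requests_per_sec * cpu_sec_per_request
    return math.ceil(busy_cpus / target_utilization)


# Hypothetical measurements: 200 requests/s, 10 ms of CPU time per request.
current_cpus = cpus_needed(200, 0.010)
# The same application if traffic doubles.
future_cpus = cpus_needed(400, 0.010)
print(current_cpus, future_cpus)
```

The same pattern applies to RAM, disk space, and network bandwidth: measure the demand per request or per user, multiply by the expected load, and add headroom.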

Why Is Capacity Planning Important?
The first and foremost reason is the user experience factor. Consider a system that can support only a given number of concurrent users while guaranteeing a reasonable response time. As many of us have experienced, when traffic increases on a major Web site that isn't adequately equipped to handle the surge, response time deteriorates significantly. Studies have shown that if a site's response time is more than 10 seconds, users tend to leave. Users who leave represent lost business, just as a Web site's downtime does, and it's no secret that the amount lost can be significant.

A second reason is that capacity planning helps you decide how to allocate resources for a system in terms of the CPUs, RAM, Internet connection bandwidth, and LAN infrastructure needed to support required performance levels and plan for future growth as well. Once we understand the limitations of the existing hardware configuration, we can estimate the amount of additional hardware needed to support any increased demands in performance.

Finally, capacity planning is important because it helps answer the question of what hardware and software infrastructure is needed to enable the current system deployment to achieve specified performance objectives.

Factors Affecting Capacity Planning
There are various factors to consider when conducting a capacity-planning exercise. Each of the following factors has a significant impact on system performance (and on system capacity as well). Before embarking on a capacity-planning exercise, it's essential to first tune the WebLogic Server for optimal performance. The WebLogic Performance and Tuning Guide (http://edocs.bea.com/wls/docs70/perform/index.html) covers this topic extensively.

Programmatic and Web-Based Clients
There are two types of clients that can connect to a WebLogic Server:

1.   Web-based clients, such as Web browsers and HTTP proxies, use the HTTP or HTTPS (secure) protocol to communicate with the server. Such a client can be treated as a Web browser client generating HTTP requests.

2.   Programmatic clients rely on the T3 or the IIOP protocol and use RMI to connect to the server.

The stateless nature of HTTP requires that the server handle more in terms of overhead. However, the benefits of HTTP clients, such as the availability of browsers and firewall compatibility, are numerous and are usually worth the performance costs.

On the other hand, programmatic clients are generally more efficient than HTTP clients because the T3 protocol does more of the presentation work on the client side. Programmatic clients typically call directly into EJBs, while Web clients usually go through servlets. The T3 protocol operates over sockets and keeps a long-lived connection to the server. Consequently, the WebLogic Server can support a larger number of programmatic client threads, which needs to be factored in when calculating the client connectivity bandwidth.

Protocol Used with Clients
The protocol used for communication between the WebLogic Server and the clients is another factor in determining the capacity of the deployments. A commonly used protocol for secure transactions is the Secure Sockets Layer (SSL) protocol. SSL is a very computing-intensive technology, and the overhead of cryptography can significantly decrease the number of simultaneous connections that a system can support. The capacity of the WebLogic Server falls as the share of SSL client connections rises, and the size of the reduction depends on the strength of encryption used in the SSL connections. Typically, for every SSL connection the server can support, it can handle up to three non-SSL connections.
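To see how the three-to-one rule of thumb affects a mixed workload, the following sketch weights each SSL connection as costing three times a plain connection and scales the measured non-SSL capacity accordingly. The function name and the sample numbers are hypothetical, and the ratio itself should be confirmed by load testing with the cipher strength you actually deploy.

```python
def max_connections(non_ssl_capacity, ssl_fraction, ssl_cost_ratio=3.0):
    """Estimate connection capacity for a mix of SSL and non-SSL clients.

    non_ssl_capacity: connections supported when no client uses SSL (measured).
    ssl_fraction:     share of connections that use SSL, between 0 and 1.
    ssl_cost_ratio:   assumed cost of one SSL connection relative to one
                      non-SSL connection (3.0 follows the rule of thumb above).
    """
    if not 0 <= ssl_fraction <= 1:
        raise ValueError("ssl_fraction must be between 0 and 1")
    weighted_cost = ssl_fraction * ssl_cost_ratio + (1.0 - ssl_fraction)
    return int(non_ssl_capacity / weighted_cost)


# Hypothetical server that handles 3,000 plain HTTP connections.
for fraction in (0.0, 0.5, 1.0):
    print(fraction, max_connections(3000, fraction))
```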

Database Server Capacity and User Storage Requirements
Most WebLogic Server deployments rely upon back-end systems such as databases. The more reliance on such back-end systems, the more resources are consumed to meet these requests. The key issues to consider are the size of the data being transferred and the processing capacity of the database server.

Oftentimes installations find that their database server runs out of capacity much sooner than the WebLogic Server does. You must plan for a database server that is sufficiently robust to handle the application. Typically, a good application will require a database three to four times more powerful than the application server hardware. Additionally, it's good practice to place the WebLogic Server and the database on separate machines.

The inability to increase the CPU utilization on the WebLogic Server by increasing the number of users is a common problem and generally a sign of bottlenecks in the system. A good place to start investigating would be the database. It's quite possible that the WebLogic Server is spending much of its time waiting for database operations to complete. Increasing the load by adding more users can only aggravate the situation.

An application might also require user storage for operations that don't interact with a database, for instance, a WebLogic-based security realm to store security information for each user. In such cases, you should calculate the size required to store each user's information and multiply this by the total number of expected users to come up with the total user storage requirements.
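The storage estimate described above is a single multiplication. A minimal sketch, with a hypothetical per-user record size and user count:

```python
def total_user_storage_bytes(bytes_per_user, expected_users):
    """Total user storage = size of one user's information * number of users."""
    return bytes_per_user * expected_users


# Hypothetical figures: 2 KB of security-realm data per user, 500,000 users.
total = total_user_storage_bytes(2 * 1024, 500_000)
print(f"{total / 1e9:.2f} GB")
```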

Concurrent Sessions and Processes
One of the main goals of capacity planning is to set quantifiable goals for the deployment infrastructure. This requires determining the maximum number of concurrent sessions the WebLogic Server will be called upon to handle. This affects capacity, as the WebLogic Server has to track session objects (HTTP session objects or stateful session beans) in memory for each session. Use the size of the session data to calculate the amount of RAM needed for each additional user. Next, research the maximum number of clients that will make requests at the same time, and the frequency of each client request. The number of user interactions with WebLogic Server per second represents the total number of interactions per second that a given WebLogic Server deployment should be able to handle.
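The two quantities in this paragraph, session memory and interactions per second, can be estimated as follows. This is a sketch under stated assumptions: the session size, session count, client count, and request frequency are all hypothetical inputs that you would replace with measurements from your own application.

```python
def session_memory_bytes(concurrent_sessions, bytes_per_session):
    """RAM needed to hold session objects for all concurrent sessions."""
    return concurrent_sessions * bytes_per_session


def interactions_per_second(concurrent_clients, requests_per_client_per_minute):
    """Request rate the deployment must sustain at peak."""
    return concurrent_clients * requests_per_client_per_minute / 60.0


# Hypothetical peak: 10,000 sessions of 50 KB each,
# 3,000 active clients issuing 4 requests per minute.
ram = session_memory_bytes(10_000, 50 * 1024)
rate = interactions_per_second(3_000, 4)
print(f"session RAM: {ram / 1e6:.0f} MB, load: {rate:.0f} interactions/s")
```

The session figure covers only the session objects themselves; the JVM heap must also hold the application's working data, so treat it as a lower bound.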

It's also essential to identify frequently accessed components up front and to allocate adequate resources to them. Typically, users in Web deployments access JSP pages (or servlets), while users in application deployments access EJBs.

Additional processes running on the same machine can significantly affect the capacity (and performance) of the WebLogic deployment. The database and Web servers are two popular choices for hosting on a separate machine.

The random and unpredictable nature of user service requests often exacerbates the performance problems of Internet applications. When estimating the peak load, it's therefore advisable to plan for demand spikes and focus on the worst-case scenario (for instance, the spike in the number of visitors to a site advertised during the Olympics telecast). Another example would be the spike in traffic experienced by many online retailers during the holiday shopping season. These usage spikes can often result in significant server overload unless properly anticipated.

WebLogic Server Configuration (Single Server or Clustered)
There are many advantages to using WebLogic Server clusters. For instance, they're much more efficient and they offer failover capabilities. The key issues to consider when using a cluster are:

1.   Clusters rely on LAN for communication between the nodes. Large clusters performing in-memory replication of session data for EJB or servlet sessions require more bandwidth than smaller clusters. Consider the size of session data, the size of the cluster, and the processing power of the individual machines in computing the LAN bandwidth and network connectivity requirements. The combination of server capacity and network bandwidth determines the total capacity of a given system.

2.   If you're using a Web server to forward requests to a WebLogic Server cluster, sometimes the Web server can be the bottleneck. This can happen when using the supplied HttpClusterServlet and a proxy server, or one of the supported plug-ins. If the response time doesn't improve after adding servers to the cluster, and the Web server machine shows a CPU usage of over 95%, consider clustering the Web server or running it on more powerful hardware.
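For the LAN bandwidth consideration in the first point, a back-of-the-envelope estimate can be sketched as below. It assumes that every session-modifying request sends a fixed number of bytes to one other server in the cluster, and it ignores protocol overhead and regular application traffic. The function name and all figures are hypothetical; measure the bytes actually replicated per update in your own application.

```python
def replication_mbps(session_updates_per_sec, bytes_replicated_per_update):
    """Approximate LAN traffic, in megabits per second, caused by
    in-memory session replication across the whole cluster."""
    bits_per_sec = session_updates_per_sec * bytes_replicated_per_update * 8
    return bits_per_sec / 1_000_000


# Hypothetical cluster: 500 session updates/s, 20,000 bytes replicated each.
needed = replication_mbps(500, 20_000)
lan_capacity_mbps = 100  # assumed 100 Mbit/s LAN
print(f"{needed:.0f} Mbit/s of {lan_capacity_mbps} Mbit/s")
```

If the estimate approaches the LAN's rated capacity, either the session data needs to shrink or the network needs an upgrade before more servers are added.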

WebLogic Server Configuration
It's possible to have many WebLogic Server instances clustered together on a single multiprocessor machine. An alternative would be to have a cluster of fewer WebLogic Server instances distributed across many single (or dual) processor machines. There can be advantages in using the second approach:

1.   Increased protection from failure, since it's unlikely that all the individual machines would fail at the same time.

2.   JVM scalability has some practical limitations, and garbage collection is often a bottleneck on systems with many processors. Configuring a cluster of many smaller machines will ensure good JVM scalability. Additionally, the impact of garbage collection on the system's response time can be somewhat mitigated, because garbage collection can be staggered across the different JVMs.

Figure 1 shows the results from an internal benchmark indicating that having multiple smaller boxes offers better performance. We hasten to add that it's quite likely some applications won't conform to this behavior.

Application Design Issues
At the end of the day, WebLogic Server is basically a platform for user applications. Badly designed or unoptimized user applications can drastically slow down the performance of a given configuration. Therefore it's also essential to optimize the application by eliminating or reducing the hot spots and considering the working set/concurrency issues. An end-to-end perspective of the application characteristics is essential in order to diagnose and fix any performance problems. Application optimization and performance tuning WebLogic Server for your specific deployments always go hand-in-hand.

Capacity Planning Guidelines
Once you've developed your application, the next step is to determine the hardware requirements (on the chosen hardware platform). It's essential to choose a transaction scenario that represents the most frequent user flow transactions. In the case of an online bookseller, for instance, a typical transaction mix could be as follows. A user enters the site via the home page, uses the search options to scan through existing inventory, browses through certain titles, selects a title and adds it to the shopping cart, proceeds to checkout, enters credit card information, confirms the order, and finally exits.

There are several tools available to simulate clients (LoadRunner, WebLOAD, etc.). Use the transaction mix you designed in the previous step to generate the client load. Gradually increase the client load by adding more concurrent users. This is an iterative process, and the goal is to achieve as high a CPU utilization as possible. If the CPU utilization doesn't increase (and hasn't yet peaked out) with the addition of more users, stop and look for bottlenecks (in the database or the application). There are several commercially available profilers (IntroScope, OptimizeIt, and JProbe) that can be used to identify these hot spots.

In a finely tuned system, the CPU utilization (at steady state) is usually in the 90-95% range. While throughput won't increase with the addition of more load, response times, on the other hand, will increase as more clients are added. The throughput at this point determines the capacity of the hardware.

Figure 2 shows that increasing the number of clients beyond a certain point doesn't increase the throughput but has a significant effect on the response time. From the graph, decide on an acceptable response time. This, then, indicates the capacity of the hardware needed.
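Reading the capacity off a load-test curve can also be done programmatically. The sketch below takes a series of measurements, finds the point where adding clients stops improving throughput, and picks the largest client count whose response time is still acceptable. The sample data, the 5% gain threshold, and the function names are assumptions for illustration; they are not output from LoadRunner, WebLOAD, or the benchmark behind Figure 2.

```python
def saturation_point(samples, min_gain=0.05):
    """Return the last sample before throughput stops growing by min_gain.

    samples: list of (clients, throughput_per_sec, avg_response_sec),
             ordered by increasing client count.
    """
    for prev, cur in zip(samples, samples[1:]):
        if cur[1] < prev[1] * (1 + min_gain):
            return prev
    return samples[-1]


def capacity_at_response_time(samples, max_response_sec):
    """Return the sample with the most clients whose response time is acceptable."""
    acceptable = [s for s in samples if s[2] <= max_response_sec]
    if not acceptable:
        return None
    return max(acceptable, key=lambda s: s[0])


# Hypothetical load-test results: (clients, requests/s, seconds).
results = [
    (50, 120, 0.40),
    (100, 235, 0.45),
    (200, 400, 0.60),
    (400, 410, 1.20),
    (800, 405, 2.50),
]
print(saturation_point(results))
print(capacity_at_response_time(results, max_response_sec=1.5))
```

In this made-up data set, throughput levels off after 200 clients, while 400 clients can still be served within a 1.5-second response-time target.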

The first step in capacity planning is to set measurable and quantifiable goals for the deployment. As is evident from the iterative process, capacity planning isn't an exact science, hence the need to conservatively estimate your capacity requirements. This is further compounded by the fact that the system load can often be quite unpredictable and can vary randomly (hence the need to focus on the worst-case scenario). Finally, there can be no substitute for load testing. During load testing it's essential to use a transaction scenario that closely resembles the real-world conditions the application deployment will be subjected to.

We would like to acknowledge Joginder Minocha for his work on the WebLogic Capacity Planning benchmarks.
