The Art of Capacity Planning

WebLogic Server runs on hardware ranging from PCs to high-end mainframes. Therefore, it's essential to carefully choose the level of hardware necessary to provide optimal performance for your WebLogic Server deployment.

This article focuses on the various steps involved in analyzing the constraints on your system components and enumerates several measures that can be taken to ensure that sufficient computing resources are available to support current and future usage levels.

What Is Capacity Planning?
Capacity planning is the process of determining what hardware and software configuration is required to adequately meet application needs. It helps define the number of concurrent users the system can support, the acceptable response times for the system, and the hardware and network infrastructure needed to handle those numbers.

It's important to realize that we cannot generalize this process because each application is different; instead we offer general guidelines that help estimate system capacity.

An iterative process, capacity planning is achieved by measuring the number of requests the server currently processes and how much demand each request places on the server resources, then using this data to calculate the computing resources (CPU, RAM, disk space, and network bandwidth) necessary to support current and future usage levels.
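
As a rough illustration of that arithmetic, the sketch below turns a measured request rate and per-request cost into CPU and RAM estimates. All of the numbers (request rate, CPU milliseconds per request, session footprint, target utilization) are hypothetical, not measurements from this article:

```java
// Back-of-envelope capacity arithmetic; every input here is hypothetical.
public class CapacityEstimate {

    // CPUs needed = (requests/sec * CPU-milliseconds per request) / 1000,
    // divided by the target utilization and rounded up to whole processors.
    static int cpusNeeded(int requestsPerSec, int cpuMillisPerRequest,
                          double targetUtilization) {
        double cpuSecondsPerSec = requestsPerSec * cpuMillisPerRequest / 1000.0;
        return (int) Math.ceil(cpuSecondsPerSec / targetUtilization);
    }

    // RAM needed = base server footprint plus per-session heap cost.
    static long ramNeededMb(long baseMb, int perSessionKb, int sessions) {
        long sessionKb = (long) perSessionKb * sessions;
        return baseMb + (sessionKb + 1023) / 1024; // ceiling KB -> MB
    }

    public static void main(String[] args) {
        // 200 req/s at 15 ms of CPU each, keeping utilization at or below 75%
        System.out.println("CPUs needed: " + cpusNeeded(200, 15, 0.75));      // 4
        // 256 MB base heap, 10 KB of session state, 5,000 concurrent sessions
        System.out.println("RAM needed (MB): " + ramNeededMb(256, 10, 5000)); // 305
    }
}
```

In practice these inputs come from measuring a running system under a representative load, then the same arithmetic is re-run for projected future traffic.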

Why Is Capacity Planning Important?
The first and foremost reason is the user experience factor. Consider a system that can support only a given number of concurrent users while guaranteeing a reasonable response time. As many of us have experienced, when traffic increases on a major Web site that isn't adequately equipped to handle the surge, response time deteriorates significantly. Studies have shown that if a site's response time is more than 10 seconds, users tend to leave. That abandonment translates directly into lost business, and it's no secret that a Web site's downtime can cost a significant amount of revenue as well.

A second reason is that capacity planning helps you decide how to allocate resources for a system in terms of the CPUs, RAM, Internet connection bandwidth, and LAN infrastructure needed to support required performance levels and plan for future growth as well. Once we understand the limitations of the existing hardware configuration, we can estimate the amount of additional hardware needed to support any increased demands in performance.

Finally, capacity planning is important because it helps answer the question of what hardware and software infrastructure is needed to enable the current system deployment to achieve specified performance objectives.

Factors Affecting Capacity Planning
There are various factors to consider when conducting a capacity-planning exercise. Each of the following factors has a significant impact on system performance (and on system capacity as well). Before embarking on a capacity-planning exercise, it's essential to first tune the WebLogic Server for optimal performance. The WebLogic Performance and Tuning Guide (http://edocs.bea.com/wls/docs70/perform/index.html) covers this topic extensively.

Programmatic and Web-Based Clients
There are two types of clients that can connect to a WebLogic Server:

1.   Web-based clients, such as Web browsers and HTTP proxies, use the HTTP or HTTPS (secure) protocol to communicate with the server. Such a client can be treated as a Web browser client generating HTTP requests.

2.   Programmatic clients rely on the T3 or the IIOP protocol and use RMI to connect to the server.

Because HTTP is stateless, the server incurs more per-request overhead for Web-based clients. However, the benefits of HTTP clients, such as the ubiquity of browsers and firewall compatibility, are numerous and usually worth the performance cost.

On the other hand, programmatic clients are generally more efficient than HTTP clients because the T3 protocol does more of the presentation work on the client side. Programmatic clients typically call directly into EJBs, while Web clients usually go through servlets. The T3 protocol operates over sockets and maintains a long-lived connection to the server. Consequently, WebLogic Server can support a larger number of programmatic client threads, which needs to be factored in when calculating the client connectivity bandwidth.

Protocol Used with Clients
The protocol used for communication between the WebLogic Server and the clients is another factor in determining the capacity of the deployments. A commonly used protocol for secure transactions is the Secure Sockets Layer (SSL) protocol. SSL is a very computing-intensive technology and the overhead of cryptography can significantly decrease the number of simultaneous connections that a system can support. There is a direct correlation between the capacity of the WebLogic Server and the number of SSL client connections. SSL can significantly reduce the capacity of the server, depending on the strength of encryption used in the SSL connections. Typically, for every SSL connection the server can support, it can handle up to three non-SSL connections.
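
The "one SSL connection costs roughly three plain connections" rule of thumb above can be turned into a quick mix-adjusted estimate. This is only a sketch of that ratio; the rated connection count and SSL fractions are hypothetical:

```java
// Sketch of the 1-SSL-to-3-plain rule of thumb; illustrative numbers only.
public class SslCapacity {

    // Given a server rated for maxPlainConnections without SSL, estimate how
    // many clients it supports when a fraction of them connect over SSL,
    // weighting each SSL connection as the equivalent of three plain ones.
    static int supportedClients(int maxPlainConnections, double sslFraction) {
        double costPerClient = sslFraction * 3.0 + (1.0 - sslFraction);
        return (int) (maxPlainConnections / costPerClient);
    }

    public static void main(String[] args) {
        System.out.println(supportedClients(3000, 0.0)); // 3000: no SSL traffic
        System.out.println(supportedClients(3000, 0.5)); // 1500: half the clients on SSL
        System.out.println(supportedClients(3000, 1.0)); // 1000: everything on SSL
    }
}
```

The true ratio depends on cipher strength and hardware acceleration, so the 3:1 weighting should be replaced by a measured figure for the actual deployment.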

Database Server Capacity and User Storage Requirements
Most WebLogic Server deployments rely upon back-end systems such as databases. The more reliance on such back-end systems, the more resources are consumed to meet these requests. The key issues to consider are the size of the data being transferred and the processing capacity of the database server.

Oftentimes installations find that their database server runs out of capacity much sooner than the WebLogic Server does. You must plan for a database server that is sufficiently robust to handle the application. Typically, a good application will require a database three to four times more powerful than the application server hardware. Additionally, it's good practice to place the WebLogic Server and the database on separate machines.

The inability to increase the CPU utilization on the WebLogic Server by increasing the number of users is a common problem and generally a sign of bottlenecks in the system. A good place to start investigating would be the database. It's quite possible that the WebLogic Server is spending much of its time waiting for database operations to complete. Increasing the load by adding more users can only aggravate the situation.

An application might also require user storage for operations that don't interact with a database, for instance, a WebLogic-based security realm to store security information for each user. In such cases, you should calculate the size required to store each user's information and multiply this by the total number of expected users to come up with the total user storage requirements.
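
The user-storage multiplication described above is simple enough to sketch directly; the per-record size and user count here are hypothetical:

```java
// Total user-storage estimate: bytes per user times expected users.
public class UserStorage {

    // Result reported in megabytes, rounded up.
    static long totalStorageMb(long bytesPerUser, long expectedUsers) {
        long totalBytes = bytesPerUser * expectedUsers;
        return (totalBytes + (1024 * 1024) - 1) / (1024 * 1024); // ceiling division
    }

    public static void main(String[] args) {
        // e.g. 2 KB of security-realm data per user, 100,000 registered users
        System.out.println(totalStorageMb(2048, 100_000)); // ~196 MB
    }
}
```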

Concurrent Sessions and Processes
One of the main goals of capacity planning is to set quantifiable goals for the deployment infrastructure. This requires determining the maximum number of concurrent sessions the WebLogic Server will be called upon to handle. This affects capacity, as the WebLogic Server has to track session objects (HTTP session objects or stateful session beans) in memory for each session. Use the size of the session data to calculate the amount of RAM needed for each additional user. Next, research the maximum number of clients that will make requests at the same time, and the frequency of each client request. The number of user interactions with WebLogic Server per second represents the total number of interactions per second that a given WebLogic Server deployment should be able to handle.
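
The two calculations in that paragraph, session RAM and total interactions per second, can be sketched as follows; the session size, client count, and request frequency are hypothetical:

```java
// Session-memory and interaction-rate arithmetic; all inputs are hypothetical.
public class SessionLoad {

    // RAM for session state = per-session size * concurrent sessions (KB -> MB).
    static long sessionRamMb(int sessionKb, int concurrentSessions) {
        long totalKb = (long) sessionKb * concurrentSessions;
        return (totalKb + 1023) / 1024; // ceiling division
    }

    // Total interactions/sec = concurrent clients * requests each makes per minute.
    static double interactionsPerSec(int concurrentClients,
                                     int requestsPerClientPerMin) {
        return concurrentClients * requestsPerClientPerMin / 60.0;
    }

    public static void main(String[] args) {
        System.out.println(sessionRamMb(20, 2000));      // 40 MB of session state
        System.out.println(interactionsPerSec(2000, 6)); // 200.0 interactions/sec
    }
}
```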

It's also essential to identify frequently accessed components up front and to allocate adequate resources to them. Typically, users in Web deployments access JSP pages (or servlets), while users in application deployments access EJBs.

Additional processes running on the same machine can significantly affect the capacity (and performance) of the WebLogic deployment. The database and Web servers are two popular choices for hosting on a separate machine.

The random and unpredictable nature of user service requests often exacerbates the performance problems of Internet applications. When estimating the peak load, it's therefore advisable to plan for demand spikes and focus on the worst-case scenario (for instance, the surge in visitors to a site advertised during the Olympics telecast, or the spike in traffic many online retailers experience during the holiday shopping season). These usage spikes can result in significant server overload unless properly anticipated.

WebLogic Server Configuration (Single Server or Clustered)
There are many advantages to using WebLogic Server clusters: they scale to higher loads and they offer failover capabilities. The key issues to consider when using a cluster are:

1.   Clusters rely on LAN for communication between the nodes. Large clusters performing in-memory replication of session data for EJB or servlet sessions require more bandwidth than smaller clusters. Consider the size of session data, the size of the cluster, and the processing power of the individual machines in computing the LAN bandwidth and network connectivity requirements. The combination of server capacity and network bandwidth determines the total capacity of a given system.

2.   If you're using a Web server to forward requests to a WebLogic Server cluster, sometimes the Web server can be the bottleneck. This can happen when using the supplied HttpClusterServlet and a proxy server, or one of the supported plug-ins. If the response time doesn't improve after adding servers to the cluster, and the Web server machine shows a CPU usage of over 95%, consider clustering the Web server or running it on more powerful hardware.
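
The replication-bandwidth arithmetic from point 1 above can be sketched roughly, assuming each session update is replicated to a single secondary server; the update rate and delta size are hypothetical and protocol overhead is ignored:

```java
// Rough in-memory-replication traffic estimate; inputs are hypothetical.
public class ReplicationBandwidth {

    // Aggregate replication traffic ~= updates/sec * changed-state size,
    // converted from kilobytes/sec to megabits/sec.
    static double replicationMbitPerSec(int updatesPerSec, int deltaKb) {
        return updatesPerSec * deltaKb * 8.0 / 1024.0; // KB/s -> Mbit/s
    }

    public static void main(String[] args) {
        // 500 session updates/sec, 8 KB of changed state each -> 31.25 Mbit/s
        System.out.println(replicationMbitPerSec(500, 8));
    }
}
```

An estimate like this, compared against the LAN's usable bandwidth, indicates whether the network or the servers will saturate first.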

WebLogic Server Configuration
It's possible to have many WebLogic Server instances clustered together on a single multiprocessor machine. An alternative would be to have a cluster of fewer WebLogic Server instances distributed across many single (or dual) processor machines. There can be advantages in using the second approach:

1.   Increased failover protection, since it's unlikely that all the individual machines would fail at the same time.

2.   JVM scalability has some practical limitations, and garbage collection is often a bottleneck on systems with many processors. Configuring a cluster of many smaller machines will ensure good JVM scalability. Additionally, the impact of garbage collection on the system's response time can be somewhat mitigated, because garbage collection can be staggered across the different JVMs.

Figure 1 shows the results from an internal benchmark indicating that having multiple smaller boxes offers better performance. We hasten to add that it's quite likely some applications won't conform to this behavior.

Application Design Issues
At the end of the day, WebLogic Server is basically a platform for user applications. Badly designed or unoptimized user applications can drastically slow down the performance of a given configuration. Therefore it's also essential to optimize the application by eliminating or reducing the hot spots and considering the working set/concurrency issues. An end-to-end perspective of the application characteristics is essential in order to diagnose and fix any performance problems. Application optimization and performance tuning WebLogic Server for your specific deployments always go hand-in-hand.

Capacity Planning Guidelines
Once you've developed your application, the next step is to determine the hardware requirements (on the chosen hardware platform). It's essential to choose a transaction scenario that represents the most frequent user flow transactions. In the case of an online bookseller, for instance, a typical transaction mix could be as follows. A user enters the site via the home page, uses the search options to scan through existing inventory, browses through certain titles, selects a title and adds it to the shopping cart, proceeds to checkout, enters credit card information, confirms the order, and finally exits.

There are several tools available to simulate clients (LoadRunner, WebLOAD, etc.). Use the transaction mix you designed in the previous step to generate the client load. Gradually increase the client load by adding more concurrent users. This is an iterative process, and the goal is to achieve as high a CPU utilization as possible. If the CPU utilization doesn't increase (and hasn't yet peaked out) with the addition of more users, stop and look for bottlenecks (in the database or the application). There are several commercially available profilers (IntroScope, OptimizeIt, and JProbe) that can be used to identify these hot spots.
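
The ramp-up loop described above amounts to a saturation check over measured throughput samples. In a real exercise the samples would come from a load tool such as LoadRunner or WebLOAD; the figures below are invented for illustration:

```java
// Saturation check over throughput measured at increasing client counts.
public class LoadRamp {

    // Return the index of the first step where adding clients improved
    // throughput by less than minGain (a fraction, e.g. 0.05 = 5%). That step
    // marks saturation or a bottleneck; -1 means the system is still scaling.
    static int saturationStep(double[] throughput, double minGain) {
        for (int i = 1; i < throughput.length; i++) {
            if (throughput[i] < throughput[i - 1] * (1.0 + minGain)) {
                return i;
            }
        }
        return -1; // keep adding load
    }

    public static void main(String[] args) {
        // Hypothetical requests/sec measured at 50, 100, 150, 200, 250 users:
        double[] tput = {95.0, 180.0, 240.0, 245.0, 246.0};
        System.out.println(saturationStep(tput, 0.05)); // 3: flat from step 3 on
    }
}
```

Once the flat step is found, the investigation shifts to profiling (hot spots in the application) or to the database, as described above.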

In a finely tuned system, the CPU utilization (at steady state) is usually in the 90-95% range. At that point throughput no longer increases with additional load, but response times climb as more clients are added. The throughput at this point determines the capacity of the hardware.

Figure 2 shows that increasing the number of clients beyond a certain point doesn't increase the throughput but has a significant effect on the response time. From the graph, decide on an acceptable response time. This, then, indicates the capacity of the hardware needed.
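
Little's Law (concurrency = throughput x response time) explains the shape of that curve: once throughput is pinned at the hardware's capacity, response time grows linearly with the number of clients. A minimal sketch, using a hypothetical saturated throughput:

```java
// Little's Law past saturation: R = N / X when throughput X is fixed.
public class LittlesLaw {

    // Average response time once the server is saturated, in seconds.
    static double responseTimeSec(int concurrentClients, double throughputPerSec) {
        return concurrentClients / throughputPerSec;
    }

    public static void main(String[] args) {
        double capacity = 200.0; // saturated throughput in req/s (hypothetical)
        System.out.println(responseTimeSec(400, capacity)); // 2.0 s
        System.out.println(responseTimeSec(800, capacity)); // 4.0 s: doubled clients,
                                                            // doubled response time
    }
}
```

Reading an acceptable response time off the graph and dividing the client count by it yields the throughput, and hence the hardware capacity, that must be provisioned.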

The first step in capacity planning is to set measurable and quantifiable goals for the deployment. As is evident from the iterative process, capacity planning isn't an exact science, hence the need to conservatively estimate your capacity requirements. This is further compounded by the fact that the system load can often be quite unpredictable and can vary randomly (hence the need to focus on the worst-case scenario). Finally, there can be no substitute for load testing. During load testing it's essential to use a transaction scenario that closely resembles the real-world conditions the application deployment will be subjected to.

We would like to acknowledge Joginder Minocha for his work on the WebLogic Capacity Planning benchmarks.
