|By Michael Kopp||
|May 25, 2011 11:15 AM EDT||
Most articles about Garbage Collection ignore the fact that the Sun Hotspot JVM is not the only game in town. In fact whenever you have to work with either IBM WebSphere or Oracle WebLogic you will run on a different runtime. While the concept of Garbage Collection is the same, the implementation is not and neither are the default settings or how to tune it. This often leads to unexpected problems when running the first load tests or in the worst case when going live. So let’s look at the different JVMs, what makes them unique and how to ensure that Garbage Collection is running smooth.
The Garbage Collection ergonomics of the Sun Hotspot JVM
Everybody believes to know how Garbage Collection works in the Sun Hotspot JVM, but lets take a closer look for the purpose of reference.
The Generational Heap
The Hotspot JVM is always using a Generational Heap. Objects are first allocated in the young generation, specifically in the Eden area. Whenever the Eden space is full a young generation garbage collection is triggered. This will copy the few remaining live objects into the empty survivor space. In addition objects that have been copied to Survivor in the previous garbage collection will be checked and the live ones will be copied as well. The result is that objects only exist in one survivor, while eden and the other survivor is empty. This form of Garbage Collection is called copy collection. It is fast as long as nearly all objects have died. In addition allocation is always fast because no fragmentation occurs. Objects that survive a couple of garbage collections are considered old and are promoted into the Tenured/Old space.
Tenured Generation GCs
The Mark and Sweep algorithms used in the Tenured space are different because they do not copy objects. As we have seen in one of my previous posts garbage collection takes longer the more objects are alive. Consequently GC runs in tenured are nearly always expensive which is why we want to avoid them. In order to avoid GCs we need to ensure that objects are only copied from Young to Old when they are permanent and in addition ensure that the tenured does not run full. Therefore generation sizing is the single most important optimization for the GC in the Hotspot JVM. If we cannot prevent objects from being copied to Tenured space once in a while we can use the Concurrent Mark and Sweep algorithm which collects objects concurrent to the application.
While that shortens the suspensions it does not prevent them and they will occur more frequently. The Tenured space also suffers from another problem, fragmentation. Fragmentation leads to slower allocation, longer sweep phases and eventually out of memory errors when the holes get too small for big objects.
This is remedied by a compacting phase. The serial and parallel compacting GC perform compaction for every GC run in the Tenured space. Important to note is that, while the parallel GC performs compacting every time, it does not compact the whole Tenured heap but just the area that is worth the effort. Worth the effort means when the heap has reached a certain level of fragmentation. In contrast, the Concurrent Mark and Sweep does not compact at all. Once objects cannot be allocated anymore a serial major GC is triggered. When choosing the concurrent mark and sweep strategy we have to be aware of that side affect.
The second big tuning option is therefore the choice of the right GC strategy. It has big implications for the impact the GC has on the application performance. The last and least known tuning option is around fragmentation and compacting. The Hotspot JVM does not provide a lot of options to tune it, so the only way is to tune the code directly and reduce the number of allocations.
There is another space in the Hotspot JVM that we all came to love over the years, the Permanent Generation. It holds classes and string constants that are part of those classes. While Garbage Collection is executed in the permanent generation, it only happens during a major GC. You might want to read up what a Major GC actually is, as it does not mean a Old Generation GC. Because a major GC does not happen often and mostly nothing happens in the permanent generation, many people think that the Hotspot JVM does not do garbage collection there at all.
Over the years all of us run into many different forms of the OutOfMemory situations in PermGen and you will be happy to hear that Oracle intends to do away with it in the future versions of Hotspot.
Now that we had a look at Hotspot, let us look at the difference in the Oracle JRockit. JRockit is used by Oracle WebLogic Server and Oracle has announced that it will merge it with the Hotspot JVM in the future.
The biggest difference is the heap strategy itself. While Oracle JRockit does have a generational heap it also supports a so called continuous heap. In addition the generational heap looks different as well.
The Young space is called Nursery and it only has two areas. When objects are first allocated they are placed in a so called Keep Area. Objects in the Keep Area are not considered during garbage collection while all other objects still alive are immediately promoted to tenured. That has major implications for the sizing of the Nursery. While you can configure how often objects are copied between the two survivors in the Hotspot JVM, JRockit promotes objects in the second Young Generation GC.
In addition to this difference JRockit also supports a completely continuous Heap that does not distinguish between young and old objects. In certain situations, like throughput orientated batch jobs, this results in better overall performance. The problem is that this is the default setting on a server JVM and often not the right choice. A typical Web Application is not throughput but response time orientated and you will need to explicitly choose the low pause time garbage collection mode or a generational garbage collection strategy.
Mostly Concurrent Mark and Sweep
If you choose Concurrent Mark and Sweep strategy you should be aware about a couple of differences here as well. The mostly concurrent mark phase is divided into four parts:
- Initial marking, where the root set of live objects is identified. This is done while the Java threads are paused.
- Concurrent marking, where the references from the root set are followed in order to find and mark the rest of the live objects in the heap. This is done while the Java threads are running.
- Precleaning, where changes in the heap during the concurrent mark phase are identified and any additional live objects are found and marked. This is done while the Java threads are running.
- Final marking, where changes during the precleaning phase are identified and any additional live objects are found and marked. This is done while the Java threads are paused.
The sweeping is also done concurrent to your application, but in contrast to Hotspot in two separate steps. It is first sweeping the first half of the heap. During this phase threads are allowed to allocate objects in the second half. After a short synchronization pause the second half is sweeped. This is followed by another short final synchronization pause. The JRockit algorithm therefore stops more often than the Sun Hotspot JVM, but the remark phase should be shorter. Unlike the Hotspot JVM you can tune the CMS by defining the percentage of free memory that triggers a GC run.
The JRockit does compacting for all Tenured Generation GCs, including the Concurrent Mark and Sweep. It does so in an incremental mode for portions of the heap. You can tune this with various options like percentage of heap that should be compacted each time or how many objects are compacted at max. In addition you can turn off compacting completely or force a full one for every GC. This means that compacting is a lot more tunable in the JRockit than in the Hotspot JVM and the optimum depends very much on the application itself and needs to be carefully tested.
Thread Local Allocation
Hotspot does use thread local allocation, but it is hard to find anything in the documentation about it or how to tune it. The JRockit uses this on default. This allows threads to allocate objects without any need for synchronization, which is beneficial for allocation speed. The size of a TLA can be configured and a large TLA can be beneficial for applications where multiple threads allocate a lot of objects. On the other hand a too large TLA can lead to more fragmentation. As a TLA is used exclusively by one thread, the size is naturally limited by the number of threads. Thus both decreasing and increasing the default can be good or bad depending on your applications architecture.
Large and small objects
The JRockit differentiates between large and small objects during allocation. The limit for when an object is considered large depends on the JVM version, the heap size, the garbage collection strategy and the platform used. It is usually somewhere between 2 and 128 KB. Large objects are allocated outside thread local area in in case of a generational heap directly in the old generation. This makes a lot of sense when you start thinking about it. The young generation uses a copy ccollection. At some point copying an object becomes more expensive than traversing it in ever garbage collection.
No permanent Generation
And finally it needs to be noted that the JRockit does not have a permanent generation. All classes and string constants are allocated within the normal heap area. While that makes life easier on the configuration front it means that classes can be garbage collected immediately if not used anymore. In one of my future posts I will illustrate how this can lead to some hard to find performance problems.
The IBM JVM
The IBM JVM shares a lot of characteristics with JRockit: The default heap is a continuous one. Especially in WebSphere installation this is often the initial cause for bad performance. It differentiates between large and small objects with the same implications and uses thread local allocation on default. It also does not have a permanent generation, but while the IBM JVM also supports a generational Heap model it looks more like Sun’s rather than JRockit.
Allocate and Survivor act like Eden and Survivor of the Sun JVM. New objects are allocated in one area and copied to the other on garbage collection. In contrast to JRockit the two areas are switched upon gc. This means that an object is copied multiple times between the two areas before it gets promoted to Tenured. Like JRockit the IBM JVM has more options to tune the compaction phase. You can turn it off or force it to happen for every GC. In contrast to JRockit the default triggers it due to a series of triggers but will then lead to a full compaction. This can be changed to an incremental one via a configuration flag.
We see that while the three JVMs are essentially trying to achieve the same goal, they do so via different strategies. This leads to different behaviour that needs tuning. With Java 7 Oracle will finally declare the G1 (Garbage First) production ready and the G1 is a different beast altogether, so stay tuned.
If you’re interested in hearing me discuss more about WebSphere in a production environment, then check out our upcoming webinar with The Bon-Ton Stores. I’ll be joined by Dan Gerard, VP of Technical & Web Services at Bon-Ton, to discuss the challenges they’ve overcome in operating a complex Websphere production eCommerce site to deliver great web application performance and user experience. Reserve your seat today to hear me go into more detail about Websphere and production eCommerce environments.
- The impact of Garbage Collection on Java performance // In my last post I explained what a major...
- Major GCs – Separating Myth from Reality In a recent post we have shown how the Java...
- JDK6 Update 23 changes CMS Collection counters Stefan Frandl, Test Automation Team Lead at dynaTrace, recently tested...
- The Top Java Memory Problems – Part 1 // Memory and Garbage Collection problems are still the most...
- Troubleshooting response time problems – why you cannot trust your system metrics // Production Monitoring is about ensuring the stability and health...
As enterprises work to take advantage of Big Data technologies, they frequently become distracted by product-level decisions. In most new Big Data builds this approach is completely counter-productive: it presupposes tools that may not be a fit for development teams, forces IT to take on the burden of evaluating and maintaining unfamiliar technology, and represents a major up-front expense. In his session at @BigDataExpo at @ThingsExpo, Andrew Warfield, CTO and Co-Founder of Coho Data, will dis...
Feb. 6, 2016 07:15 PM EST
SYS-CON Events announced today that Fusion, a leading provider of cloud services, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Fusion, a leading provider of integrated cloud solutions to small, medium and large businesses, is the industry's single source for the cloud. Fusion's advanced, proprietary cloud service platform enables the integration of leading edge solutions in the cloud, including clou...
Feb. 6, 2016 03:30 PM EST Reads: 704
With the Apple Watch making its way onto wrists all over the world, it’s only a matter of time before it becomes a staple in the workplace. In fact, Forrester reported that 68 percent of technology and business decision-makers characterize wearables as a top priority for 2015. Recognizing their business value early on, FinancialForce.com was the first to bring ERP to wearables, helping streamline communication across front and back office functions. In his session at @ThingsExpo, Kevin Roberts...
Feb. 6, 2016 03:15 PM EST Reads: 324
SYS-CON Events announced today that Commvault, a global leader in enterprise data protection and information management, has been named “Bronze Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Commvault is a leading provider of data protection and information management...
Feb. 6, 2016 02:30 PM EST Reads: 351
SYS-CON Events announced today that Alert Logic, Inc., the leading provider of Security-as-a-Service solutions for the cloud, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Alert Logic, Inc., provides Security-as-a-Service for on-premises, cloud, and hybrid infrastructures, delivering deep security insight and continuous protection for customers at a lower cost than traditional security solutions. Ful...
Feb. 6, 2016 01:30 PM EST Reads: 341
SYS-CON Events announced today that VAI, a leading ERP software provider, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. VAI (Vormittag Associates, Inc.) is a leading independent mid-market ERP software developer renowned for its flexible solutions and ability to automate critical business functions for the distribution, manufacturing, specialty retail and service sectors. An IBM Premier Business Part...
Feb. 6, 2016 01:00 PM EST Reads: 536
The cloud promises new levels of agility and cost-savings for Big Data, data warehousing and analytics. But it’s challenging to understand all the options – from IaaS and PaaS to newer services like HaaS (Hadoop as a Service) and BDaaS (Big Data as a Service). In her session at @BigDataExpo at @ThingsExpo, Hannah Smalltree, a director at Cazena, will provide an educational overview of emerging “as-a-service” options for Big Data in the cloud. This is critical background for IT and data profes...
Feb. 6, 2016 11:00 AM EST Reads: 117
With an estimated 50 billion devices connected to the Internet by 2020, several industries will begin to expand their capabilities for retaining end point data at the edge to better utilize the range of data types and sheer volume of M2M data generated by the Internet of Things. In his session at @ThingsExpo, Don DeLoach, CEO and President of Infobright, will discuss the infrastructures businesses will need to implement to handle this explosion of data by providing specific use cases for filte...
Feb. 6, 2016 11:00 AM EST
Fortunately, meaningful and tangible business cases for IoT are plentiful in a broad array of industries and vertical markets. These range from simple warranty cost reduction for capital intensive assets, to minimizing downtime for vital business tools, to creating feedback loops improving product design, to improving and enhancing enterprise customer experiences. All of these business cases, which will be briefly explored in this session, hinge on cost effectively extracting relevant data from ...
Feb. 6, 2016 09:00 AM EST
SYS-CON Events announced today that Interoute, owner-operator of one of Europe's largest networks and a global cloud services platform, has been named “Bronze Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2015 at the Javits Center in New York, New York. Interoute is the owner-operator of one of Europe's largest networks and a global cloud services platform which encompasses 12 data centers, 14 virtual data centers and 31 colocation centers, with connections to 195 ad...
Feb. 6, 2016 05:00 AM EST Reads: 328
Most people haven’t heard the word, “gamification,” even though they probably, and perhaps unwittingly, participate in it every day. Gamification is “the process of adding games or game-like elements to something (as a task) so as to encourage participation.” Further, gamification is about bringing game mechanics – rules, constructs, processes, and methods – into the real world in an effort to engage people. In his session at @ThingsExpo, Robert Endo, owner and engagement manager of Intrepid D...
Feb. 5, 2016 09:00 PM EST Reads: 772
Eighty percent of a data scientist’s time is spent gathering and cleaning up data, and 80% of all data is unstructured and almost never analyzed. Cognitive computing, in combination with Big Data, is changing the equation by creating data reservoirs and using natural language processing to enable analysis of unstructured data sources. This is impacting every aspect of the analytics profession from how data is mined (and by whom) to how it is delivered. This is not some futuristic vision: it's ha...
Feb. 2, 2016 02:00 PM EST Reads: 402
WebRTC has had a real tough three or four years, and so have those working with it. Only a few short years ago, the development world were excited about WebRTC and proclaiming how awesome it was. You might have played with the technology a couple of years ago, only to find the extra infrastructure requirements were painful to implement and poorly documented. This probably left a bitter taste in your mouth, especially when things went wrong.
Feb. 2, 2016 04:30 AM EST Reads: 838
Learn how IoT, cloud, social networks and last but not least, humans, can be integrated into a seamless integration of cooperative organisms both cybernetic and biological. This has been enabled by recent advances in IoT device capabilities, messaging frameworks, presence and collaboration services, where devices can share information and make independent and human assisted decisions based upon social status from other entities. In his session at @ThingsExpo, Michael Heydt, founder of Seamless...
Feb. 1, 2016 05:00 AM EST Reads: 922
The IoT's basic concept of collecting data from as many sources possible to drive better decision making, create process innovation and realize additional revenue has been in use at large enterprises with deep pockets for decades. So what has changed? In his session at @ThingsExpo, Prasanna Sivaramakrishnan, Solutions Architect at Red Hat, discussed the impact commodity hardware, ubiquitous connectivity, and innovations in open source software are having on the connected universe of people, thi...
Jan. 31, 2016 09:00 PM EST Reads: 717
WebRTC: together these advances have created a perfect storm of technologies that are disrupting and transforming classic communications models and ecosystems. In his session at WebRTC Summit, Cary Bran, VP of Innovation and New Ventures at Plantronics and PLT Labs, provided an overview of this technological shift, including associated business and consumer communications impacts, and opportunities it may enable, complement or entirely transform.
Jan. 31, 2016 07:15 PM EST Reads: 1,135
There are so many tools and techniques for data analytics that even for a data scientist the choices, possible systems, and even the types of data can be daunting. In his session at @ThingsExpo, Chris Harrold, Global CTO for Big Data Solutions for EMC Corporation, showed how to perform a simple, but meaningful analysis of social sentiment data using freely available tools that take only minutes to download and install. Participants received the download information, scripts, and complete end-t...
Jan. 31, 2016 10:00 AM EST Reads: 1,197
For manufacturers, the Internet of Things (IoT) represents a jumping-off point for innovation, jobs, and revenue creation. But to adequately seize the opportunity, manufacturers must design devices that are interconnected, can continually sense their environment and process huge amounts of data. As a first step, manufacturers must embrace a new product development ecosystem in order to support these products.
Jan. 31, 2016 10:00 AM EST Reads: 798
Manufacturing connected IoT versions of traditional products requires more than multiple deep technology skills. It also requires a shift in mindset, to realize that connected, sensor-enabled “things” act more like services than what we usually think of as products. In his session at @ThingsExpo, David Friedman, CEO and co-founder of Ayla Networks, discussed how when sensors start generating detailed real-world data about products and how they’re being used, smart manufacturers can use the dat...
Jan. 30, 2016 07:45 PM EST Reads: 772
When it comes to IoT in the enterprise, namely the commercial building and hospitality markets, a benefit not getting the attention it deserves is energy efficiency, and IoT’s direct impact on a cleaner, greener environment when installed in smart buildings. Until now clean technology was offered piecemeal and led with point solutions that require significant systems integration to orchestrate and deploy. There didn't exist a 'top down' approach that can manage and monitor the way a Smart Buildi...
Jan. 30, 2016 03:45 PM EST Reads: 1,258