High-Performance CMP Features

This month I've decided to explore some of the more advanced performance enhancements available to you if you are using EJB 2.0 on WebLogic. Our container-managed persistence (CMP) engine exposes several strategies that you can configure to get the most efficient - meaning least - use of your database. Field-groups allow you to specify which fields are loaded from the database together. Relationship caching tells the CMP engine to load related beans in the same query that loads the parent bean. Cache-between-transactions allows the container to keep the contents of an entity bean cached from one transaction to the next. Combined with Optimistic concurrency, which verifies at update time that the underlying row has not changed since it was read, this still gives you very good consistency guarantees. Finally, there is ReadOnly concurrency, which gives you great performance along with the ability to flush the cache programmatically or through a timeout. Using these optimization strategies, you will easily surpass the performance of naively written bean-managed persistence (BMP) entities and even exceed the performance of perfectly written J2EE-compliant BMP entities, without the added maintenance and complexity that writing your own persistence layer can entail.

Field Groups
When you access a bean through a finder, the CMP engine will by default load all the fields of the entity from the datastore. In many circumstances there are fields that you know you're not going to use in a certain code path; these fields need not be queried or read from the database. Using field-groups, you can specify which fields are loaded up front, and the other fields will be loaded on demand. Field-groups do not stop you from eventually accessing a field and loading its data; the load is just put off until the application accesses the field, and it never happens at all if the field is never accessed. As a simple example, imagine a search application. The entities that you are searching for might contain the following fields: URL, Summary, Date, Keywords, and Cached Content. When you do your initial query of the database and return results for the user to look at, you probably don't want to load the cached content. It is a large field, and for every result the user never opens, the work of reading it from the database is wasted.

In this example you would put the URL, Summary, and Date in a field-group and assign that group to the findByKeywords finder in the entity bean. When the finder is called, you get a collection of results with their designated fields prepopulated, and if the user happens to ask for the cached content, the CMP engine automatically goes back out to the database and populates it. Some experimentation and benchmarking may be required to get the field-groups exactly right. If you misidentify which fields are really needed, you can actually reduce performance: leaving out a field that is often used after the query costs an extra trip to the database each time it is accessed. To activate this optimization, you simply declare which groups each CMP field belongs to using the group-names attribute on the cmp-field entry and then associate a group with the finder using the group-name attribute.
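
The load-on-demand behavior described above can be illustrated with a small stand-alone model. This is only a sketch in plain Java, not WebLogic code: the class name, the in-memory ROW map standing in for the database, and the SEARCH_RESULT_GROUP constant are all assumptions made for the example. It simply counts how many times the "database" is touched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of field-group loading; not WebLogic code. */
class FieldGroupSketch {
    /** Stand-in for one database row, keyed by column name (illustrative data). */
    static final Map<String, String> ROW = Map.of(
        "url", "http://example.com/",
        "summary", "An example page",
        "date", "2002-08-01",
        "keywords", "example, demo",
        "cachedContent", "<html>...large body...</html>");

    /** Hypothetical group assigned to the findByKeywords finder. */
    static final Set<String> SEARCH_RESULT_GROUP = Set.of("url", "summary", "date");

    private final Map<String, String> loaded = new HashMap<>();
    private int databaseReads = 0;

    /** Finder-time load: one read that fetches only the group's columns. */
    FieldGroupSketch(Set<String> group) {
        databaseReads++;
        for (String column : group) {
            loaded.put(column, ROW.get(column));
        }
    }

    /** Accessor: returns a loaded field, or goes back to the "database" on demand. */
    String get(String column) {
        if (!loaded.containsKey(column)) {
            databaseReads++;
            loaded.put(column, ROW.get(column));
        }
        return loaded.get(column);
    }

    int databaseReads() {
        return databaseReads;
    }

    public static void main(String[] args) {
        FieldGroupSketch result = new FieldGroupSketch(SEARCH_RESULT_GROUP);
        result.get("summary");        // in the group, so no extra read
        System.out.println("reads after summary: " + result.databaseReads());
        result.get("cachedContent");  // not in the group, so loaded on demand
        System.out.println("reads after cachedContent: " + result.databaseReads());
    }
}
```

The counter makes the trade-off visible: a field left out of the group costs nothing until it is touched, and one extra read when it is.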

Relationship Caching
Relationship caching is very important when the related data is usually used whenever the parent bean is accessed. It reduces the number of SELECTs against the database by including the related bean's fields in a SQL join. For one-to-one relationships this can offer a huge increase in performance, because the join reads no redundant data while eliminating a SELECT statement. In the one-to-many case the benefit will often depend on the fields present in your parent bean, because under the join you will read those fields once for each related bean. I suggest that in this case you analyze the typical number of related beans and determine whether it makes sense to take the extra per-row performance hit in order to reduce the number of SELECT statements and the number of round-trips to the database.

As an example, suppose you have an Employee bean that has a one-to-many relationship with Address beans, and when you access the Employee bean you often read the address data. You would probably enable this option, because most Employee beans would have only one or two related Address beans. In the case of the same Employee bean being related to PayrollStatement beans, you probably wouldn't want to enable relationship caching, because the number of statements could be quite high and you would not be referencing them all every time you viewed the Employee bean.
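
To make the trade-off concrete, here is a rough cost model in plain Java. It is a back-of-the-envelope sketch, not a measurement: the field counts and bean counts in main are invented for illustration, every parent is assumed to have at least one related bean, and the model assumes that without relationship caching each parent's related beans are fetched with their own SELECT.

```java
/** Back-of-the-envelope cost model for relationship caching; all numbers are illustrative. */
class RelationshipCachingCost {
    /** SELECTs issued when each parent's related beans are fetched separately. */
    static int selectsWithoutCaching(int parents) {
        return 1 + parents; // one finder query, then one query per parent
    }

    /** With relationship caching, parents and related beans come back from one join. */
    static int selectsWithCaching(int parents) {
        return 1;
    }

    /** Column values read without the join: parent fields once, child fields once. */
    static long valuesWithoutCaching(int parents, int childrenPerParent,
                                     int parentFields, int childFields) {
        return (long) parents * parentFields
             + (long) parents * childrenPerParent * childFields;
    }

    /** Column values read by the join: parent fields repeat on every child row. */
    static long valuesWithCaching(int parents, int childrenPerParent,
                                  int parentFields, int childFields) {
        return (long) parents * childrenPerParent * (parentFields + childFields);
    }

    static void report(String label, int parents, int children,
                       int parentFields, int childFields) {
        System.out.printf("%s: %d SELECTs / %d values without caching, "
                + "%d SELECT / %d values with caching%n",
            label,
            selectsWithoutCaching(parents),
            valuesWithoutCaching(parents, children, parentFields, childFields),
            selectsWithCaching(parents),
            valuesWithCaching(parents, children, parentFields, childFields));
    }

    public static void main(String[] args) {
        // 100 employees with 10 fields each; field and row counts are assumptions.
        report("Employee -> Address (2 each)", 100, 2, 10, 6);
        report("Employee -> PayrollStatement (50 each)", 100, 50, 10, 8);
    }
}
```

With a couple of addresses per employee, the join trades a modest amount of repeated parent data for a large drop in round-trips. With dozens of payroll statements per employee, the repeated parent fields dominate, which matches the recommendation above.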

Optimistic Concurrency
Perhaps the biggest performance increase you can get comes from enabling cache-between-transactions and choosing Optimistic concurrency in applications with little contention for the same data. Caching between transactions allows the CMP container to avoid going back to the database each time a new transaction uses the bean. Additionally, the application server will send out flushes when cached beans are updated (even in a cluster) so that cached copies will not become overly stale. With Optimistic concurrency enabled, every update includes a WHERE clause that checks that the row being updated hasn't changed since the data was read from the database.

There are a number of options you can use to make this verification: checking that the columns that were read, the columns that were written, a version number, or a timestamp are unchanged. Each has its own advantages, but I would suggest that Version is probably the most universally applicable, and a version column may already exist in your database. Verification based on the column values themselves is the most expensive but the easiest to implement, since it needs no extra column. To enable these optimizations you need to set the two flags on your bean (cache-between-transactions and Optimistic concurrency) and then change your update code to handle the case in which an OptimisticConcurrencyException is thrown, either from a method that does an update or from the explicit commit of your bean-managed transaction.
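
The version-number check and the caller-side handling can be sketched without a container or a database. The following plain-Java model is an illustration only: the in-memory table, the Row class, and StaleDataException (a local stand-in, deliberately not named after the container's OptimisticConcurrencyException) are all assumptions made for the example. The update method mimics the effect of an UPDATE whose WHERE clause includes the version that was read.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory sketch of version-column optimistic concurrency; not WebLogic code. */
class OptimisticUpdateSketch {
    /** Local stand-in for the container's exception; the name is hypothetical. */
    static class StaleDataException extends RuntimeException {
        StaleDataException(String message) {
            super(message);
        }
    }

    /** One "row": a value plus the version column used for verification. */
    static final class Row {
        final String value;
        final int version;

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Integer, Row> table = new HashMap<>();

    void insert(int id, String value) {
        table.put(id, new Row(value, 0));
    }

    Row read(int id) {
        return table.get(id);
    }

    /**
     * Mimics: UPDATE t SET value = ?, version = version + 1
     *         WHERE id = ? AND version = ?
     * No matching row means another writer changed it first.
     */
    void update(int id, String newValue, int versionRead) {
        Row current = table.get(id);
        if (current == null || current.version != versionRead) {
            throw new StaleDataException("row " + id + " changed since it was read");
        }
        table.put(id, new Row(newValue, versionRead + 1));
    }

    /** Caller-side handling: re-read and retry a bounded number of times. */
    boolean updateWithRetry(int id, String newValue, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Row snapshot = read(id);
            if (snapshot == null) {
                return false; // nothing to update
            }
            try {
                update(id, newValue, snapshot.version);
                return true;
            } catch (StaleDataException e) {
                // Lost the race; loop to re-read the current version.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        OptimisticUpdateSketch db = new OptimisticUpdateSketch();
        db.insert(1, "original");
        Row first = db.read(1);
        Row second = db.read(1);              // both readers see version 0
        db.update(1, "from first", first.version);
        try {
            db.update(1, "from second", second.version);
        } catch (StaleDataException e) {
            System.out.println("conflict detected: " + e.getMessage());
        }
        System.out.println("retry succeeded: " + db.updateWithRetry(1, "from second", 3));
    }
}
```

Whether to retry, merge, or report the conflict to the user is an application decision; the bounded retry here is just one reasonable choice for low-contention data.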

ReadOnly Concurrency
Finally, there is ReadOnly concurrency, which implies caching between transactions. In this case, the data is loaded from the database only under the following conditions: on the first read of the bean, when the timeout on the bean has expired, or when a programmatic flush of the bean is received. If you want to use ReadOnly beans and still occasionally change them, but don't want the overhead of Optimistic concurrency, you should have two beans that are backed by the same data: one ReadOnly bean and one normal EJB that can be updated. To programmatically flush your ReadOnly beans, simply cast the EJB home to weblogic.ejb.CachingHome and use the invalidate methods on that interface. For more extensive information on how to use these optimization strategies, please refer to http://edocs.bea.com/wls/docs70/ejb/index.html
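
The three reload conditions for ReadOnly beans (first read, timeout, programmatic flush) can be modeled with a small cache in plain Java. This is a sketch of the idea only, not the container's implementation and not the weblogic.ejb.CachingHome interface: the class, its method names, and the loader function standing in for the database read are assumptions, and the current time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy read-only cache with timeout and programmatic invalidation; not WebLogic code. */
class ReadOnlyCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> cache = new HashMap<>();
    private final Function<K, V> loader; // stand-in for the database read
    private final long timeoutMillis;
    private int loads = 0;

    ReadOnlyCacheSketch(Function<K, V> loader, long timeoutMillis) {
        this.loader = loader;
        this.timeoutMillis = timeoutMillis;
    }

    /** Calls the loader only on first access, after the timeout, or after an invalidate. */
    V get(K key, long nowMillis) {
        Entry<V> entry = cache.get(key);
        if (entry == null || nowMillis - entry.loadedAtMillis >= timeoutMillis) {
            entry = new Entry<>(loader.apply(key), nowMillis);
            cache.put(key, entry);
            loads++;
        }
        return entry.value;
    }

    /** Programmatic flush of a single key. */
    void invalidate(K key) {
        cache.remove(key);
    }

    /** Programmatic flush of everything. */
    void invalidateAll() {
        cache.clear();
    }

    int loads() {
        return loads;
    }

    public static void main(String[] args) {
        ReadOnlyCacheSketch<Integer, String> beans =
            new ReadOnlyCacheSketch<>(id -> "row " + id, 1000L);
        beans.get(7, 0L);      // first read: loads
        beans.get(7, 500L);    // within the timeout: served from cache
        beans.get(7, 1000L);   // timeout reached: reloads
        beans.invalidate(7);   // e.g. after the updatable bean changes the row
        beans.get(7, 1001L);   // flushed: reloads
        System.out.println("loads: " + beans.loads());
    }
}
```

In the two-bean pattern described above, the code that updates the data through the normal EJB would be the natural place to trigger the flush of the ReadOnly bean.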

More Stories By Sam Pullara

Sam Pullara has been a software engineer at WebLogic since 1996 and has contributed to the architecture, design, and implementation of many aspects of the application server.

