Java Feature — Concurrent Queries

A pattern for improving database query performance

In the ConcurrentQueryThreadImpl class, the runQuery() method first checks whether any previously submitted query threads have finished and need to be reaped. This matters because the list of running threads is capped so that too many queries can't run at once and overload the database server; reaping finished threads first makes room for new query threads. If there's room on the running threads list and queued queries are waiting to be submitted (e.g., queries that previously had to wait because the running thread list was full), those are submitted first, and the query passed to runQuery() goes onto the end of the queue. Otherwise, if there's room on the running threads list and no queries are queued, the caller's query is submitted immediately.
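The ordering described above can be illustrated with a small sketch. This is not the article's ConcurrentQueryThreadImpl: the class name BoundedQueryRunner and its helper methods are made up for illustration, the "queries" are plain Runnables so that no database is needed, and the sketch assumes a single caller thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch, not the article's ConcurrentQueryThreadImpl.
// "Queries" are plain Runnables so the example needs no database.
// Intended to be driven from a single caller thread (no synchronization).
final class BoundedQueryRunner {
    private final int maxRunning;                              // cap on concurrent queries
    private final List<Thread> running = new ArrayList<>();    // started, not yet reaped
    private final Queue<Runnable> queued = new ArrayDeque<>(); // waiting for a free slot

    BoundedQueryRunner(int maxRunning) {
        if (maxRunning < 1) {
            throw new IllegalArgumentException("maxRunning must be at least 1");
        }
        this.maxRunning = maxRunning;
    }

    // Reap finished threads, then start work in arrival order while slots remain.
    // Appending the caller's query before draining the queue means that older
    // queued queries are always started ahead of it.
    void runQuery(Runnable query) {
        reapFinished();
        queued.add(query);
        startQueuedWhileRoom();
    }

    // Block until every running and queued query has finished.
    void waitForAllQueriesToComplete() {
        while (!running.isEmpty() || !queued.isEmpty()) {
            for (Thread t : new ArrayList<>(running)) {
                try {
                    t.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting", e);
                }
            }
            reapFinished();
            startQueuedWhileRoom();
        }
    }

    int runningCount() { return running.size(); }

    int queuedCount() { return queued.size(); }

    // Drop threads that have terminated, freeing their slots.
    private void reapFinished() {
        running.removeIf(t -> !t.isAlive());
    }

    // Start queued work, oldest first, until the cap is reached.
    private void startQueuedWhileRoom() {
        while (running.size() < maxRunning && !queued.isEmpty()) {
            Thread t = new Thread(queued.poll());
            running.add(t);
            t.start();
        }
    }
}
```

With a cap of two, submitting five long-running queries leaves two running and three queued; the queued ones start as earlier ones are reaped.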

The ConcurrentQueryThreadImpl class contains a private QueryThread class that extends Thread. This class starts a new thread, runs the SQL query, and holds onto the results (or an SQLException, if one occurred) until the ConcurrentQueryThreadImpl processes the results and removes the thread from the list. See Listing 5.
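As a rough idea of what such a result-holding thread might look like (Listing 5 has the real one), here is a minimal sketch. The names SqlWork and HoldingQueryThread are hypothetical, and the query is abstracted behind a small functional interface so the example runs without a database connection.

```java
import java.sql.SQLException;

// Hypothetical stand-in for "execute this SQL and hand back something".
interface SqlWork<T> {
    T run() throws SQLException;
}

// Sketch of a thread that keeps its outcome until someone collects it.
final class HoldingQueryThread<T> extends Thread {
    private final SqlWork<T> work;
    private T result;              // set only when the work succeeds
    private SQLException failure;  // set only when the work throws

    HoldingQueryThread(SqlWork<T> work) {
        this.work = work;
    }

    @Override
    public void run() {
        try {
            result = work.run();
        } catch (SQLException e) {
            failure = e; // hold the exception for the caller to inspect later
        }
    }

    // Read these only after isAlive() returns false or join() returns;
    // observing thread termination makes the writes above visible.
    T getResult() { return result; }

    SQLException getSQLException() { return failure; }
}
```

The owner starts the thread, polls isAlive() when it has time, and then reads either the result or the exception before dropping its reference.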

Once the ConcurrentQueryThreadImpl notices that the QueryThread is finished, it calls the processResults() method of the CanResolveAConcurrentQuery interface reference that the domain object has implemented, marks the processed object as reaped via the same interface, and removes the QueryThread from the list of running threads. Besides the getInstance() method that gives visibility into the singleton class, the public user interface for this class simply consists of the runQuery() and waitForAllQueriesToComplete() methods.

A Variation Using Thread Pools
In situations where concurrent queries are used extensively, there might be some uneasiness about starting a new thread for each query and having it exit when that query completes. In such cases, I'd recommend using a thread pool that runs Callable tasks, available in the java.util.concurrent package. Callable tasks have an advantage over normal threads in that a) the threads that run them can be pooled, b) they can throw an exception, and c) they can return a result. As an exercise, I've implemented a callable thread pool version of the QueryThread class that the ConcurrentQueryThreadImpl class can use to run queries. This class, a private class named QueryThreadPool, implements the Callable interface, instantiates a thread pool sized to the maximum number of queries we want running at once, and puts the thread's main unit of work inside the call() method. The source for the QueryThreadPool class is in Listing 6.
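To make the three advantages concrete, here is a minimal sketch built on the standard ExecutorService, Callable, and Future types. It is not the QueryThreadPool from Listing 6; the class name PooledQueryRunner is an assumption made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: a fixed-size pool whose size doubles as the cap on
// how many queries can run at once.
final class PooledQueryRunner implements AutoCloseable {
    private final ExecutorService pool;

    PooledQueryRunner(int maxConcurrentQueries) {
        this.pool = Executors.newFixedThreadPool(maxConcurrentQueries);
    }

    // The Callable may return a value or throw; the Future carries either outcome.
    <T> Future<T> submit(Callable<T> query) {
        return pool.submit(query);
    }

    // Stop accepting new work; tasks already submitted still run to completion.
    @Override
    public void close() {
        pool.shutdown();
    }
}
```

Calling get() on the returned Future yields the query's result, or throws an ExecutionException whose getCause() is the exception the query threw. Extra submissions beyond the pool size wait in the executor's own queue, which takes over the job of the hand-built queue in the thread-per-query version.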

To make it easier to switch between the two threading models, a simple interface was extracted from the original QueryThread implementation named IsAConcurrentQueryThreadRunner, mandating the following methods: getResultSet(), getSQLException(), and isAlive(). See IsAConcurrentQueryThreadRunner.java below.

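package net.sourceforge.concurrentQuery.article.concurrent;

import java.sql.ResultSet;
import java.sql.SQLException;

// Common view of a running query, whichever threading model backs it.
public interface IsAConcurrentQueryThreadRunner {

    // The results held by the runner once the query has finished.
    public ResultSet getResultSet();

    // The SQLException held by the runner, if one occurred.
    public SQLException getSQLException();

    // Whether the query is still running.
    public boolean isAlive();
}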

This interface is the type used in the ConcurrentHashMap lists that hold the references to the running query threads. It's now possible to switch between the two threading models by changing a few references from QueryThread to QueryThreadPool and vice versa. Of course, a factory that creates the threading model based on a properties file would be cleaner, but that's outside the immediate scope of our discussion. The entire source for the ConcurrentQueryThreadImpl class is in Listing 7.

A Second, More Robust Implementation
To demonstrate further use, I've put together a more elaborate implementation of this pattern that builds a large object from real database queries. The database has one table that lists cities with large populations, their districts (or states), and the countries in which they're located. For this example, I built a single object that contains a list of countries that have more than 75 cities. Each country in the CountryList object contains a list of its districts, and each district contains a list of its cities. All of this is in one big object. Once it's built, the results are printed. Below is partial output from printing the CountryList object.

=== stuff deleted ===

Country Code: USA
     District: Alabama
         city name: Birmingham, population: 242820
         city name: Huntsville, population: 158216
         city name: Mobile, population: 198915
         city name: Montgomery, population: 201568
     District: Alaska
         city name: Anchorage, population: 260283

=== stuff deleted ===

Once built, this object contains 11 countries, 350 districts each associated with its country, and 2,233 cities each associated with its district. I implemented the solution using a concurrent query pattern that uses a factory to create a concurrent query object with the desired threading model (normal threads, callable thread pool, or runnable thread pool). I then created a factory broker singleton class that reads the threading model, the JDBC settings, and the number of connections from a properties file and invokes the proper factory to create the concurrent query object. With one connection, which simulates a serialized approach, it takes about 30 seconds on average to construct this object (not counting the time needed to print the results). With two connections running concurrently, constructing the object takes only about 7.7 seconds. Three connections bring the time down to 5.2 seconds. Your mileage may vary, and you will eventually hit a point of diminishing returns where adding more concurrent connections won't improve performance.
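The settings-reading half of such a factory broker might look roughly like the sketch below. The property keys (concurrentquery.model, concurrentquery.connections), the enum constants, and the class name are all assumptions made for illustration; the project's actual properties file may use different names.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical names for the three threading models mentioned in the text.
enum ThreadingModel { THREAD, CALLABLE_POOL, RUNNABLE_POOL }

// Sketch of the settings a factory broker would read before choosing a factory.
final class ConcurrentQuerySettings {
    final ThreadingModel model;
    final int connections;

    private ConcurrentQuerySettings(ThreadingModel model, int connections) {
        this.model = model;
        this.connections = connections;
    }

    // Falls back to plain threads and a single connection (serialized behavior)
    // when a key is absent. Both property keys are made up for this example.
    static ConcurrentQuerySettings from(Properties props) {
        String model = props.getProperty("concurrentquery.model", "THREAD");
        String connections = props.getProperty("concurrentquery.connections", "1");
        int n = Integer.parseInt(connections.trim());
        if (n < 1) {
            throw new IllegalArgumentException("connections must be at least 1");
        }
        return new ConcurrentQuerySettings(
                ThreadingModel.valueOf(model.trim().toUpperCase(Locale.ROOT)), n);
    }
}
```

A broker could then switch on the model value to pick the matching factory, and changing the connection count in the file is all it takes to rerun the timing comparison above.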

Consider the CountryList domain class in Listing 8 that accepts an argument for the number of cities, builds a list of countries that have more than that number of cities, and then constructs a list of the districts in each country.

Note that the processResultSet() method is defined in the ResolvableFromConcurrentQuery interface. Also, the DistrictList class, which is instantiated by the CountryList object, is a domain object that participates in concurrent queries and will invoke a CityList object, yet another concurrent query domain object. All of this happens using threaded queries, with queued queries held on lists to manage them. Notice too that in this implementation I've chosen to have the domain objects explicitly call the resolve() method of the ConcurrentQuery object rather than build a notification into the interface, as the previous implementation did with the isReaped() method. The resolve() method waits for all the running threads and queued queries to complete before continuing. The tradeoff is between having each getter in the domain object check that the object has been reaped and having the domain objects explicitly wait to be resolved.

So, in general, a concurrent query implementation will likely have a mechanism to invoke a SQL query without waiting for the SQL results, and a way to ensure that an object is properly built before it's used: either the business logic explicitly waits for all results to finish after invoking some concurrent queries, or the domain object itself recognizes that it hasn't processed its SQL results and waits for them.
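The two ways of guarding a partially built object can be shown side by side in a small sketch. It swaps in the standard CompletableFuture for the article's own thread bookkeeping, and the class name DeferredCityList is hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical domain object whose data arrives from a query still in flight.
final class DeferredCityList {
    private final CompletableFuture<List<String>> pending;

    DeferredCityList(CompletableFuture<List<String>> pending) {
        this.pending = pending;
    }

    // True once the query has completed (normally or with an error).
    boolean isResolved() {
        return pending.isDone();
    }

    // Style 1: business logic calls this once after launching its queries,
    // then uses the object freely.
    void resolve() {
        pending.join();
    }

    // Style 2: the getter itself waits if the results aren't in yet, so
    // callers never need to remember to resolve first.
    List<String> getCityNames() {
        return pending.join();
    }
}
```

If the underlying query failed, join() throws a CompletionException wrapping the original error, so either style surfaces the failure at the point where the object is first relied on.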

When To Use Concurrent Queries
I wouldn't propose using a concurrent query pattern as a general rule for all database access because of resource constraints, but I believe there are many applications that could benefit from occasional use. This pattern fits most easily with POJOs that already build, execute, and process the results of their own SQL queries. The following are characteristics of applications that might benefit:

  • Database and server resources are adequate and the database server isn't already under duress.
  • Your application is already using JDBC queries.
  • Your application controls when queries are run and when the results are processed (e.g., not using an external tool for building, managing, and running queries).
  • You're not having issues with the number of connections available to the database server.
If so, then it might be feasible to implement this pattern. Remember, you can always configure the number of queries allowed to run concurrently to one, essentially running your application as a regular serialized JDBC query/result model, if resource constraints become an issue.

Conclusion
Such a simple pattern can be implemented in a few hours and the results might help a project over some bumpy performance issues. A few items worth noting that didn't seem to fit in anywhere else:

  • Concurrent queries don't have to be implemented using threads. Since most database servers are multithreaded themselves they usually return control back to the client after a query has been parsed and submitted while the database server works on the query. If you hold the connection then you can check for the results later without having to use threads (e.g., set a timeout to zero and check for a result). Of course, the threading approach is pretty efficient and I personally like that model better. While it's entirely possible to use JDBC and hold the connection without immediately processing the result, the de facto standard for Java/JDBC development, up to this point, has been to submit queries and process results in one operation. But, when using a language or platform whose threading package isn't trustworthy then this pattern can be implemented without threads. In a previous project, I implemented a variation of this pattern using C and ODBC without threads.
  • If you access a singleton concurrent query implementation from threaded clients then you might need to synchronize methods or blocks strategically in the concurrent query singleton.
  • I've never implemented this pattern with objects that insert, update or delete data, but I suppose it could be done. I've never implemented this pattern to participate in a transaction, but that too should be possible.
  • Besides building query-intensive objects faster, another potential use for this pattern could be in improving front-end user response time by pre-fetching data. For instance, suppose that after a user logs in to your application, their likely next choice would be to pull a list of active orders, view a list of products, or view their account settings. Concurrent queries could be used to build objects for all three potential choices immediately after the user logs in. By the time the user decides which option to choose, the domain objects would be immediately available, or at least closer to being available than if construction had started after the user made a choice. Of course, an expiration time on the object would be in order in case the user takes 30 minutes to make a choice. Sure, you might end up building an object that you don't use, but I've had several instances where the perceived user response time was more valuable than the application resources. I don't like fast food restaurants that have my burger made before I actually order it, but I'm not as picky about my data.
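The pre-fetching idea in the last item, including the expiration, can be sketched as follows. The class name Prefetched and its injected clock are assumptions made for illustration, the loader stands in for whatever concurrent queries build the object, and CompletableFuture is used here in place of the article's own threading classes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical holder for an object that is built ahead of time and rebuilt
// if it is too old by the time someone asks for it.
final class Prefetched<T> {
    private final Supplier<T> loader;   // builds the object, e.g. by running queries
    private final long ttlNanos;        // how long a prefetched object stays usable
    private final LongSupplier clock;   // e.g. System::nanoTime; injectable for tests
    private CompletableFuture<T> pending;
    private long startedAt;

    Prefetched(Supplier<T> loader, long ttlNanos, LongSupplier clock) {
        this.loader = loader;
        this.ttlNanos = ttlNanos;
        this.clock = clock;
    }

    // Kick off the load in the background, e.g. right after login.
    synchronized void prefetch() {
        startedAt = clock.getAsLong();
        pending = CompletableFuture.supplyAsync(loader);
    }

    // Return the prefetched object, reloading first if none exists or if the
    // last load was started longer ago than the time-to-live.
    synchronized T get() {
        if (pending == null || clock.getAsLong() - startedAt > ttlNanos) {
            prefetch();
        }
        return pending.join();
    }
}
```

An application would create one holder per likely next choice (orders, products, account settings), call prefetch() on each at login, and call get() only for the option the user actually picks.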
For More Information
All of the sources found here, plus the source for the implementation of the list of countries example, are available on sourceforge.net. Since concurrent query is more of a pattern than a packaged solution, the project on sourceforge.net is just a sample implementation intended for perusal. Sources are available for download from http://sourceforge.net/projects/concurrentquery.

The SleepyObject (ant target: run-example1) and ConcurrentSleepyObject (ant target: run-example2) are found in the article package and use a Postgres database. View the readme for instructions on creating the sleep function in Postgres. Other database servers might have a built-in function (e.g., waitfor in MS SQL) that could be substituted.

The country list example (ant target: run-ModelDriver) uses a MySQL database server. The DDL and data to create the city table are included, and instructions for loading them are also in the readme file.

More Stories By Andy Pardue

Andy Pardue is a senior software developer who has specialized in the medical software industry for over 15 years, 11 years as a telecommuter from his home office in Mesquite, Texas. He can be reached at: [email protected]
