Java Feature — Concurrent Queries

A pattern for improving database query performance

Does this sound familiar? You have a domain object, perhaps for reporting purposes, that's built from a ton of JDBC queries, and it takes too long to load. Nothing else happens until this object is built, so it's become a bottleneck. Even worse, each of the queries is already well tuned, so there isn't much to gain from modifying the queries themselves; there are just too many of them. You don't want to change (or can't change) your data model, so what can be done to alleviate this problem short of a major redesign? There are several options, such as caching, lazy loading, and resource pooling. Another worthy option is to implement a variation of the concurrent query pattern.

Concurrent queries are fairly simple to implement and even simpler to describe. Rather than serializing a set of queries, one after the other, waiting for one to complete before the next begins, one would actually use threads to run sets of independent queries simultaneously. Now, using threading for database I/O might sound daunting or ill-advised, but the Java threads package is one of the best, if not the best, I've had the pleasure of working with. Plus, the new concurrency utilities supplied with Java 5 make using threading for database I/O much more feasible. The UML-ish diagram in Figure 1 attempts to provide a pictorial representation of what I hope to explain in the following paragraphs.

Suppose a domain object is built from 200 queries, each of which takes 100 milliseconds on average from execution through result processing. Now 100ms isn't too bad for a query, but stacked end to end, the queries take about 20 seconds to build the object. Ouch. Now, suppose you could run up to five of these queries concurrently at any one time. In an example that will be described later, I was able to take a similar scenario and cut the time to build an object from 20+ seconds to 5+ seconds. First, though, I'll describe a simpler problem scenario and then implement a solution using concurrent queries, making use of JDBC, connection pooling, and Java 5 thread pools to demonstrate the type of performance improvement this pattern can deliver. Later on, I'll cover another, more complicated implementation of an object that's built from several actual database queries. Note: Full source code for these sample implementations is available on sourceforge.net.
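To make the arithmetic concrete, here is a back-of-the-envelope sketch. It assumes an idealized model in which queries run in full waves of equal duration with no overhead at all; the class and method names are illustrative and are not part of the article's source code.

```java
class BuildTimeEstimate {

    /**
     * Idealized wall-clock time in milliseconds: queries run in waves of
     * `concurrency` at a time, and every wave takes `avgMillis`.
     * This is an assumed model that ignores all overhead.
     */
    static long estimateMillis(int queries, long avgMillis, int concurrency) {
        int waves = (queries + concurrency - 1) / concurrency; // ceiling division
        return waves * avgMillis;
    }

    public static void main(String[] args) {
        System.out.println("one at a time:  " + estimateMillis(200, 100, 1) + " ms");
        System.out.println("five at a time: " + estimateMillis(200, 100, 5) + " ms");
    }
}
```

Under this model, five concurrent queries give a floor of 4 seconds. Connection, thread, and result-processing overhead would plausibly account for the gap between that floor and the 5+ seconds reported above.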

A Serialized Baseline Sample Application
First, let's look at the usual case for building objects from database queries. For the following example, a user-defined sleep function was created in a Postgres database (using Postgres magic that I found on the Internet) that can be called as select sleep(N), where N is the number of seconds to sleep. After sleeping for N seconds, the function returns the number of seconds slept as its result. We'll call it from a class called SleepyObject, shown in Listing 1.

This is a fairly standard JDBC query class that selects the sleep(N) function for the number of seconds desired and processes the ResultSet, which simply contains an integer indicating the number of seconds requested. So, if you "select sleep(1)," the result of that query will be 1. Effectively, the sleep function mimics a query that takes N seconds to complete. The serialized example (Example1.java) creates an array of five SleepyObjects; upon creation, each SleepyObject selects the sleep(N) function via JDBC and processes the result. The example then iterates over the array, printing the return value from each call to the sleep function, and finally prints the number of seconds the entire exercise took. Since this sample creates JDBC objects in the usual way, the second SleepyObject isn't created until the first is fully created (i.e., finished with its sleep(N) query), and so on. Because the five SleepyObjects sleep 2, 1, 2, 2, and 1 seconds, respectively, the entire application must take at least eight seconds plus overhead to run, as indicated in the output below. This is Example1.java, a serialized example.

package net.sourceforge.concurrentQuery.article.serialized;

import java.sql.SQLException;

public class Example1 {

   public Example1() {}
   public static void main(String[] args) throws SQLException {

     long start = System.currentTimeMillis();

     SleepyObject[] sleepyObjects = { new SleepyObject(2),
          new SleepyObject(1),
          new SleepyObject(2),
          new SleepyObject(2),
          new SleepyObject(1)
          };
     int i = 1;
     for (SleepyObject sleepyObject : sleepyObjects) {
       System.out.println("SleepyObject " + i++ + " returned "
       + sleepyObject.getValue());
     }
     long end = System.currentTimeMillis();
     System.out.println("took: " + (end - start) / 1000.0 + " seconds");

   }
}

The following is output from Example1.java.

run-example1:
    [java] query is: select sleep(2)
    [java] query is: select sleep(1)
    [java] query is: select sleep(2)
    [java] query is: select sleep(2)
    [java] query is: select sleep(1)
    [java] SleepyObject 1 returned 2
    [java] SleepyObject 2 returned 1
    [java] SleepyObject 3 returned 2
    [java] SleepyObject 4 returned 2
    [java] SleepyObject 5 returned 1
    [java] took: 8.54 seconds

An Implementation Using Concurrent Queries
Next, we'll build on the same example, but this time we'll use a class called ConcurrentSleepyObject to replace SleepyObject. The ConcurrentSleepyObject will use a singleton implementation of the concurrent query pattern to invoke queries and reap the results as seen in Example2.java below.

package net.sourceforge.concurrentQuery.article.concurrent;

import java.sql.SQLException;

public class Example2 {

   public Example2() {}
   public static void main(String[] args) throws SQLException {

     long start = System.currentTimeMillis();

     ConcurrentSleepyObject[] concurrentSleepyObjects = {
          new ConcurrentSleepyObject(2),
          new ConcurrentSleepyObject(1),
          new ConcurrentSleepyObject(2),
          new ConcurrentSleepyObject(2),
          new ConcurrentSleepyObject(1)
          };
     int i = 1;
     for (ConcurrentSleepyObject concurrentSleepyObject :
       concurrentSleepyObjects) {
       System.out.println("ConcurrentSleepyObject " + i++
       + " returned " + concurrentSleepyObject.getValue());
     }
     long end = System.currentTimeMillis();
     System.out.println("took: " + (end - start) / 1000.0 + " seconds");

   }
}

Both Example1 and Example2 build an array of five objects, and each object invokes a database sleep for the same amount of time as its counterpart. In this second example, however, an implementation of the concurrent query pattern executes the same JDBC calls in about a third of the time (2.86 seconds versus 8.54). This is because five queries run at once rather than one at a time. Each of the five queries is run and resolved within its own thread, so rather than the total execution time being the sum of all queries plus overhead, as in the first example, it is roughly the time the longest query takes to run plus overhead. The output from Example2.java is shown below.

run-example2:
    [java] query is: select sleep(2)
    [java] query is: select sleep(1)
    [java] query is: select sleep(2)
    [java] query is: select sleep(2)
    [java] query is: select sleep(1)
    [java] ConcurrentSleepyObject 1 returned 2
    [java] ConcurrentSleepyObject 2 returned 1
    [java] ConcurrentSleepyObject 3 returned 2
    [java] ConcurrentSleepyObject 4 returned 2
    [java] ConcurrentSleepyObject 5 returned 1
    [java] took: 2.86 seconds
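The effect in this output can be reproduced without a database. The following is a minimal sketch, not the article's implementation: it replaces the JDBC call with Thread.sleep(), scales each "second" down to an assumed 100 ms unit, and uses a fixed-size pool from java.util.concurrent. All class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentSleepDemo {

    // Assumed scale: one "second" of sleep(N) is simulated as 100 ms.
    static final long UNIT_MILLIS = 100;

    /** Stand-in for "select sleep(n)": blocks for n units, then returns n. */
    static Callable<Integer> fakeSleepQuery(int n) {
        return () -> {
            Thread.sleep(n * UNIT_MILLIS);
            return n;
        };
    }

    /** Submits every simulated query to a fixed pool and collects results in order. */
    static List<Integer> runAll(int[] units, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int n : units) {
                futures.add(pool.submit(fakeSleepQuery(n)));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // blocks until that query is done
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] units = {2, 1, 2, 2, 1};
        long start = System.currentTimeMillis();
        List<Integer> results = runAll(units, 5);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("results: " + results);
        System.out.println("took about " + elapsed + " ms; run serially, the sleeps alone total 800 ms");
    }
}
```

With a pool of five, the elapsed time should sit near the longest single sleep (two units) rather than the sum of all five.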

To run the queries concurrently, a class called ConcurrentQueryThreadImpl.java was created as a singleton class that encapsulates:

  1. The number of active queries that can run at once (five for purposes of this example).
  2. A ConcurrentHashMap to hold a list of running query threads and a reference to the domain object interface to be used to reap the results of the query. ConcurrentHashMap is a thread-safe HashMap available in the java.util.concurrent package.
  3. A second ConcurrentHashMap to hold a list of queries and domain object interfaces of queries that couldn't be immediately submitted because the maximum number of threads (five) was already running.
  4. The needed JDBC code to execute the queries.
The code fragment in Listing 2 shows the initialization of the ConcurrentQueryThreadImpl class.

The interface CanResolveAConcurrentQuery, referenced in the ConcurrentQueryThreadImpl class, is used in the ConcurrentHashMaps. It simply defines two methods that a participant in a concurrent query (e.g., ConcurrentSleepyObject) must implement: one to process the SQL results (processResultSet()) and another (setReaped()) that the ConcurrentQuery implementation uses to tell the object that its SQL results have been processed and it is ready to use. Below is CanResolveAConcurrentQuery.java.

package net.sourceforge.concurrentQuery.article.concurrent;

import java.sql.ResultSet;
import java.sql.SQLException;

public interface CanResolveAConcurrentQuery {
    boolean processResultSet(ResultSet rs) throws SQLException;
    void setReaped(boolean isReaped);
}
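As an illustration of what a participant might look like, here is a minimal sketch of a class implementing this interface. It is not the article's ConcurrentSleepyObject (see Listing 3 for that); the SleepResult class, its fields, and the one-row ResultSet stub built with java.lang.reflect.Proxy are all assumptions made so the example can run without a database. The interface is repeated so the block is self-contained.

```java
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;

class ReapDemo {

    /** Hypothetical stub: a ResultSet with a single row whose only column is n. */
    static ResultSet oneRow(int n) {
        boolean[] consumed = {false};
        return (ResultSet) Proxy.newProxyInstance(
                ResultSet.class.getClassLoader(),
                new Class<?>[] {ResultSet.class},
                (proxy, method, args) -> {
                    switch (method.getName()) {
                        case "next":
                            boolean hasRow = !consumed[0];
                            consumed[0] = true;
                            return hasRow;
                        case "getInt":
                            return n;
                        default:
                            throw new UnsupportedOperationException(method.getName());
                    }
                });
    }

    public static void main(String[] args) throws SQLException {
        SleepResult result = new SleepResult();
        boolean processed = result.processResultSet(oneRow(2));
        result.setReaped(true); // in the article, the query implementation makes this call
        System.out.println(processed + " " + result.getValue() + " " + result.isReaped());
        // prints "true 2 true"
    }
}

interface CanResolveAConcurrentQuery {
    boolean processResultSet(ResultSet rs) throws SQLException;
    void setReaped(boolean isReaped);
}

/** Hypothetical participant: keeps the integer from the first row of its result. */
class SleepResult implements CanResolveAConcurrentQuery {
    private volatile boolean reaped; // volatile: written by a worker, read by the caller
    private volatile int value;

    public boolean processResultSet(ResultSet rs) throws SQLException {
        if (rs.next()) {
            value = rs.getInt(1);
            return true;
        }
        return false; // no row came back
    }

    public void setReaped(boolean isReaped) {
        this.reaped = isReaped;
    }

    boolean isReaped() {
        return reaped;
    }

    int getValue() {
        return value;
    }
}
```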

By implementing this interface, the ConcurrentSleepyObject can participate in concurrent queries. Notice that in the getValue() method the query object needs to make sure that it has been "reaped" (i.e., it either processed its results or threw an SQLException); if not, the object must call the waitForAllQueriesToComplete() method of the ConcurrentQueryThreadImpl singleton, which submits and processes all outstanding queries. The getValue() method of ConcurrentSleepyObject is shown in the following.

public class ConcurrentSleepyObject implements CanResolveAConcurrentQuery {

...
   public int getValue() throws SQLException {
     if (!reaped) {
       ConcurrentQueryThreadImpl.getInstance().waitForQueriesToComplete();
     }
     return value;
   }
...

This is done so that the object can be sure its results have been processed before it is used: the waitForAllQueriesToComplete() method won't return until all running and queued queries are finished. A better option, though, would be to assign a token, or cookie, to each participant in a concurrent query that the object can use to check that its own results are ready. This way, if the results aren't ready yet, the object wouldn't have to wait for all the other queries to finish; it could instead be notified when its own query has completed, perhaps with that query moved to the front of the queue if necessary. To keep things simple, I opted for the main-strain-and-brute-force approach of waiting for all queries to finish. The complete source for the ConcurrentSleepyObject class is in Listing 3.
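One way to picture the token idea is with java.util.concurrent.Future, which can play the role of a per-query token. The following is a hedged sketch of that alternative, not what the article's code does: TokenBackedValue and fakeQuery are hypothetical names, and the queries are simulated with Thread.sleep() rather than JDBC.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TokenDemo {

    /** Simulated query: waits `millis`, then yields `result`. */
    static Callable<Integer> fakeQuery(int result, long millis) {
        return () -> {
            Thread.sleep(millis);
            return result;
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            TokenBackedValue slow = new TokenBackedValue(pool, fakeQuery(5, 500));
            TokenBackedValue fast = new TokenBackedValue(pool, fakeQuery(1, 50));

            long start = System.currentTimeMillis();
            int value = fast.getValue(); // does not wait for the slow query
            long waited = System.currentTimeMillis() - start;

            System.out.println("fast returned " + value + " after about " + waited + " ms");
            System.out.println("slow returned " + slow.getValue());
        } finally {
            pool.shutdown();
        }
    }
}

/** Hypothetical domain object that holds a Future as its "token". */
class TokenBackedValue {
    private final Future<Integer> token;

    TokenBackedValue(ExecutorService pool, Callable<Integer> query) {
        this.token = pool.submit(query); // the query starts (or queues) immediately
    }

    /** Blocks only until this object's own query has finished. */
    int getValue() throws Exception {
        return token.get();
    }
}
```

The trade-off is that each object now carries its own handle, whereas the wait-for-everything approach needs only the singleton.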

Details of the ConcurrentQuery Implementation
As mentioned, this implementation uses two lists to manage queries. The first list holds the threads that are currently running SQL queries; its size can't exceed the value configured for the application. Since each thread corresponds to a JDBC connection, you want to be careful not to set this value too high. I've used five for this example. The second list holds the queued queries that couldn't run because the running thread list was full when they were submitted via the runQuery() method. See Listing 4.
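The bookkeeping described above can be sketched roughly as follows. This is an assumption-laden illustration rather than the code in Listing 4: it tracks only query ids and SQL strings, starts no threads, and uses a LinkedHashMap for the queue so that waiting queries are promoted in submission order (the article itself uses a second ConcurrentHashMap). The class name, queryFinished(), and the counters are hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class QueryBookkeepingDemo {
    public static void main(String[] args) {
        QueryBookkeeping manager = new QueryBookkeeping(5);
        for (int i = 0; i < 7; i++) {
            manager.runQuery("select sleep(1)");
        }
        System.out.println(manager.runningCount() + " running, " + manager.queuedCount() + " queued");
        // prints "5 running, 2 queued"

        manager.queryFinished(0); // frees a slot, so the oldest queued query is promoted
        System.out.println(manager.runningCount() + " running, " + manager.queuedCount() + " queued");
        // prints "5 running, 1 queued"
    }
}

class QueryBookkeeping {
    private final int maxActive;
    private final Map<Integer, String> running = new ConcurrentHashMap<>();
    private final Map<Integer, String> queued = new LinkedHashMap<>(); // insertion order = FIFO
    private int nextId = 0;

    QueryBookkeeping(int maxActive) {
        this.maxActive = maxActive;
    }

    /** Admits the query if a slot is free, otherwise parks it in the queue. */
    synchronized int runQuery(String sql) {
        int id = nextId++;
        if (running.size() < maxActive) {
            running.put(id, sql);
        } else {
            queued.put(id, sql);
        }
        return id;
    }

    /** Releases a running slot and promotes the oldest queued query, if any. */
    synchronized void queryFinished(int id) {
        if (running.remove(id) == null) {
            return; // id was not running, so there is no slot to hand over
        }
        Iterator<Map.Entry<Integer, String>> it = queued.entrySet().iterator();
        if (it.hasNext()) {
            Map.Entry<Integer, String> next = it.next();
            Integer nextQueryId = next.getKey();
            String nextSql = next.getValue();
            it.remove();
            running.put(nextQueryId, nextSql);
        }
    }

    synchronized int runningCount() {
        return running.size();
    }

    synchronized int queuedCount() {
        return queued.size();
    }
}
```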


Andy Pardue is a senior software developer who has specialized in the medical software industry for over 15 years, including 11 years as a telecommuter from his home office in Mesquite, Texas.
