All aboard for more efficient Web applications

The Train architecture dynamically batches user requests to improve server performance

 

//Dispatcher.java
package train;

import java.util.*;

public class Dispatcher implements Runnable {
    private List mCurrentJobBatch = new ArrayList(); //Batch container
    private int mJobBatchMaxSize = 5; //Maximum number of the jobs in the batch
    private int mIntervalTime = 50; //Maximum time to wait before batch execution

    public synchronized void AddJob(Job job) {
        mCurrentJobBatch.add(job);
        if (mCurrentJobBatch.size() == mJobBatchMaxSize) {
            ProcessJobBatch(); //If batch is full, execute
        }
    }

    private synchronized void ProcessJobBatch() {
        if (mCurrentJobBatch.size() == 0) {
            return;
        }
        Worker worker = new Worker(mCurrentJobBatch);
        Thread ht = new Thread(worker);
        ht.start();
        mCurrentJobBatch = new ArrayList();
    }

    public void run() {
        try {
            while (true) {
                Thread.sleep(mIntervalTime);
                ProcessJobBatch(); // Each mIntervalTime milliseconds execute batch
            }
        } catch (InterruptedException ex) {
        }
    }
}

The Dispatcher lives in an independent thread and becomes a container of all concurrent Jobs. The batch executes under two conditions, whichever comes first:

  1. mIntervalTime milliseconds have passed since the previous execution under this condition and at least one Job is in the batch
  2. The batch's size has reached mJobBatchMaxSize

If the value of mJobBatchMaxSize is set to 1, the TrainServlet devolves to the ClassicServlet. This is an important notion for dynamic batching, which is explained later in this article.

The final piece of the solution is a Worker class:

 

//Worker.java
package train;

import java.util.*;
import java.sql.*;

public class Worker implements Runnable {
    List mJobs;
    Map mJobMap; //Helper member for mapping the jobs

    protected Worker(List jobs) {
        mJobs = jobs;
        mJobMap = new HashMap();
    }

    private void Process() {
        boolean isError = false;
        StringBuffer sqlBuff = new StringBuffer(
            "Select id,name ,descr,views from trentry where id in ");
        StringBuffer whereClause = CreateWhereClause();
        sqlBuff.append(whereClause);
        /* Now SQL statement is fully formed and looks like:
           Select id,name ,descr,views from trentry where id in (3343,22222,5555).
           This will allow us to fetch several user requests in one shot */
        Connection connection = null;
        Statement stmt = null;
        try {
            connection = Util.getDBConnection();
            stmt = connection.createStatement();
            ResultSet rs = stmt.executeQuery(sqlBuff.toString());
            System.out.println(sqlBuff.toString());
            Map result = new HashMap();
            while (rs.next()) {
                int id = rs.getInt("ID");
                String name = rs.getString("NAME");
                String descr = rs.getString("DESCR");
                int views = rs.getInt("VIEWS");
                Job job = (Job) mJobMap.get(String.valueOf(id));
                //Populate instance of the Job with data retrieved from database
                job.mName = name;
                job.mDescr = descr;
                job.mViews = views;
            }
            String sqlViewStr =
                "update trentry set views = views+1 where id in " + whereClause;
            // The same trick for Update statement
            stmt.executeUpdate(sqlViewStr);
        } catch (SQLException ex) {
            isError = true;
        } finally {
            try {
                if (stmt != null) {
                    stmt.close();
                }
                if (connection != null) {
                    if (isError) {
                        connection.rollback();
                    } else {
                        connection.commit();
                    }
                    connection.close();
                }
            } catch (SQLException ex1) {
                isError = true;
            }
        }
        FinishJobs(isError);
        return;
    }

    //Builds "(id1,id2,...)" and registers each Job in mJobMap under its ID
    private StringBuffer CreateWhereClause() {
        StringBuffer clause = new StringBuffer("(");
        for (int i = 0; i < mJobs.size(); i++) {
            Job job = (Job) mJobs.get(i);
            String id = job.mID;
            if (i != 0) {
                clause.append(",");
            }
            clause.append(id);
            mJobMap.put(id, job);
        }
        clause.append(")");
        return clause;
    }

    private void FinishJobs(boolean isError) {
        for (int i = 0; i < mJobs.size(); i++) {
            Job job = (Job) mJobs.get(i);
            if (isError) {
                job.mHasFailed = true; //Rudimentary error handling
            }
            job.mJobThread.interrupt();
            /* Wake up the TrainServlet to deliver the page to the browser */
        }
    }

    public void run() {
        Process();
    }
}

The Worker class creates SQL statements that group the Jobs into a single trip to the database, populates the Job instances with the information retrieved from the database, and wakes the Job threads.
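The Worker concatenates the request IDs directly into the SQL text. As a possible variation (an alternative technique, not the approach used in the article's Worker), the same single-trip query could be built with one PreparedStatement placeholder per Job. The sketch below assumes numeric IDs; BatchSql, placeholders, selectSql, and bindIds are hypothetical names introduced only for illustration.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper; not part of the article's Worker class.
final class BatchSql {
    private BatchSql() {}

    // Builds "(?,?,?)" for n batched Jobs, mirroring CreateWhereClause().
    static String placeholders(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("batch must contain at least one job");
        }
        StringBuilder clause = new StringBuilder("(");
        for (int i = 0; i < n; i++) {
            if (i != 0) {
                clause.append(",");
            }
            clause.append("?");
        }
        return clause.append(")").toString();
    }

    // Full SELECT for a batch of n Jobs, using the article's table and columns.
    static String selectSql(int n) {
        return "select id, name, descr, views from trentry where id in "
            + placeholders(n);
    }

    // Binds each ID to its placeholder (JDBC parameters are 1-based).
    static void bindIds(PreparedStatement stmt, List<Integer> ids)
            throws SQLException {
        for (int i = 0; i < ids.size(); i++) {
            stmt.setInt(i + 1, ids.get(i));
        }
    }
}
```

With this variation, the Worker would call connection.prepareStatement(BatchSql.selectSql(mJobs.size())) and then bindIds before executing. The batch still makes a single trip to the database, but request data never becomes part of the SQL text.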

Train servlet performance

I used the same JMeter application with identical configuration to test the TrainServlet. Did we waste our time employing the Train paradigm? Apparently not, as shown in Figure 3.

Figure 3. TrainServlet performance.

We did at least 15 times better! Results may vary depending on the hardware, software, and specific implementation, but it's difficult to deny the improvement's significance.

This article's graphs describe a simulated environment of 50 concurrent users and are quite static. More realistic, and therefore more interesting, data can be obtained by dynamic regression analysis. For this type of analysis, I simulated varying numbers of concurrent users and measured performance in milliseconds per request for both servlets. See Figure 4 for the results.

Figure 4. Comparative dynamic performance.

The results are almost too good to be true. Unlike the ClassicServlet, the TrainServlet shows no significant performance degradation under heavy external traffic. With three concurrent users, however, the graph shows that the ClassicServlet response time is slightly better. I call this phenomenon the low-traffic penalty. It appears when a batch holds only one user request: the request must wait for the next scheduled trip, and batching offers no benefit because the batch contains only that one request.

Final thoughts

The solution described in this article is intentionally simplistic for the sake of clarity. Some obvious improvements could be made.

Predictive traffic analysis and dynamic batching

We set the value of mIntervalTime to 50 milliseconds. Let's assume that traffic is low and a new request arrives only once every 100 milliseconds. Each request then travels alone, and because it can arrive at any point within the 50-millisecond interval, it waits an average of 25 milliseconds for the next scheduled trip. How can we avoid this kind of low-traffic penalty? The solution lies in adaptive dynamic batching based on predictive traffic analysis. In other words, we change the maximum number of jobs in the batch depending on the number of user requests received in the recent past. The simplest algorithm is as follows:

  1. Remember the number of requests served within the last 50 milliseconds
  2. If this number is less than 5, set the value of mJobBatchMaxSize to 1, otherwise to 5

As I discussed above, when mJobBatchMaxSize is 1, then each request is served immediately and TrainServlet behaves like ClassicServlet.
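The two-step rule above could be sketched roughly as follows. This is a minimal illustration, not the article's implementation: AdaptiveBatchSizer and its method names are hypothetical, and timestamps are passed in explicitly (a real Dispatcher would probably supply System.currentTimeMillis()) so the logic can be exercised without real delays.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper illustrating the two-step rule; all names are assumptions.
final class AdaptiveBatchSizer {
    private final long windowMillis;  // look-back window (50 ms in the article)
    private final int threshold;      // request count that switches batching on
    private final int fullBatchSize;  // batch size used under heavier traffic
    private final Deque<Long> arrivals = new ArrayDeque<Long>();

    AdaptiveBatchSizer(long windowMillis, int threshold, int fullBatchSize) {
        this.windowMillis = windowMillis;
        this.threshold = threshold;
        this.fullBatchSize = fullBatchSize;
    }

    // Step 1: remember when each request arrived.
    synchronized void recordRequest(long nowMillis) {
        arrivals.addLast(nowMillis);
        evictOld(nowMillis);
    }

    // Step 2: choose the batch size from the request count inside the window.
    synchronized int currentMaxBatchSize(long nowMillis) {
        evictOld(nowMillis);
        return arrivals.size() < threshold ? 1 : fullBatchSize;
    }

    // Drops arrivals that are windowMillis old or older.
    private void evictOld(long nowMillis) {
        while (!arrivals.isEmpty()
                && nowMillis - arrivals.peekFirst() >= windowMillis) {
            arrivals.removeFirst();
        }
    }
}
```

A Dispatcher using such a helper would call recordRequest in AddJob and read currentMaxBatchSize before comparing the batch size against mJobBatchMaxSize. Whether a 50-millisecond window and a threshold of 5 suit a given site is something only measurement can settle.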

Dispatcher for multiple commands

An industrial Web application server is home to many commands, not just one. A real-life Dispatcher and Worker should be able to serve all of them. Each command should have its own corresponding Worker class, and the Dispatcher should be implemented as a singleton. There are, however, two possible flavors of the Dispatcher design:

  1. Each command has a corresponding Dispatcher class. If, after some specific time, there are no requests to serve, the Dispatcher thread exits, because we do not want dozens of idle threads consuming system resources. So, instead of instantiating the Dispatcher in the servlet's init() method, we make the call: Dispatcher.getInstance().AddJob(job).
  2. Only one Dispatcher class manages different batches from different commands and maps them to the corresponding workers through the command name. This time, invocation looks like: Dispatcher.getInstance().AddJob(this.getClass().getName(),job).
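The second flavor might look roughly like the sketch below, which keeps one batch per command name inside a single dispatcher. It is an illustration under stated assumptions rather than the article's code: MultiCommandDispatcher, BatchHandler, addJob, and flushAll are hypothetical names, jobs are typed as Object to keep the example self-contained, and the getInstance() singleton accessor and timer thread are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback standing in for a per-command Worker.
interface BatchHandler {
    void handle(String commandName, List<Object> jobs);
}

// Sketch of the second flavor: one dispatcher, one batch per command name.
final class MultiCommandDispatcher {
    private final Map<String, List<Object>> batches =
        new HashMap<String, List<Object>>();
    private final int maxBatchSize;
    private final BatchHandler handler;

    MultiCommandDispatcher(int maxBatchSize, BatchHandler handler) {
        this.maxBatchSize = maxBatchSize;
        this.handler = handler;
    }

    // Adds a job to its command's batch; a full batch is handed off at once.
    synchronized void addJob(String commandName, Object job) {
        List<Object> batch = batches.get(commandName);
        if (batch == null) {
            batch = new ArrayList<Object>();
            batches.put(commandName, batch);
        }
        batch.add(job);
        if (batch.size() >= maxBatchSize) {
            flush(commandName);
        }
    }

    // Meant to be called periodically, like ProcessJobBatch() in run().
    synchronized void flushAll() {
        for (String name : new ArrayList<String>(batches.keySet())) {
            flush(name);
        }
    }

    // Removes the command's batch and passes it to the handler if non-empty.
    private void flush(String commandName) {
        List<Object> batch = batches.remove(commandName);
        if (batch != null && !batch.isEmpty()) {
            handler.handle(commandName, batch);
        }
    }
}
```

In this sketch the handler runs while the dispatcher's lock is held. A fuller version would hand each batch to a Worker thread, as the single-command Dispatcher does, so that slow database work does not block new requests.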

Train pattern as JDBC proxy

From a practical point of view, it is difficult to upgrade working industrial applications or to abandon the traditional mindset when building a new one. What if we built a JDBC proxy so that the application's JDBC calls remain almost unchanged? The same could be done with a wrapper around Enterprise JavaBeans. This approach is promising because it applies not just to Web applications but to any application in which concurrent users access network resources or databases. However, it's the subject of another article.

Conclusion

In this article, we have designed a new paradigm, called Train, for developing efficient Web applications. We built a real application based on the Train pattern and demonstrated a performance improvement of at least a factor of 10 under the stress test. Usage of the Train architecture is not limited to Web applications and, by reducing hardware and software requirements, could benefit any application that deals with concurrent users accessing network resources or databases.

Special thanks to my friends and colleagues Mark Jackson, Sander Berents, Tom Griffin, and Eric Van Stegeren for their support and valuable contributions to this article.

Edward Salatovka is a senior principal engineer at patent information provider Thomson Delphion. Over the last 12 years, he has worked as a core developer, technical lead, and architect on both large-scale projects like IBM's WebSphere Commerce and innovative solutions for successful hi-tech startups.


This story, "All aboard for more efficient Web applications" was originally published by JavaWorld.

Copyright © 2005 IDG Communications, Inc.
