Andreas Grabner About the Author

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi

Improving Message Queue Throughput tenfold by choosing the right XML Parser

Does your application architecture include Message Queues to feed work items to backend batch processing such as “Update Product Inventory” or “Send out Notification Emails”? Message Queues work well in these use cases because they decouple your system components and allow your backend to process requests asynchronously. This decoupling lets the queue grow under heavy load and lets the background job catch up when the system is less busy, resulting in an evenly distributed workload throughout the day.

This is a big advantage over synchronous processing, but what happens if your background jobs can’t keep up with all incoming requests? This is a more complex problem than it might seem at first! For example, you could add more background processes, reduce the number of items flowing into the queue, or optimize your background job implementation, among many other possibilities. Where and how should you start optimizing?

This post follows an interesting problem we worked through with one of our customers. Their eCommerce web application was adding about 40 inventory item update messages per minute to a processing queue. Unfortunately, the background job could process only six messages per minute, which caused a huge backlog and led to badly outdated inventory information in the web application and internal systems.
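To make the size of the problem concrete, here is a minimal sketch of the backlog arithmetic using the figures above. The class and method names are made up for illustration, and the model assumes constant arrival and processing rates, which a real system will not have.

```java
class QueueBacklog {
    /** Net growth of the queue in messages per minute (negative means it drains). */
    static int backlogGrowthPerMinute(int arrivalsPerMinute, int processedPerMinute) {
        return arrivalsPerMinute - processedPerMinute;
    }

    /** Messages waiting after the given number of minutes, starting from an empty queue. */
    static long backlogAfter(int arrivalsPerMinute, int processedPerMinute, int minutes) {
        long growth = backlogGrowthPerMinute(arrivalsPerMinute, processedPerMinute);
        // A queue cannot hold fewer than zero messages.
        return Math.max(0L, growth * minutes);
    }

    public static void main(String[] args) {
        // Figures from the article: about 40 messages in, 6 processed per minute.
        System.out.println(backlogGrowthPerMinute(40, 6)); // 34
        System.out.println(backlogAfter(40, 6, 60));        // 2040
        // At 60 processed per minute the queue no longer grows.
        System.out.println(backlogAfter(40, 60, 60));       // 0
    }
}
```

Under these assumptions the queue gains 34 messages every minute, so roughly two thousand updates are waiting after a single hour.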

Instead of adding more background processing jobs, they analyzed the performance of the background process itself and identified the WebSphere XML parser as the performance bottleneck. After replacing it with Apache Xalan, throughput increased tenfold, and they could now process at least 60 messages per minute! The following chart shows the throughput of two queues in their system and the jump in de-queuing actions when they switched to the alternative XML Parser: at about 10:20AM for the worker process on the first queue, and at about 10:45AM for the second background process working on the second queue:

After switching to Apache Xalan the batch processing could process 10 times more messages from the queue

How they Analyzed the Bottleneck in Minutes

To understand why the background job was handling messages so slowly, they began by analyzing the execution of each individual message pulled from the queue. It turned out that individual jobs took between 8 and 212 seconds. On average, a single message took about 10 seconds to process, which explains the slow average throughput of 6 messages per minute. The following screenshot shows a selection of these background jobs:

Looking at each individual job execution makes it clear that 99% of the time was spent in I/O and took an average of 60 seconds

Looking at the methods that contribute to these PurePaths made it obvious that all this time is spent in the IBM XML Parser when running through the XSLTCompiler:

99% of the time is spent in the compile method invoked by the customer’s transformXml implementation – that’s the hotspot to focus on

Looking at the performance contribution of that XML Transformation over a longer period of time also validates that this is the main reason for the slow batch processing. It impacts every single message in the queue:

Proof that the extremely poor performance of TransformXml is not an occasional problem, but instead occurs for every message processed
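The hotspot is the XSLT compile step, and it is hit for every message. As readers point out in the comments, if the stylesheet is the same for every message, a common JAXP pattern is to compile it once into a `Templates` object and reuse that for each message. The article does not say whether the customer's stylesheet was static, so the following is only a sketch of that pattern under this assumption; the class name `CachedXslt`, the sample stylesheet and the sample message are all made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CachedXslt {
    // Hypothetical stylesheet: prints "<sku>=<qty>" for an inventory item.
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/item'>"
      + "<xsl:value-of select='@sku'/>=<xsl:value-of select='qty'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compiled stylesheet. Templates objects may be shared between threads.
    private final Templates templates;

    CachedXslt(String xslt) throws TransformerException {
        // The expensive compile step happens once, here.
        TransformerFactory factory = TransformerFactory.newInstance();
        this.templates = factory.newTemplates(new StreamSource(new StringReader(xslt)));
    }

    String transform(String xml) throws TransformerException {
        // Creating a Transformer from Templates is cheap; a Transformer itself
        // must not be used by several threads at once, so make one per message.
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        CachedXslt xslt = new CachedXslt(SAMPLE_XSLT);
        // Every message reuses the stylesheet compiled in the constructor.
        System.out.println(xslt.transform("<item sku='A1'><qty>5</qty></item>"));
        System.out.println(xslt.transform("<item sku='B2'><qty>12</qty></item>"));
    }
}
```

If the stylesheet really does differ per message, this caching does not help and the compile cost has to be addressed some other way.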

After further investigation they discovered that the bad performance was in part caused by over 37000 Java Exceptions thrown within the execution trace of the IBM XML Parser:

More than 37000 exceptions are thrown while the IBM XML Parser processes the XML Content causing most of the performance overhead

Solution: Moving to a Different XML Implementation

After seeing the performance of the standard XML Parser shipped with IBM WebSphere, they moved to Apache Xalan. This improved message processing throughput from 6 to 60 messages per minute, a tenfold increase! As explained in the introduction, they have two queues in total. They applied the software change to the first background process, working on the first queue, at about 10:20AM and saw a jump from about 6 to 60 messages per minute. After about 20 minutes they applied the same change to the second queue, which is reflected by the second jump in throughput shown in the graph.

Significantly higher message throughput after switching XML Parsers, showing a 10X performance improvement
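The article does not describe how the customer actually made the switch. One way it could be done with standard JAXP is to name the implementation class explicitly when creating the `TransformerFactory`. In this sketch the helper class `XsltFactorySelector` and the `xslt.factory` property are made up; `org.apache.xalan.processor.TransformerFactoryImpl` is Xalan's factory class and can only be loaded when the Xalan jar is on the classpath.

```java
import javax.xml.transform.TransformerFactory;

class XsltFactorySelector {
    // Xalan's factory class; requires the Xalan jar on the classpath.
    static final String XALAN_FACTORY = "org.apache.xalan.processor.TransformerFactoryImpl";

    /**
     * Returns the named TransformerFactory, or the platform default when no
     * name is given. Throws TransformerFactoryConfigurationError if the named
     * class cannot be loaded.
     */
    static TransformerFactory create(String factoryClassName) {
        if (factoryClassName == null || factoryClassName.isEmpty()) {
            return TransformerFactory.newInstance();
        }
        // A null ClassLoader means the current thread's context class loader is used.
        return TransformerFactory.newInstance(factoryClassName, null);
    }

    public static void main(String[] args) {
        // "xslt.factory" is a made-up property name for this sketch.
        TransformerFactory factory = create(System.getProperty("xslt.factory"));
        System.out.println("Using " + factory.getClass().getName());
    }
}
```

Running with `-Dxslt.factory=org.apache.xalan.processor.TransformerFactoryImpl` would select Xalan if it is available. JAXP also honours the standard `javax.xml.transform.TransformerFactory` system property, which changes the default implementation without any code change.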

Want to share your stories?

If you find stories like this useful and want to share your own with the readers of this blog, let us know. Either post a comment on this blog or send me an email: andreas.grabner@compuware.com. I have also written more blog posts like this one in the past. If you found this one useful and interesting, check out the following posts as well: Optimize Load Balancers, Analyzing Stuck Transactions or How to Triple Application Throughput.

Comments

  1. I cannot comment on the performance of IBM XML parser, I used it long time ago.
    But here is something unclear from the description of the problem. The stack trace clearly shows that the problem is with compiling XSLT. Does the customer actually send XSLT as part of each message? Or the XSLT is static and they just do not know that it is possible to compile XSLT once and then reuse it? If it is the latter then I guess they can improve the performance even further. And even switch back to IBM parser without any loss of performance.

    As for war stories, look at one of mine: http://lea-ka.blogspot.nl/2012/07/how-good-intentions-can-grind-ssl-to.html

  2. What you’re testing is an XSLT compilation that the Xalan engine isn’t even performing. This is an optimization that shows its benefit on the second execution of the same XSLT, which you’re either doing wrong or you use a different XSLT each time and should turn the compilation off. I have seen this problem at other customers before. Your collected performance data is correct, but your conclusion and fix is wrong.

    • Thanks Gerhard for that input. I’ve forwarded all these comments to the customer. As I said in the article, they solved the problem by switching to a different parser. I am not saying that this is the only solution out there, but it worked well for them out of the box and immediately solved their problem. Andi

  3. This is a great post by the way but I agree with Leonid why the XSLT is being compiled with every message? Does the message format change with every message sent? Also how did you change the parser?
