About the Author: Wolfgang Gottesheim

Wolfgang has several years of experience as a software engineer and research assistant in the Java enterprise space. He currently contributes to the strategic development of the Compuware dynaTrace enterprise solution as a Technology Strategist in the Compuware APM division’s Center of Excellence, focusing on monitoring and optimizing applications in production. Find him at @gottesheim.

Continuous Performance Validation in Continuous Integration Environments

Each year, the holiday shopping season brings a surge in awareness of website performance and scalability issues. While these topics should obviously have a spot on your roadmap throughout the whole year, the interesting question is why familiar performance problems keep impacting customer-facing websites over and over again. After all, we are well aware of the most frequent problem patterns, ranging from front-end issues (like loading too many resources and misconfigured cache headers) to server-side aspects (like improper database access and missing resources). Furthermore, we all know that the later in the application lifecycle a bug is discovered, the more expensive it is to fix. So while you’d think it’s important to keep such bugs from making it into production, recent high-profile failures like healthcare.gov serve as a reminder that this is often not the case.

I don't always test my code

One reason we keep hearing again and again is the lack of proper testing before deploying into production. This is something we can all probably relate to, since many development projects facing a strict deadline (like getting your new website out before the holiday shopping season starts) still regard testing as negotiable: “if there’s time we’ll do it, but getting the release out is what counts”. But there’s one kind of testing that has caught on over the last few years and holds up even in stressful times: automated unit and integration testing in a continuous integration environment. The big benefit here is that developers continuously get feedback on whether the basic functionality still works, and since it’s automated, little manual effort is required after the tests are written.

But while ensuring that the application you’re developing actually works is a great start, two aspects are usually not covered: the performance of the tested functionality and adherence to architectural rules. Covering them requires knowing how the application behaves internally, so that you can estimate whether the code fulfills performance, scalability and architectural requirements. For example, for an online shopping site a unit test would check whether the backend service correctly handles purchasing the contents of your shopping cart. This includes backend tasks like verifying and billing your credit card, creating the actual order and perhaps updating inventory status. Let’s start by looking at the data you would get from a CI server: test executions with their respective status on a per-build level:

Functional testing helps to ensure that defects are quickly identified and resolved

We see at first glance that in build 18 our purchasing functionality was broken, but in build 19 it was back to working again after the developer fixed the functional problem to the best of his knowledge. While this might seem like great news at first, let’s take a second look with additional under-the-hood information available:

Looking at metrics other than the test status reveals the potential side effects of the fix.

While the five exceptions we see for testPurchase in build 18 are the likely cause of this test’s failure, the fix that brought this test back to OK in build 19 caused 75 database statements to be executed, more than six times the number we saw before! By detecting this architectural regression as early as the next test execution, we avoid dragging it across builds and milestones until it finally shows up in a load test, or worse, until the production database server is brought down by the additional load.

Tip #1: Using Unit Tests for Architectural Validation

Once we have the ability to quickly identify and fix architectural regressions, we can set our sights on measuring runtime performance (like response time or CPU time) in our daily test suites. Monitoring runtime at the unit-test level doesn’t make much sense, though: these results are typically highly volatile, because the duration of a single unit test is usually very short and environmental factors (like concurrent builds on the same machine) can therefore have a high impact.

The following screenshot shows an architectural regression that was detected automatically by analyzing the number of DB Executions for each unit test execution in every build. Seeing DB Executions jump by a factor of 35 from one build to the next makes it easy to fix this architectural regression, because only the most recent code changes can have caused it:

Monitoring architectural metrics such as DB Executions, Thrown Exceptions, # of Log Outputs for every executed unit test allows us to automatically identify architectural regressions.

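To make this concrete, here is a minimal sketch of such an architectural assertion in plain JUnit 4. It is only an illustration, not how dynaTrace works: the CheckoutService below is a hypothetical, heavily simplified stand-in for real checkout code, and the statement counter stands in for the instrumentation (an APM agent or a JDBC proxy) that would normally collect this metric without any changes to the test.

```java
import static org.junit.Assert.assertTrue;

import java.util.concurrent.atomic.AtomicInteger;
import org.junit.Before;
import org.junit.Test;

public class PurchaseArchitectureTest {

    // Stand-in for real instrumentation that counts every SQL statement
    // executed by the code under test.
    static final AtomicInteger DB_STATEMENTS = new AtomicInteger();

    // Hypothetical, heavily simplified checkout logic; the real service
    // would talk to a database for each of these steps.
    static class CheckoutService {
        void purchase(String cartId) {
            DB_STATEMENTS.incrementAndGet(); // verify and bill the credit card
            DB_STATEMENTS.incrementAndGet(); // create the actual order
            DB_STATEMENTS.incrementAndGet(); // update inventory status
        }
    }

    @Before
    public void resetCounter() {
        DB_STATEMENTS.set(0);
    }

    @Test
    public void purchaseStaysWithinItsDatabaseBudget() {
        new CheckoutService().purchase("cart-42");

        // Architectural assertion: a jump like the one from a handful of
        // statements to 75 would fail this test in the very next build,
        // instead of surfacing much later in a load test or in production.
        assertTrue("DB statements exceeded budget: " + DB_STATEMENTS.get(),
                DB_STATEMENTS.get() <= 15);
    }
}
```

The budget of 15 statements is of course arbitrary here; the point is that the architectural threshold lives right next to the functional test and is checked on every build.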

Tip #2: Integration and Performance Unit Tests for Performance Validation

Integration tests or small-scale performance unit tests that cover multiple components are better suited, as they usually take longer than a few hundred milliseconds; that makes it possible to calculate a baseline and provide meaningful alerting once transactions take longer than expected.
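As a rough sketch of what such a baseline check could look like, assuming JUnit 4 and a hypothetical searchJourneys() operation standing in for a real multi-component call, a performance unit test can take a handful of samples and compare the median against a previously recorded baseline. The baseline is hard-coded here to keep the sketch self-contained; the tooling described in this article derives such baselines from previous builds automatically.

```java
import static org.junit.Assert.assertTrue;

import java.util.Arrays;
import org.junit.Test;

public class SearchPerformanceTest {

    // Hypothetical baseline from earlier builds, with generous headroom so that
    // environmental noise (concurrent builds, slower CI hardware) does not cause
    // false alarms.
    private static final long BASELINE_MILLIS = 200;
    private static final double TOLERANCE = 1.5; // alert only above 150% of baseline

    // Placeholder for the real operation spanning multiple components,
    // e.g. a search that includes service calls and database access.
    private void searchJourneys() throws InterruptedException {
        Thread.sleep(50); // simulate work
    }

    @Test
    public void searchStaysWithinItsPerformanceBaseline() throws InterruptedException {
        long[] samples = new long[5];
        for (int i = 0; i < samples.length; i++) {
            long start = System.nanoTime();
            searchJourneys();
            samples[i] = (System.nanoTime() - start) / 1_000_000; // milliseconds
        }
        Arrays.sort(samples);
        long median = samples[samples.length / 2];

        // Regression guard: a framework upgrade that suddenly doubles the
        // response time fails the build right away instead of going unnoticed.
        assertTrue("Median of " + median + " ms exceeds baseline of " + BASELINE_MILLIS + " ms",
                median <= BASELINE_MILLIS * TOLERANCE);
    }
}
```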

The following screenshot shows an automatically detected performance regression caused by a switch to a new Java framework version that introduced some performance side effects. Seeing this jump on a per-build basis makes it easy to identify what caused the changed behavior and allows you to quickly fix it:

For longer running Integration or Performance Unit Tests we can automatically identify performance regressions introduced by code or environmental changes.

Making Metrics Visible

For our existing Compuware APM/dynaTrace customers, we recommend looking at our new dynaTrace Test Automation plugin for Jenkins (APM Community login required). The plugin pulls the test automation results from dynaTrace into the build results published in your Jenkins environment. A single dashboard serves developers by displaying the current build status not only from a functional perspective but also from a performance perspective, and violations of architectural or performance baselines can even fail your builds.

Jenkins dashboard for easyTravel sample project including dynaTrace test automation metrics

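The underlying principle of failing a build on baseline violations can also be sketched in a tool-agnostic way. The following snippet is purely illustrative and is not part of the plugin: it assumes a hypothetical metrics.csv exported after the test run (one line per test in the form testName,metric,value,baseline) and exits with a non-zero code so that the CI server marks the build as failed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class BaselineGate {

    public static void main(String[] args) throws IOException {
        // Hypothetical export produced after the test run:
        // testName,metric,value,baseline (no header line)
        String file = args.length > 0 ? args[0] : "metrics.csv";
        List<String> lines = Files.readAllLines(Paths.get(file));

        boolean violated = false;
        for (String line : lines) {
            String[] parts = line.split(",");
            if (parts.length < 4) {
                continue; // skip blank or malformed lines
            }
            double value = Double.parseDouble(parts[2].trim());
            double baseline = Double.parseDouble(parts[3].trim());
            if (value > baseline) {
                violated = true;
                System.err.printf("%s: %s = %.1f exceeds baseline of %.1f%n",
                        parts[0], parts[1], value, baseline);
            }
        }

        if (violated) {
            System.exit(1); // a non-zero exit code fails the build on common CI servers
        }
    }
}
```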

Making performance a key part of your continuous integration practice helps you avoid architectural and performance issues and gives you confidence that your software is not only working correctly, but will keep doing so without the Ops team standing by to hit reset buttons.
