Make Scalability Painless by First Identifying Your Pain Points

This post was originally written for SD Times.

In many, if not most, applications, a very small part of the code is responsible for nearly all of the application's response time. That is, the application spends almost all of its time executing a small fraction of the code base.

In some cases, this small part of code has been well optimized and the application is as fast as can reasonably be expected. However, this is likely the exception rather than the rule.

It might also be that the real delay happens in external code – in a third-party component your application depends on.

Regardless of where a performance bottleneck lies, half of the work in fixing it (or working around it) is usually spent identifying where it’s located.

Step 1: Understand how your backend is being utilized.

One of the first things you must do to identify your pain points is to understand how your backend is being utilized.

For example, if your application's backend functionality is exposed through a public API, you will want to know which API functions clients are calling, how many times, and at what rate.

You will also want the test calls to use parameter data similar to what the application sees during real usage.
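
If your traffic goes through a standard web server access log, even a quick script can give you this picture. Here's a minimal Python sketch, assuming a combined-format log at a hypothetical path – both the path and the regex are assumptions you'd adapt to your own setup:

```python
from collections import Counter
import re

# Matches the request portion of a combined-format access log line,
# e.g. "GET /api/v1/users?id=42 HTTP/1.1" (log path/format are assumptions)
REQUEST_RE = re.compile(r'"(GET|POST|PUT|DELETE|PATCH) (\S+) HTTP/[\d.]+"')

calls = Counter()
with open("/var/log/nginx/access.log") as log:
    for line in log:
        match = REQUEST_RE.search(line)
        if match:
            method, path = match.groups()
            endpoint = path.split("?", 1)[0]  # strip query parameters
            calls[(method, endpoint)] += 1

# The most frequently called API functions are the first candidates to test
for (method, endpoint), count in calls.most_common(10):
    print(f"{count:8d}  {method} {endpoint}")
```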

Step 2: Combine performance testing with performance monitoring to locate bottlenecks. 

The second, and more important, step to take is to combine performance testing with performance monitoring in order to nail down where the problems lie.

When it comes to performance testing, it's usually a matter of experimenting until you find the point at which things either start to fall apart – often indicated by transaction times suddenly increasing rapidly – or just stop working.

When you run a test and reach the point at which the system is clearly under stress, you can then start looking for the bottleneck(s). In many cases, the mere fact that the system is under stress can make it a lot easier to find the bottlenecks.
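
A dedicated load testing tool will do this properly, but just to make the idea concrete, here's a rough standard-library Python sketch that ramps up concurrent "users" against a placeholder URL and watches for the point where transaction times take off – the URL and user counts are assumptions:

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

TARGET = "https://example.com/"  # placeholder URL

def timed_request(_):
    start = time.perf_counter()
    with urllib.request.urlopen(TARGET, timeout=30) as resp:
        resp.read()
    return time.perf_counter() - start

# Ramp up the number of concurrent "users" and watch for the point
# where transaction times suddenly start increasing rapidly.
for users in (1, 5, 10, 25, 50, 100):
    with ThreadPoolExecutor(max_workers=users) as pool:
        durations = sorted(pool.map(timed_request, range(users)))
    p95 = durations[int(len(durations) * 0.95) - 1]  # rough 95th percentile
    print(f"{users:4d} users: median {durations[len(durations) // 2]:.3f}s, "
          f"p95 {p95:.3f}s")
```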

If you know or suspect your major bottlenecks to be in your own codebase, you can use performance monitoring tools to find out exactly where the code latency is happening.

By combining these two types of tools – performance testing and performance monitoring – you will be able to optimize the right parts of the code and improve actual scalability.

Let’s use an example to make this point clear.

Let’s say you have a website that users access with regular web browsers. The site infrastructure consists of a database (SQL) server and a web server. When a user accesses your site, the web server fetches data from the database server, then performs some fairly demanding calculations on the data before sending information back to the user’s browser.

Now, let’s say you’ve forgotten to set up an important database table index – a pretty common performance problem with SQL databases. In this case, if you only monitor your application components – the physical servers, the SQL server and the web server – while a single user is accessing your site, you might see that the database takes 50 ms to fetch the data and the calculations performed on the web server take 100 ms. This may lead you to start optimizing your web server code because it looks as if that is the major performance bottleneck.

However, if you submit the system to a performance test which simulates a large number of concurrent users with, let’s say, ten of those users loading your web site at exactly the same time, you might see that the database server now takes 500 ms to respond, while the calculations on the web server take 250 ms.

The problem in this example is that your database server has to perform a lot of disk operations because of the missing table index, and the time spent on those operations grows at least linearly with increased usage because the system has only one disk.
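
You can reproduce the effect in miniature with SQLite. The following is a self-contained Python sketch, not the production scenario, but the full-scan-versus-index-seek difference it shows is the same one hurting the database server in the example:

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    ((i, i % 10_000, i * 0.5) for i in range(500_000)),
)

def timed_lookup():
    start = time.perf_counter()
    db.execute("SELECT SUM(total) FROM orders WHERE customer_id = ?",
               (1234,)).fetchone()
    return time.perf_counter() - start

print(f"without index: {timed_lookup() * 1000:.1f} ms")  # full table scan

db.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
print(f"with index:    {timed_lookup() * 1000:.1f} ms")  # index seek
```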

The calculations, on the other hand, are each run on a single CPU core, which means a single user will always experience a calculation time of X (as fast as a single core can perform the calculation), but multiple concurrent users will be able to use separate CPU cores (often 4 or 8 on a standard server) and experience the same calculation time, X.
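
A quick sketch illustrates this. The calculation below is just a stand-in for the demanding work in the example; with four worker processes on a machine with at least four free cores, four concurrent "users" should see roughly the same wall-clock time as one:

```python
import time
from concurrent.futures import ProcessPoolExecutor

def calculation(_):
    # Stand-in for the web server's "fairly demanding" CPU-bound work
    return sum(i * i for i in range(5_000_000))

if __name__ == "__main__":
    start = time.perf_counter()
    calculation(0)
    print(f"1 user:  {time.perf_counter() - start:.2f}s")

    # Four concurrent users, each calculation on its own core: wall-clock
    # time per user stays roughly the same until the cores run out.
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=4) as pool:
        list(pool.map(calculation, range(4)))
    print(f"4 users: {time.perf_counter() - start:.2f}s")
```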

Caching is another potential scalability factor. If calculation results are cached, average transaction times for the calculations can actually decrease as the number of users grows, because more and more requests are served from the cache.
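
In Python, for instance, a memoizing decorator is enough to sketch the idea – the first request per input pays the full calculation cost, and repeats are served from the cache:

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def expensive_calculation(customer_id: int) -> float:
    # First call per customer pays the full cost; repeats are near-free.
    return sum(i * i for i in range(2_000_000)) * customer_id

expensive_calculation(42)   # cache miss: full calculation runs
expensive_calculation(42)   # cache hit: result returned immediately
print(expensive_calculation.cache_info())
```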

The point of this example is that, until you submit a system to real heavy traffic, you really have no idea how it will perform when lots of people are using it.

Put bluntly, optimizing the parts of the code you identified as performance bottlenecks when being monitored may end up being a total waste of time. It’s a combination of monitoring and testing that will deliver the information you need to properly scale.

By: Ragnar Lönn, CEO, Load Impact

Write Scalable Code – Use Jenkins to Automate Your Load Testing

Starting today, we’re accepting applications for early access to our new Load Impact Continuous Delivery service, which for the first time allows developers of websites, apps and APIs to make stress testing an integrated part of their continuous delivery process.

Our Continuous Delivery service comprises our API, SDKs and a library of plug-ins and integrations for popular continuous delivery systems – starting with Jenkins.

In order to better serve our customers and allow them to integrate their continuous delivery methodology with the Load Impact platform, we’re building programming libraries that make the API super easy to use. We’re starting with Jenkins and will soon roll out plug-ins for TeamCity, New Relic and CloudBees.

Simply put, the Jenkins plug-in will integrate load testing into developers’ automated Jenkins test suite to determine whether new builds meet specified traffic performance criteria.
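
The plug-in handles the details for you, but the underlying idea is a simple pass/fail gate on performance criteria. Here's a hypothetical Python sketch of what such a gate boils down to – run_load_test and the thresholds are illustrative assumptions, not the plug-in's actual API:

```python
import sys

MAX_P95_MS = 800  # illustrative performance criterion, not a plug-in default

def run_load_test() -> dict:
    # Hypothetical stand-in for triggering a load test and collecting
    # its results; the real plug-in handles this step for you.
    return {"p95_ms": 742, "error_rate": 0.001}

results = run_load_test()
if results["p95_ms"] > MAX_P95_MS or results["error_rate"] > 0.01:
    print(f"FAIL: p95 {results['p95_ms']} ms exceeds {MAX_P95_MS} ms budget")
    sys.exit(1)  # non-zero exit marks the CI build as failed
print("PASS: performance criteria met")
```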

(Download the plug-in)

The new Jenkins plug-in features multi-source load testing from up to 12 geographically distributed locations worldwide, advanced scripting, a GUI-based session recorder to easily create tests simulating multiple typical user scenarios, and our new Server Metrics Agent (SMA) for correlating the server-side impact of users on CPU, memory, disk space and network usage.
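
The SMA itself ships with the service, but to give a feel for the kind of server-side sampling involved, here's a rough Python sketch using the third-party psutil library – the sampling interval, duration and metric choices are assumptions:

```python
import psutil  # third-party: pip install psutil

# Sample the server-side metrics that get correlated with load:
# CPU, memory, disk and network usage, once per second for a minute.
for _ in range(60):
    cpu = psutil.cpu_percent(interval=1)  # blocks for the 1 s sample window
    mem = psutil.virtual_memory().percent
    disk = psutil.disk_usage("/").percent
    net = psutil.net_io_counters()
    print(f"cpu={cpu:5.1f}%  mem={mem:5.1f}%  disk={disk:5.1f}%  "
          f"net tx/rx={net.bytes_sent}/{net.bytes_recv} bytes")
```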

Read more about how to automate your load testing here and here.

Apply now to join our private beta group and receive FREE unlimited load testing for the duration of the beta period!

About Load Impact

Load Impact is the leading cloud-based load testing software trusted by over 123,000 website, mobile app and API developers worldwide.

Companies like JWT, NASDAQ, The European Space Agency and ServiceNow have used Load Impact to detect, predict, and analyze performance problems.
 
Load Impact requires no download or installation, is completely free to try, and users can start a test with just one click.
 
Test your website, app or API at loadimpact.com
