Make Scalability Painless, by First Identifying your Pain Points

This post was originally written for SD Times.

In many, if not most, applications, a very small part of the code is responsible for nearly all of the application’s response time: the application spends almost all of its time executing a tiny fraction of the code base.

In some cases, this small part of the code has already been well optimized and the application is as fast as can reasonably be expected. That, however, is likely the exception rather than the rule.

It might also be that the real delay happens in external code – in a third-party application your system depends on.

Regardless of where a performance bottleneck lies, half of the work in fixing it (or working around it) is usually spent identifying where it’s located.

Step 1: Understand how your backend is being utilized.

One of the first things you must do to identify your pain points is to understand how your backend is being utilized.

For example, if your application’s backend functionality is exposed through a public API that clients use, you will want to know which API functions are being called and how frequently.

You might also want to use parameter data for those API calls that is similar to what the application sees during real usage.
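
As a quick illustration, a first pass at this kind of usage analysis can be done straight from your web server’s access log. The sketch below assumes a log in the common/combined format at a hypothetical path; adjust both to your own setup:

```python
import re
from collections import Counter

# Hypothetical log location; point this at your real access log.
LOG_PATH = "access.log"

# Matches the request line of a common/combined-format access log entry,
# e.g. ... "GET /api/v1/users?id=42 HTTP/1.1" ...
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')

calls = Counter()
with open(LOG_PATH) as log:
    for line in log:
        match = REQUEST_RE.search(line)
        if match:
            # Strip query strings so /users?id=1 and /users?id=2 count as
            # the same endpoint.
            endpoint = match.group("path").split("?")[0]
            calls[(match.group("method"), endpoint)] += 1

# The ten most frequently called endpoints.
for (method, endpoint), count in calls.most_common(10):
    print(f"{count:8d}  {method} {endpoint}")
```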

Step 2: Combine performance testing with performance monitoring to locate bottlenecks. 

The second, and more important, step to take is to combine performance testing with performance monitoring in order to nail down where the problems lie.

When it comes to performance testing, it’s usually a matter of experimenting until you find the point at which things either start to fall apart – often indicated by transaction times suddenly increasing rapidly – or simply stop working.

When you run a test and reach the point at which the system is clearly under stress, you can then start looking for the bottleneck(s). In many cases, the mere fact that the system is under stress can make it a lot easier to find the bottlenecks.
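
A rough sketch of such an experiment, using only the Python standard library (the target URL is hypothetical, and you should naturally only point this at a test environment you control):

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

TEST_URL = "https://test.example.com/"  # hypothetical test-environment URL

def timed_request(_):
    """Fetch TEST_URL once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(TEST_URL, timeout=30) as response:
        response.read()
    return time.perf_counter() - start

# Step up concurrency and watch for the level where times suddenly climb.
for concurrency in (1, 5, 10, 25, 50):
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(timed_request, range(concurrency * 4)))
    avg_ms = 1000 * sum(timings) / len(timings)
    print(f"{concurrency:3d} concurrent workers: avg {avg_ms:.0f} ms")
```

A dedicated load testing tool will give you far better traffic shaping and reporting, but even a crude loop like this can reveal the concurrency level at which transaction times start to climb.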

If you know or suspect your major bottlenecks to be in your own codebase, you can use performance monitoring tools to find out exactly where the code latency is happening.

By combining these two types of tools – performance testing and performance monitoring – you will be able to optimize the right parts of the code and improve actual scalability.

Let’s use an example to make this point clear.

Let’s say you have a website that users access with regular web browsers. The site infrastructure consists of a database (SQL) server and a web server. When a user accesses your site, the web server fetches data from the database server, then performs some fairly demanding calculations on the data before sending information back to the user’s browser.

Now, let’s say you’ve forgotten to set up an important table index in your database – a pretty common performance problem with SQL databases. In this case, if you only monitor your application components – the physical servers, the SQL server and the web server – while a single user is accessing your site, you might see that the database takes 50 ms to fetch the data and the calculations performed on the web server take 100 ms. This may lead you to start optimizing your web server code, because it looks as if that is the major performance bottleneck.

However, if you submit the system to a performance test which simulates a large number of concurrent users with, let’s say, ten of those users loading your web site at exactly the same time, you might see that the database server now takes 500 ms to respond, while the calculations on the web server take 250 ms.

The problem in this example is that the missing table index forces your database server to perform a lot of disk operations, and because the system has only one disk, those operations are serialized – so database response times grow at least linearly as concurrent usage increases.
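
To make the missing-index effect concrete, here is a small self-contained illustration using SQLite (the table and data are made up, but the same principle applies to any SQL server). The query plan reports a full table scan until the index exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 0.5) for i in range(100_000)],
)

query = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index, SQLite typically reports "SCAN orders": every row is
# examined for every query.
print(conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall())

# After adding the missing index, the plan switches to an index search,
# touching only the relevant rows.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
print(conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall())
```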

The calculations, on the other hand, are each run on a single CPU core, which means a single user will always experience a calculation time of X (as fast as a single core can perform the calculation), but multiple concurrent users will be able to use separate CPU cores (often 4 or 8 on a standard server) and experience the same calculation time, X.

Caching is another potential scalability factor: if calculation results are cached, average transaction times for the calculations can actually decrease as the number of users increases, because more and more requests are served straight from the cache.
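
As a minimal sketch of that caching idea – assuming the calculation is deterministic for a given input – Python’s built-in lru_cache makes the effect easy to see (the function below is a hypothetical stand-in for the real computation):

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def expensive_calculation(record_id: int) -> float:
    # Hypothetical stand-in for the demanding per-request computation that
    # would normally crunch the data fetched from the database.
    return sum(i * i for i in range(record_id % 10_000)) ** 0.5

print(expensive_calculation(1234))  # first call pays the full cost
print(expensive_calculation(1234))  # repeat call is served from the cache
print(expensive_calculation.cache_info())
```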

The point of this example is that, until you submit a system to realistically heavy traffic, you really have no idea how it will perform when lots of people are using it.

Put bluntly, optimizing the parts of the code that looked like bottlenecks under single-user monitoring may end up being a total waste of time. It’s the combination of monitoring and testing that will deliver the information you need to scale properly.

By: Ragnar Lönn, CEO, Load Impact

Different types of website performance testing – Part 3: Spike Testing

This is the third of a series of posts describing the different types of web performance testing. In the first post, we gave an overview of what load testing is about and the different types of load tests available. Our second post gave an introduction to load testing in general, and described what a basic ramp-up schedule would look like.

Now we move on to spike testing. Spike testing is another form of stress testing that helps determine how a system performs under a rapid increase in workload. This form of load testing helps you see whether a system responds well and maintains stability during bursts of concurrent user activity of varying lengths. It should also help to verify that an application is able to recover between periods of sudden peak activity.

So when should you run a spike test?

The following are some typical scenarios in which we see users running spike tests, along with how your load schedule should be configured in Load Impact to emulate each one.

Advertising campaigns

Advertising campaigns are one of the most common reasons why people run load tests. Why? Well, take a lesson from Coca-Cola: with an ad spend of US$3.5 million for a 30-second Super Bowl commercial slot (not including customer cost), it probably wasn’t the best impression to leave on the customers who flooded to their Facebook app… and possibly right into Pepsi’s arms. If you’re expecting customers to flood in during the ad campaign, ramping up in 1-3 minutes is probably a good idea. Be sure to hold the load for at least twice the time it takes a user to run through the entire scenario, so you get accurate, stable data in the process.
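
As an illustration only (this is not Load Impact’s actual configuration syntax), such a schedule can be thought of as a list of stages, each with a duration and a target number of concurrent users; here we assume one full user scenario takes about two minutes:

```python
# Hypothetical stage list: (duration in seconds, target concurrent users).
AD_CAMPAIGN_SCHEDULE = [
    (120, 500),  # ramp from 0 to 500 users in 2 minutes
    (240, 500),  # hold for twice the assumed 120 s scenario length
    (60, 0),     # ramp back down
]
```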

Contests

Some contests require quick response times as part of the challenge issued to users. The problem with this is that you might end up with something almost akin to a DDoS attack every few minutes. A load schedule comprising a number of sudden spikes would help to simulate such a situation, as in the sketch below.
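
Using the same illustrative stage-list notation as above, a contest-style schedule simply repeats a short burst several times:

```python
# Hypothetical repeated-spike schedule: three sudden bursts separated by
# quieter periods, as (duration in seconds, target concurrent users).
CONTEST_SCHEDULE = [(10, 800), (60, 800), (10, 50), (180, 50)] * 3
```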

TV screenings/Website launches

If you’re doing a live stream of a very popular TV show (think X Factor), you might want to consider getting a load test done prior to the event. Slow streaming times or a website crash is the fastest way to drive your customers to the next streaming app/online retailer available. Take Click Frenzy as an example – they’re still working to recover their reputation. Streaming servers also tend to be subject to prolonged stress when many users all flock to watch an event or show, so we recommend doing a relatively quick ramp up and ending with a long holding time.

Ticket sales

Remember the 2012 London Olympics? Thousands of frustrated sports fans failed to get tickets to the events they wanted. Not only was it a waste of time for customers, but it also proved to be a logistics nightmare for event organizers. Bearing in mind that a number of users will be ‘camping’ out on the website awaiting the ticket launch, try a two-stage quick ramp-up followed by a long hold time to simulate this traffic.

TechCrunched… literally!

If you are trying to get yourself featured on TechCrunch (or any similar website that might generate a lot of readership), it’s probably a good idea to load test your site to make sure it can handle the traffic. It wouldn’t do to get that much publicity and then have half of your new visitors go away with a bad taste in their mouths! In these cases, traffic tends to come in slightly slower and in more even bouts over longer periods of time, so a gentler ramp-up with a longer hold time would probably work better.

Secondary Testing

If your test fails at any point during the initial spike test, one thing you might want to consider is a step test. This will help you isolate where the breaking points are, which in turn allows you to identify bottlenecks. It is especially useful after a spike test, which can ramp up too quickly and give you an inaccurate picture of when your system starts to fail.
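
A step test might look like the following sketch (again just an illustrative stage list, not any specific tool’s syntax): hold at each level long enough for response times to settle, so you can see exactly which step breaks:

```python
# Hypothetical step-test schedule: (duration in seconds, target users).
STEP_TEST_SCHEDULE = []
for users in (100, 200, 300, 400, 500):
    STEP_TEST_SCHEDULE += [
        (60, users),   # ramp to the next step
        (180, users),  # hold so response times can settle or visibly degrade
    ]
```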

That being said, not all servers are built to handle huge spikes in activity. Failing a spike test does not mean that the same servers cannot handle that amount of load – some applications only need to handle large, constant streams of traffic and are never expected to face sharp spikes of user activity. It is for this reason that Load Impact automatically increases ramp-up times with user load. These are suggested ramp-up timings, but you can of course adjust them to better suit your use case.

About Load Impact

Load Impact is the leading cloud-based load testing software trusted by over 123,000 website, mobile app and API developers worldwide.

Companies like JWT, NASDAQ, The European Space Agency and ServiceNow have used Load Impact to detect, predict, and analyze performance problems.
 
Load Impact requires no download or installation, is completely free to try, and users can start a test with just one click.
 
Test your website, app or API at loadimpact.com
