WordPress Vertical Scalability Part I: How Performance Varies with Changes in Hardware

How does your web application respond to improvements in the underlying hardware? Well, that will depend a lot on your application. Different applications are limited by different factors, such as RAM, CPU, bandwidth, or disk speed, to name a few. In this article, I'll show you how to test your way to an understanding of how your application consumes resources.

At some point in the development cycle, preferably early, it makes good sense to narrow down which factors limit your application the most. It's also useful to flip that statement around and ask yourself: which hardware improvements will benefit your overall performance the most? The answer to that second question is probably the most important input you need for good resource planning.

To demonstrate the concept of vertical scalability testing (or hardware sensitivity testing), I've set up a very simple WordPress 3.8.1 installation and will examine how performance varies with changes in hardware. The tests are made using virtual machines, where hardware changes are easy to make. I've created a simple but somewhat credible user scenario using the Load Impact User Scenario Recorder for Chrome.

The simulated users will:

  •  Surf to the test site
  •  Use the search box to search for an article
  •  Surf to the first hit in the search results
  •  Go back to the home page
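
As a sketch, the recorded steps can be expressed as the ordered list of URLs one simulated user fetches. The base URL, search term and hit path below are hypothetical, and WordPress's standard `?s=` search parameter is assumed:

```python
from urllib.parse import urlencode

def scenario_urls(base_url, search_term, first_hit_path):
    """Ordered list of URLs for one pass through the user scenario:
    home page -> search -> first search hit -> back to the home page."""
    return [
        base_url,                                         # surf to the test site
        base_url + "/?" + urlencode({"s": search_term}),  # use the search box
        base_url + first_hit_path,                        # open the first hit
        base_url,                                         # go back to the home page
    ]

urls = scenario_urls("http://wp.example.test", "scalability", "/?p=42")
```

A real load generator would also fetch each page's static resources and insert sleep times between steps; this only captures the page sequence.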

The baseline configuration is very conservative:

  • CPU: 1 core
  • RAM: 128 Mb
  • Standard (spinning) disks

The test itself is a basic ramp-up test going from 0 to 50 concurrent users. Based on experience from previous tests with WordPress, a low-powered server like this should not be able to handle 50 concurrent users running stock WordPress. The idea is to run the test until we start seeing failures; the longer it takes before we see failures, the better. In the graph below, the green line is the number of simulated users, the blue line is the average response time, and the red line is the failure rate, measured as the number of failed requests per second. As you can see, the first failed requests are reported at 20 concurrent users.
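
The ramp-up itself is just linear interpolation; a minimal sketch, assuming a hypothetical ten-minute ramp duration:

```python
def users_at(t_seconds, ramp_seconds=600, max_users=50):
    """Number of concurrent simulated users at time t for a linear
    0 -> max_users ramp over ramp_seconds."""
    if t_seconds <= 0:
        return 0
    if t_seconds >= ramp_seconds:
        return max_users
    return max_users * t_seconds // ramp_seconds
```

With these assumed numbers, the load level where the baseline first failed (20 users) would be reached four minutes in.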

[Graph: baseline configuration]

A comment on the response times (blue line) going down: at a high enough load, nearly 100% of all responses are error messages. Typically, the error happens early in the request and no real work is carried out on the server. So don't be fooled by falling response times as we add load; it just means that the server is quick to generate an error.
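
The arithmetic behind this is a simple weighted average. With made-up timings (say errors return in 50 ms while real pages take 2 s), the mean response time falls as the error fraction grows:

```python
def mean_response_time(error_fraction, error_ms=50.0, page_ms=2000.0):
    """Average response time when error_fraction of responses are fast errors."""
    return error_fraction * error_ms + (1.0 - error_fraction) * page_ms

# 0% .. 100% errors, in steps of 10%
curve = [mean_response_time(f / 10) for f in range(11)]
```

This is why the blue line trends downward precisely when the server is failing hardest.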


RAM sensitivity

First, I'm interested to see how performance varies with available RAM. I've made the point in previous articles that many PHP-based web applications are surprisingly hungry for RAM. So let's see how our baseline changes with increased RAM:

At 256 Mb RAM (2x baseline):

[Graph: 256 Mb RAM]

At 512 Mb RAM (4x baseline):

[Graph: 512 Mb RAM]


That's quite a nice correlation. We see that the number of simulated users that can be handled without failures moves higher and higher. At 1024 Mb RAM (8x baseline), we actually don't get any errors at all:

[Graph: 1024 Mb RAM]

Also note that before the WordPress server spits out errors, there's a clear indication in the response times. At a light load, any configuration manages a response time of about 1 s, but as the load increases and we near the point where we see errors, response times have already gone up.


Sensitivity to CPU cores

The next angle is CPU core sensitivity. With more CPU available, things should move faster, right? RAM has been reset to 128 Mb, but now I'm adding CPU cores:

Using 2 CPU cores (2x baseline):

[Graph: 2 CPU cores]

Oops! As you can see, this is fairly close to the baseline. The first errors start happening at 20 concurrent users, so more CPU couldn't do anything to help once we've run out of memory. For the sake of completeness, using 4 CPU cores shows a tiny improvement: the first errors appear at 23 concurrent users instead of 20.

Using 4 CPU cores (4x baseline):

[Graph: 4 CPU cores]

Adding more CPU cores doesn’t seem to be my highest priority.


Next step: mixing and matching

You've probably already figured out that 128 Mb of RAM is too little to host a stock WordPress installation. We've discussed WordPress specifically before, and this is not the first time we've found that WordPress is hungry for RAM. But that wasn't the point of this article. Rather, I wanted to demonstrate a structured approach to resource planning.

In a more realistic scenario, you'd be looking for a balance between RAM, CPU and other resources. Rather than relying on various 'rules of thumb' of varying quality, performing the actual measurements is a practical way forward. Using a modern VPS host that lets you mix and match resources, it's quite easy to perform these tests. So the next step is yours.

My next step will be to throw faster disks (SSDs) into the mix. Both Apache/PHP and MySQL benefit greatly from running on SSDs, so I'm looking forward to seeing those numbers.

Comments, questions or criticism? Let us know by posting a comment below:

——-

This article was written by Erik Torsner. Erik is based in Stockholm, Sweden, and splits his time between technical writing and managing customer projects in system development at his own company. Erik co-founded the mobile startup EHAND in the early 2000s and later moved on to work as a technology advisor and partner at the investment company that seeded Load Impact. Since 2010, Erik has managed Torgesta Technology. Read more about Erik on his blog at http://erik.torgesta.com or on Twitter @eriktorsner.


What to Look for in Load Test Reporting: Six Tips for Getting the Data you Need

Looking at graphs and test reports can be a befuddling and daunting task. Where should I begin? What should I be looking for? How is this data useful or meaningful? Here are some tips to steer you in the right direction when it comes to managing load test results.

For example, the graph (above) shows how the load times (blue) increase [1] as the service reaches its maximum bandwidth (red) limit [2], and subsequently how the load time increases even more as bandwidth drops [3]. The latter phenomenon occurs due to 100% CPU usage on the app servers.

When analyzing a load test report, here are the types of data to look for:

  • What's the user scenario design like? How much time is allocated within the user scenario? Are the simulated users geographically spread?

  • Test configuration settings: is it ramp-up only or are there different steps in the configuration?

  • While looking at the test results, do you get an exponentially growing (x²) curve? Or an initial downward trend that plateaus (a linear/straight line) before diving drastically?

  • What does the bandwidth/requests-per-second curve look like?

  • For custom reporting and post-test management, can you export your test results to CSV format for further data extraction and analysis?

Depending on how your user scenarios are laid out, how much time is spent within a particular user scenario across all actions (calculated as the total amount of sleep time), and how the users are geographically spread, you will likely end up looking at different metrics. However, below are some general tips to ensure you're getting, and correctly interpreting, the data you need.

Tip #1: In cases of very long user scenarios, it would be better to look at a single page or object rather than the “user load time” (i.e. the time it takes to load all pages within a user scenario excluding sleep times).

Tip #2: Even though “User Load Time” is a good indicator for identifying problems, it is better to dig in deeper by looking at individual pages or objects (URL) to get a more precise indication of where things have gone wrong. It may also be helpful to filter by geographic location as load times may vary depending on where the traffic is generated from.

Tip #3: If you have a test-configuration with a constant ramp-up and during that test the load time suddenly shoots through the roof, this is a likely sign that the system got overloaded a bit earlier than the results show. In order to gain a better understanding of how your system behaves under a certain amount of load, apply different steps in the test configuration to allow the system to calm down for approximately 15 minutes. By doing so, you will be able to obtain more and higher quality samples for your statistics.
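
One way to sketch such a stepped configuration is as a flat schedule of (users, duration) segments; the user levels and the 15-minute hold below are illustrative values, not recommendations:

```python
def step_schedule(levels, hold_seconds=900):
    """Expand target user levels into (users, duration_seconds) steps,
    holding each level long enough for the system to settle."""
    return [(users, hold_seconds) for users in levels]

schedule = step_schedule([10, 20, 30, 40, 50])
total_seconds = sum(duration for _, duration in schedule)
```

Holding each level steady gives you many samples at a known load, which is what makes the statistics meaningful.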

Tip #4: If you notice load times increasing and then suddenly starting to drop, your service might be delivering errors with "200 OK" responses, which would indicate that something may have crashed in your system.
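
A quick way to catch this failure mode is to treat a 200 response as failed whenever its body matches known error text. The marker strings here are examples, not an exhaustive list:

```python
ERROR_MARKERS = ("fatal error", "database error", "service unavailable")

def is_failed_response(status_code, body):
    """Flag responses that claim success (200) but carry an error page,
    plus ordinary HTTP error statuses."""
    if status_code >= 400:
        return True
    lowered = body.lower()
    return status_code == 200 and any(m in lowered for m in ERROR_MARKERS)
```

In practice you'd tune the markers to whatever your stack emits when it falls over.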

Tip #5: If you get an exponential (x²) curve, you might want to check on the bandwidth or requests-per-second. If it’s decreasing or not increasing as quickly as expected, this would indicate that there are issues on the server side (e.g. front end/app servers are overloaded). Or if it’s increasing to a certain point and then plateaus, you probably ran out of bandwidth.

Tip #6: To easily identify the limiting factor(s) in your system, you can add a Server Metrics Agent, which reports performance metrics from your servers. Furthermore, you can export or download the whole test data set, containing all the requests made during the test as well as the aggregated data, and then import it into MySQL, or whichever database you prefer, for further querying.
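
As a sketch of the export-and-query step, here is the same idea using only the Python standard library and an in-memory SQLite database instead of MySQL; the CSV column names are hypothetical:

```python
import csv
import io
import sqlite3

# A tiny stand-in for an exported results file (hypothetical columns)
RESULTS_CSV = """url,status,load_time_ms
/,200,812
/?s=test,200,1430
/,500,55
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (url TEXT, status INTEGER, load_time_ms REAL)")
rows = [(r["url"], int(r["status"]), float(r["load_time_ms"]))
        for r in csv.DictReader(io.StringIO(RESULTS_CSV))]
conn.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)

# Average load time and failure count per URL
stats = conn.execute(
    "SELECT url, AVG(load_time_ms), SUM(status >= 400) "
    "FROM results GROUP BY url ORDER BY url"
).fetchall()
```

Once the raw requests are in a database, per-URL aggregates like these are one GROUP BY away.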

In a nutshell, the ability to extract information from load test reports allows you to understand and appreciate what is happening within your system. To reiterate, here are some key factors to bear in mind when analyzing load test results:

  • Check Bandwidth

  • Check load time for a single page rather than user load time

  • Check load times for static objects vs. dynamic objects

  • Check the failure rate

  • For Server Metrics – check CPU and Memory usage status

——-

 

This article was written by Alex Bergvall, Performance Tester and Consultant at Load Impact. Alex is a professional tester with extensive experience in performance testing and load testing. His specialities include automated testing, technical function testing, functional testing, test case creation, accessibility testing, benchmark testing and manual testing.

Twitter: @AlexBergvall

About Load Impact

Load Impact is the leading cloud-based load testing software trusted by over 123,000 website, mobile app and API developers worldwide.

Companies like JWT, NASDAQ, The European Space Agency and ServiceNow have used Load Impact to detect, predict, and analyze performance problems.
 
Load Impact requires no download or installation, is completely free to try, and users can start a test with just one click.
 
Test your website, app or API at loadimpact.com
