30Oct / 2013

HealthCare.gov tech taskforce is invited to use our load testing services free of charge

We’re offering to provide the technology taskforce responsible for fixing the troubled HealthCare.gov website free use of our performance testing services until the Obamacare website is functioning at full capacity.

In testimony before the House Energy and Commerce Committee on Thursday, officials of companies hired to create the HealthCare.gov website cited a lack of testing on the full system and last-minute changes by the federal agency overseeing the online enrollment system as the primary cause of problems plaguing the government exchange for President Barack Obama’s signature health care reforms.

Moreover, according to a confidential report obtained by CNN, the testing timeframes for the site were “not adequate to complete full functional, system, and integration testing activities” and described the impact of the problems as “significant.” The report stated there was “not enough time in schedule to conduct adequate performance testing” and was given the highest priority.

We know that there’s really nothing new in the failure of the Obamacare site. Websites have been developed that way for years – often with the same results. But there are now new methodologies and tools changing all that. That’s why we’ve reached out to our California representatives and all of the companies involved to let them know we’re ready to provide our stress testing services to them free of charge.

It isn’t like it used to be – this shouldn’t be hard, time consuming or expensive. You just need to recognize that load testing is something that needs to be done early and continuously throughout the development process. It’s not optional anymore. Unfortunately, it seems they found that out the hard way. But we sincerely want to help make it work.

HealthCare.gov represents hope for many Americans, and the elimination of their worst fears in medical care. Instead of whining about how incompetently HealthCare.gov has been built, we want to be part of making it work as it should and can.

28Oct / 2013

Performance Testing Versus Performance Tuning

Performance testing is often mistaken for performance tuning. The two are related, but they are certainly not the same thing. To see what these differences are, let’s look at a quick analogy.

Most governments mandate that you bring your vehicles to the workshop for an inspection once a year. This is to ensure that your car meets the minimum safety standards that have been set to ensure it is safe for road use. A website performance test can be likened to a yearly inspection – It ensures that your website isn’t performing terribly and should perform reasonably well under most circumstances.

When the inspection shows that the vehicle isn’t performing up to par, we start running through a small series of checks to see how to get the problem solved in order to pass the inspection. This is similar to performance tuning, where we shift our focus to discovering what is necessary to making the application perform acceptably.

Looking in depth at the performance test results helps you to narrow down the problematic spots so you can identify your bottlenecks quicker. This in turn helps you to make optimization adjustments cost and time efficient.

Then we have the car enthusiasts. This group constantly works toward tuning their vehicle for great performance. Their vehicles have met the minimum performance criteria, but their goal now is to probably make their car more energy-efficient, or perhaps to run faster. Performance tuning goals are simply that – You might be aiming to reduce the amount of resources consumed to decrease the volume of hardware needed, and/or to get your website to load resources quicker.

Next, we will talk about the importance of establishing a baseline when doing performance tuning.

Tuning your website for consistent performance

Now that we know the difference between performance testing and performance tuning, let’s talk about why you will need a controlled environment and an established baseline prior to tuning web applications.

The importance of a baseline: Tuning your web application is an iterative process. There might be several factors contributing to poor website performance, and it is recommended to make optimization adjustments in small steps in a controlled test environment. Baselines help to determine whether an adjustment to your build or version improves or declines performance. If the conditions of your environment are constantly changing or too many large changes are made at once, it will be difficult to see where the impact of your optimization efforts come from.

To establish a baseline, try tracking specific criteria such as page load times, bandwidth, requests per second, memory and CPU usage. Load Impact’s server metrics helps to combine all these areas in a single graph from the time you run your first performance test. Take note of how these changes improve or degrade when you make optimization improvements (i.e. if you have made hardware upgrades).

Remember that baselines can evolve over time, and might need to be redefined if changes to the system have been made since the time the baseline was initially recorded.If your web application is constantly undergoing changes and development work, you might want to consider doing small but constant tests prior to, for instance, a new fix being integrated or a new version launch.

As your product development lifecycle changes, so will your baseline. Hence, doing consistent testing prior to a release helps save plenty of time and money by catching performance degradation issues early.

There is an increasing number of companies adopting a practice known as Continuous Integration. This practice helps to identify integration difficulties and errors through a series of automated checks, to ensure that code deployment is as smooth and rapid as possible.

If this is something that your company already practices, then integrating performance tuning into your product delivery pipeline might be as simple as using Load Impact’s Continuous Delivery Jenkins plugin. A plugin like this allows you to quickly integrate Jenkins with our API to allow for automated testing with a few simple clicks.

By Chris Chong (@chrishweehwee)

Permalink Leave a comment

07Oct / 2013

Website Owners’ Overestimation of User Capacity by 3.4 Times Kills Profits and Customer Retention

An e-commerce website that grinds to halt simply because there are too many customers attempting to gain access at one time is akin to a supermarket with no parking spaces and isles so narrow that only one shopper can enter at a time, while the rest sit outside waiting to enter and make a purchase.

Few website owners would accept such poor performance and potential loss of revenue, and even fewer consumers would tolerate the waiting time. Most would just move on to another site where service is speedy and meets their expectations.

Thankfully a small, but growing, number of website owners have realized that performance management and capacity monitoring are imperative for delivering even a satisfactory customer experience, let alone an exceptional one.

It’s no coincident that roughly 30% of 500 website owners surveyed for Load Impact’s 2013 State of Web Readiness report which includes data from performance tests of over 5,000 websites, claim to have no stability or performance problems, while about 30% of respondents also said they regularly do preventive load testing before technical changes. Those who foresee the problem take the necessary preventive steps. On the flip side, while nearly 90% of respondents said short response time is either important or very important, 23% of respondents said they don’t monitor the response time on their site at all.

SoWR- Graph-Stability Problems

How can such a gap exist when it’s so obvious that optimum performance leads to higher levels of customer satisfaction, increased conversion and greater revenue?

For the less mature e-commerce sites, current performance problems can be seen as both an opportunity and a threat. While some e-retailers are already far ahead, having scheduled load tests to maintain a sub 2 second response time (77% of all respondents believe response times should be less than 2 seconds), most haven’t even come close to realizing they have a problem. An analysis of over 5,000 load tests revealed that the average website owner overestimates capacity by 3.4 times. In fact, the average page load speed for the e-commerce sites analyzed was closer to 8 seconds – nearly twice the average latency of non e-commerce sites studied.

SoWR- Graph-Response Times

Clearly, big rewards can be reaped by making even small changes to website performance. A 2009 experiment by Shopzilla revealed that decreasing latency by 5 seconds (from 7 seconds to 2 seconds) increased page views by 25% and revenue by 7% to12%. And, according to SEO expert, Jason DeMers, load speed is one of the growing factors in Google’s ranking algorithm.

The Internet giants caught on to the issue of load testing and performance management long ago. Authors and consultants have written books about it, held conferences, and written blogs for years. Google even officially favors fast web sites in its search results, indirectly punishing low performers.

So why are so many e-retailers so slow to catch on about the importance of performance stability?

According to the 2013 State of Web Readiness report, lack of resources is identified as the No.1 reason for failing to monitor and optimize performance levels. However, this is only a part of the problem.

The real explanation has more to do with striking the right balance between functionality, performance and resources, and the fact that, more often than not, optimizing two of the three means sacrificing the third. Therefore, it is often the misallocation of resources that explains a site’s poor performance. Money and time that should have been spent monitoring capacity and load speed instead went to adding additional, often frivolous, functionality.

The lack of sufficient investment in performance management is extremely common. In some ways, buying performance management services is a bit like buying insurance, you understand why you need it, but if all goes as planned you don’t actually get a chance to see the value.

Being forward thinking enough to buy something so intangible is tough.

Other fundamental website issues, such as security, have slowly climbed the ladder of being class A requirements. Even the most technically illiterate now steer clear of any feature that comes with a security concern. From our vantage point, the time has come to give performance and stability management the same time and attention – if for no other reason than it’s simply smart business.

—————————–

Read or share this infographic based on our study’s findings.

Permalink 1 Comment

04Sep / 2013

Different types of website performance testing – Part 3: Spike Testing

This is the third of a series of posts describing the different types of web performance testing. In the first post, we gave an overview of what load testing is about and the different types of load tests available. Our second post gave an introduction to load testing in general, and described what a basic ramp-up schedule would look like.

We now we move on to spike testing. Spike testing is another form of stress testing that helps to determine how a system performs under a rapid increase of workload. This form of load testing helps to see if a system responds well and maintains stability during bursts of concurrent user activity over varying periods of time. This should also help to verify that an application is able to recover between periods of sudden peak activity.

So when should you run a spike test?

The following are some typical example case scenarios where we see users running a spike test, and how your load schedule should be configured in Load Impact to emulate the load.

Advertising campaigns

Advertising campaigns are one of the most common reasons why people run load tests. Why? Well, take a lesson from Coca Cola – With an ad spend of US$3.5 million for a 30 second Superbowl commercial slot (not including customer cost), it probably wasn’t the best impression to leave for customers who flooded to their Facebook app.. and possibly right into Pepsi’s arms. If you’re expecting customers to flood in during the ad campaign, ramping up in 1-3 minutes is probably a good idea. Be sure to hold the load time for at least twice the amount of time it takes users to run through the entire scenario so you get accurate and stable data in the process.

Contests

Some contests require quick response times as part of the challenge issued to users. The problem with this is that you might end up with what is almost akin to a DDOS attack every few minutes. A load schedule comprising of a number of sudden spikes would help to simulate such a situation.

TV screenings/Website launches

If you’re doing a live stream of a very popular TV show (think X Factor), you might want to consider getting a load test done prior to the event. Slow streaming times or a website crash is the fastest way to drive your customers to the next streaming app/online retailer available. Take Click Frenzy as an example – they’re still working to recover their reputation. Streaming servers also tend to be subject to prolonged stress when many users all flock to watch an event or show, so we recommend doing a relatively quick ramp up and ending with a long holding time.

Ticket sales

Remember the 2012 London Olympics? Thousands of frustrated sports fans failed to get tickets to the events they wanted. Not only was it a waste of time for customers, but it also served to be a logistics nightmare for event organizers. Bearing in mind that a number of users would be ‘camping’ out on the website awaiting ticket launch, try doing a two stage quick ramp up followed by a long holding time to simulate this traffic.

TechCrunched… literally!

If you are trying to get yourself featured on TechCrunch (or any similar website that potentially might generate a lot of readership), it’s probably a good idea to load test your site to make sure it can handle the amount of traffic. It wouldn’t do to get so much publicity and then have half of them go away with a bad taste in their mouth! In these cases, traffic tends to come in slightly slower and in more even bouts over longer periods of time. A load schedule like this would probably work better:

Secondary Testing

If your test fails at any one point of time during the initial spike test, one of the things you might want to consider doing is a step test. This would help you to isolate where the breaking points are, which would in turn allow you to identify bottlenecks. This is especially useful after a spike test, which could ramp up too quickly and give you an inaccurate picture of when your test starts to fail.

That being said, not all servers are built to handle huge spikes in activity. Failing to handle a spike test does not mean that these same servers cannot handle that amount of load. Some applications are only required to handle large constant streams of load, and are not expected to face sharp spikes of user activity. It is for this reason that Load Impact automatically increases ramp-up times with user load. These are suggested ramp up timings, but you can of course adjust them to better suit your use case scenario.

01Feb / 2013

Node.js vs PHP – using Load Impact to visualize node.js efficiency

It could be said that Node.js is the new darling of web server technology. LinkedIn have had very good results with it and there are places on the Internet that will tell you it can cure cancer.

In the mean time, the old work horse language of the Internet, PHP, gets a steady stream of criticism. and among the 14k Google hits for “PHP sucks” (exact term), people will say the most funny terrible things about the language while some of the critique is actually quite well balanced. Node.js introduces at least two new things (for a broader audience). First, the ability to write server side JavaScript code. In theory this could be an advantage since JavaScript is more important than ever on the client side and using the same language on server and browser would have many benefits. That’s at least quite cool.

The other thing that makes Node.js different is that it’s completely asynchronous and event driven. Node is based on the realization that a lot of computer code actually just sits idle and wait for I/O most of the time, like waiting for a file to be written to disk or for a MySQL query to return data. To accomplish that, more or less every single function in Node.js is non-blocking.

When you ask for node to open a file, you don’t wait for it to return. Instead, you tell node what function to pass the results to and get on with executing other statements. This leads to a dramatically different way to structure your code with deeply nested callbacks and anonymous function and closures. You end up with something like this:

doSomething(val, function(err,result){
  doSomethingElse(result,function(err,res){
    doAbra();
    doKadabra(err, res, function() {
      ...
      ...
    });
  });
});

It’s quite easy to end up with very deep nesting that in my opinion sometimes affects code readability in a negative way. But compared to what gets said about PHP, that’s very mild critique. And.. oh! The third thing that is quite different is that in Node.js, you don’t have to use a separate http(s) server. It’s quite common to put Node.js behind a Nginx, but that’s not strictly needed. So the heart of a typical Node.js web application is the implementation of the actual web server.

A fair way to compare

So no, it’s not fair to say that we compare Node.js and PHP. What we really compare is Node.js and PHP+Apache2 (or any other http server). For this article, I’ve used Apache2 and mod_php since it’s by far the most common configuration. Some might say that I’d get much better results if I had used Nginx or Lighthttpd as the http server for PHP. That’s most likely very true, but at the end of the day, server side PHP depends on running in multiple separate processes. Regardless if we create those processes with mod_php or fastcgi or any other mechanism. So, I’m sticking with the standard server setup for PHP and I think that makes good sense.

The testing environment

So we’re pitting PHP+Apache2 against a Node.js based application. To keep things reasonable, I’ve created a very (really, very) simple application in both PHP5 and Node.js. The application will get 50 rows of data from a WordPress installation and output it as a json string. That’s it, nothing more. The benefit of keeping it this simple was (a) that I didn’t have to bother about too many implementation details between the two languages and (b) more important that we’re not testing my ability to code, we’re really testing the difference in architecture between the two. The server we’re using for this test is a virtual server with:

1 x Core Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
2 Gb RAM.
OS is 64 Bit Ubuntu 12.10 installed fresh before running these tests.
We installed the Load Impact Server metric agent.

For the tests, we’re using:

Apache/2.2.22 and
PHP 5.4.6.
Node.js version 0.8.18 (built using this script)
MySQL is version 5.5.29.
The data table in the tests is the options table from a random WordPress blog.

The scripts we’re using:

Node.js (javascript):

// Include http module,
var http = require('http'),
mysql = require("mysql");

// Create the connection.
// Data is default to new mysql installation and should be changed according to your configuration.
var connection = mysql.createConnection({
   user: "wp",
   password: "****",
   database: "random"
});

// Create the http server.
http.createServer(function (request, response) {
   // Attach listener on end event.
   request.on('end', function () {

      // Query the database.
      connection.query('SELECT * FROM wp_options limit 50;', function (error, rows, fields) {
         response.writeHead(200, {
            'Content-Type': 'text/html'
         });
         // Send data as JSON string.
         // Rows variable holds the result of the query.
         response.end(JSON.stringify(rows));
      });
   });
// Listen on the 8080 port.
}).listen(8080);

PHP code:

<!--?php $db = new PDO('mysql:host=localhost;dbname=*****',     'wp',     '*****'); $all= $db--->query('SELECT * FROM wp_options limit 50;')->fetchAll();
echo json_encode($all);

The PHP script is obviously much shorter, but on the other hand it doesn’t have to implement a full http server either.

Running the tests

The Load Impact test configurations are also very simple, these two scripts are after all typical one trick ponies, so there’s not that much of bells and whistles to use here. To be honest, I was surprised how many concurrent users I had to use in order to bring the difference out into the light. The test scripts had the following parameters:

The ramp up went from 0-500 users in 5 minutes
100% of the traffic comes from one source (Ashburn US)
Server metrics agent enabled

The graphics:

On the below images. the lines have the following meanings:

Green line: Concurrent users
Blue line: Response time
Red line: Server CPU usage

Node.js up to 500 users.

The first graph here shows what happens when we load test the Node.js server. The response time (blue) is pretty much constant all through the test. My back of a napkin analysis of the initial outliers is that they have to do with a cold MySQL cache. Now, have a look at the results from the PHP test:

Quite different results. It’s not easy to see on this screen shot, but the blue lines is initially stable at 320 ms response time up to about 340 active concurrent users. After that, we first see a small increase in response time but after additional active concurrent users are added, the response time eventually goes through the roof completely.

So what’s wrong with PHP/Apache?

Ok, so what we’re looking at is not very surprising, it’s the difference in architecture between the two solutions. Let’s think about what goes on in each case.

When Apache2 serves up the PHP page it leaves the PHP execution to a specific child process. That child process can only handle one PHP request at a time so if there are more requests than than, the others have to wait. On this server, there’s a maximum of 256 clients (MaxClients) configured vs 150 that comes standard. Even if it’s possible to increase MaxClients to well beyond 256, that will in turn give you a problem with internal memory (RAM). At the end, you need to find the correct balance between max nr of concurrent requests and available server resources.

But for Node, it’s easier. First of all, in the calm territory, each request is about 30% faster than for PHP, so in pure performance in this extremely basic setup, Node is quicker. Also going for Node is the fact that everything is in one single process on the server. One process with one active request handling thread. So thre’s no inter process communication between different instances and the ‘mother’ process. Also, per request, Node is much more memory efficient. PHP/Apache needs to have a lot of php and process overhead per concurrent worker/client while Node will share most of it’s memory between the requests.

Also note that in both these tests, CPU load was never a problem. Even if CPU loads varies with concurrent users in both tests it stays below 5% (and yes, I did not just rely on the graph, I checked it on the server as well). (I’ll write a follow up on this article at some point when I can include server memory usage as well). So we haven’t loaded this server into oblivion in any way, we’ve just loaded it hard enough for the PHP/Aapache architecture to start showing some of it’s problems.

So if Node.js is so good…

Well of course. There are challenges with Node, both technical and cultural. On the technical side, the core design idea in Node is to have one process with one thread makes it a bit of a challenge to scale up on a multi core server. You may have already noted that the test machine uses only one core which is an unfair advantage to Node. If it had 2 cores, PHP/Apache would have been able to use that, but for Node to do the same, you have to do some tricks.

On the cultural side, PHP is still “everywhere” and Node is not. So if you decide to go with Node, you need to prepare to do a lot more work yourself, there’s simply nowhere near as many coders, web hotels, computer book authors, world leading CMS’es and what have you. With PHP, you never walk alone.

Conclusion

Hopefully, this shows the inherit differences in two different server technologies. One old trusted and one young and trending. Hopefully it’s apparent that your core technical choices will affect your server performance and in the end, how much load you can take. Designing for high load and high scalability begins early in the process, before the first line of code is ever written.

And sure, in real life, there are numerous of tricks available to reduce the effects seen here. In real life, lots of Facebook still runs on PHP.

Permalink 21 Comments

Newer posts →

Load Impact Blog

Tag Archives: Performance testing