When to Leverage Commercial Load Testing Services, and When to Go it Alone

How and where you execute load and performance testing is a decision that depends on a number of factors in your organization and even within the application development team.

It is not a clear-cut decision that can be made based on the type of application or the number of users; it must be made in light of organizational preferences, development cadence, timeline and, of course, the nature of the application itself and the technical expertise currently on staff.

In this post we will provide some context around the key decision points that companies of all sizes should consider when putting together load and performance testing plans.

At its core, this discussion is a comparison of three approaches: on-premise tools, SaaS/open-source solutions and commercial services.

In the load testing space there are commercial vendors offering both SaaS and on-premise solutions, as well as many SaaS-only services for generating user load.

From an open source perspective, JMeter is the obvious choice (there are other, less popular options such as FunkLoad, Gatling, The Grinder and SoapUI). Having said that, let’s look at the advantages and challenges of the open source solution, JMeter, and contrast it with a cloud-based commercial offering.

Key JMeter Advantages:

  1. It is a 100% Java application, so it can be run on any platform (Windows, OS X, Linux) that can run Java.
  2. Ability to test a variety of types of servers – not just front end HTTP servers.  LDAP, JMS, JDBC, SOAP, FTP are some of the more popular services that JMeter can load test out of the box.
  3. Extensible, plug-in architecture. The open source community is very active in development around JMeter plugins and many additional capabilities exist to extend reporting, graphing, server resource monitoring and other feature sets.  Users can write their own plugins if desired as well.  Depending on how much time and effort is spent there is little that JMeter can’t be made to do.
  4. Other than the time to learn the platform, there is no software cost, of course, since it is open source. This may be of particular value to development teams with limited budgets, or whose management prefers to spend on in-house expertise versus commercial tools.
  5. It can be easy to point the testing platform at a development server and not have to engage the network or server team to provide external access for test traffic.  It’s worth noting that while this is easier it is also less realistic in terms of real world results.

Key JMeter Disadvantages:

  1. Because it is open source, you do not have an industry vendor to rely upon for support, development or expertise. This doesn’t mean that JMeter isn’t developed well or that the community isn’t robust – quite the opposite. But depending on the scope of the project and the visibility of the application, it can be very helpful to have industry expertise available and obligated to assist. Putting myself in a project manager’s shoes: if a major scale issue were discovered in production, would I be comfortable telling upper management, “we thoroughly tested the application with an open source tool, with assistance from forums and mailing lists”?
  2. It’s very easy to end up with test results that aren’t valid. The results may be highly reliable – but reliably measuring bottlenecks that have nothing to do with the application infrastructure isn’t terribly useful. Since JMeter can be run right from a desktop workstation, you can quickly run into network and CPU bottlenecks on the testing platform itself – ultimately giving you unrealistic results (see the sketch after this list).
  3. Large-scale tests are not in JMeter’s wheelhouse. Right in the documentation (section 16.2 of the best practices) is a warning about limiting the number of threads. If a truly large-scale test is required, you can build a farm of test servers orchestrated by a central controller, but this gets complicated quickly, requires dedicated hardware and network resources, and still isn’t a realistic real-world scenario anyway.
  4. The biggest disadvantage is inherent in all on-premise tools in this category: it is not cloud based. Unless you are developing an in-house application whose users are all on the LAN, it does not make a ton of sense to rely (entirely) on test results from inside your network. I’m not suggesting they aren’t useful, but if users are geographically distributed then testing in that mode should be considered.
  5. Your time: doing everything yourself is a trap many smart folks fall into, often at the expense of project deadlines and focus. Your time is valuable, and in most cases it could be better spent somewhere else.
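To make the client-side bottleneck from point 2 concrete, here is a minimal sketch (plain Python, not JMeter) of a naive single-machine load generator. Every simulated user competes for the same workstation CPU and network link, so beyond some thread count you start measuring your own machine rather than the application. The target URL is hypothetical.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

TARGET = "http://test.example.com/"  # hypothetical test server
THREADS = 200                        # client-side limits grow with this number

def one_request(_):
    """Fetch the target once and return the elapsed time, or None on error."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(TARGET, timeout=10) as resp:
            resp.read()
        return time.monotonic() - start
    except OSError:
        return None

with ThreadPoolExecutor(max_workers=THREADS) as pool:
    timings = [t for t in pool.map(one_request, range(THREADS)) if t is not None]

if timings:
    print(f"{len(timings)} requests ok, avg {sum(timings)/len(timings):.2f}s")
```

Push THREADS high enough and the average climbs even when the server is idle – at that point the generator, not the application, has become the bottleneck.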

This discussion really boils down to whether you like to do things yourself, or whether the project’s scope and criticality dictate using commercial tools and expertise.

For the purposes of general testing, getting familiar with how load testing works and rough order-of-magnitude sizing, you can certainly use open source tools on your own – with the caveats mentioned. If the application is likely to scale significantly or have geographically distributed users, then I do think using a cloud-based service is a much more realistic way to test.

Beyond the decision of open source versus commercial tools is the question of whether professional consulting services should be engaged. Testing should be an integral part of the development process, and many teams do not have the expertise (or time) to develop a comprehensive test plan, script and configure the test, analyze the data and finally sort out remediation strategies on their own.

This is where engaging experts who are 100% focused on testing can provide real tangible value and ensure that your application scales and performs exactly as planned.

A strategy I have personally seen work quite well with a variety of complex technologies is to engage professional services and training at the outset of a project to develop internal capabilities and expertise, allowing the organization to extract maximum value from the commercial product of choice.

I have always recommended that my customers budget for training and services up front with any product purchase instead of trying to shoehorn them in later. This ensures the new capabilities promised by the commercial product are realized, and that management is satisfied with the product’s value and the vendor relationship.

——

This post was written by Peter Cannell. Peter has been a sales and engineering professional in the IT industry for over 15 years. His experience spans multiple disciplines including Networking, Security, Virtualization and Applications. He enjoys writing about technology and offering a practical perspective on new technologies and how they can be deployed. Follow Peter on his blog or connect with him on LinkedIn.


Load Testing Validates Performance Benefits of CDN – 400% Improvement (CASE STUDY)

Ushahidi used Load Impact to greatly improve the performance of its software. By comparing “before” and “after” test results, it was possible to see the performance impact of optimization efforts – like the use of a CDN.

Ushahidi is a non-profit tech company that specializes in developing free and open source software for information collection, visualization and interactive mapping. Such software is deployed during disasters so that real time information can be shared across the web. Like WordPress, the software can be self hosted or hosted on the company’s server.

Case:

Ushahidi software is generally used in crisis and disaster situations, so optimization is absolutely crucial. An earthquake reporting site based on Ushahidi software (http://www.sinsai.info/) received a spike in traffic after the earthquake and tsunami in Japan and went down several times, causing a service outage at the very time it was needed most.

Ushahidi was interested in using a load testing tool to test the performance of its software before and after optimization efforts, to determine what effect the optimizations had had.

Test setup:

Four load tests were run on two different versions of the Ushahidi software, hosted on Ushahidi’s servers. The first two test runs used ramp-up configurations up to 500 concurrent users on the test sites to measure performance differences between Ushahidi 2.0.1 and Ushahidi 2.1. The results were revealing, showing performance graphs that were practically identical: there hadn’t been any change in performance from 2.0.1 to 2.1.

From these tests, it was also found that the theoretical maximum number of concurrent users for Ushahidi on a typical webserver is about 330 clients, though it may be lower depending on configuration. Load times at the 330-client level were very high, however, and defining the largest acceptable page load time to be 10 seconds meant that a more realistic figure would be 100 concurrent users on the typical webserver.

Finally, Ushahidi wanted to measure the potential performance gain when using a CDN (content delivery network). The Ushahidi 2.1 software was modified so that static resources were loaded from Rackspace’s CDN service instead of the Ushahidi server, then the previous load test was executed again.

The result was a major increase in the number of concurrent users the system could handle. Where previous tests had shown a significant slowdown after 60-100 concurrent users, and an absolute max limit of about 330 concurrent users, the CDN-enabled site could handle more than 300 concurrent users before even starting to slow down. To find out the extreme limit of the site with CDN enabled, a final test was run with even higher load levels, and it was found that the server now managed to serve content at load levels up to 1,500 concurrent users, although with the same high load times as in the 330-client case with no CDN.

Service environment:

  • Apache
  • PHP
  • MySQL
  • Linux (CentOS 5.0)

Challenges:

  • Find load limits for 2 different software versions
  • Find load limits with/without CDN enabled for static files
  • Detect potential problems in the infrastructure or web app before they affect customers

Solution:

  • Run ramp-up tests with identical configurations on the 2.0.1 and the 2.1 software. See which one performs better or worse.
  • Run ramp-up tests with identical configurations on the 2.1 software with CDN enabled, and without CDN enabled. See which performs better or worse.
  • Run final, large-volume ramp-up test for the CDN-enabled software, to find out its theoretical maximum concurrent user limit.

Results:

  • Ushahidi found that there was a significant performance gain when using a CDN to serve its static files.
  • The load tests measured a performance increase of 300% – 400% when using the CDN.
  • Load times started to increase only after 334 concurrent users when using the CDN, and the server timed out at around 1,500 concurrent users.
  • Faster time to verify the CDN deployment. The tests also quantified the percentage increase in performance, which helps justify the additional cost of the CDN service.
  • The tests showed no changes in load time between versions 2.0.1 and 2.1.

5 Ways to Better Leverage a DevOps Mindset in Your Organization

The last few years have given rise to the “DevOps” methodology within many organizations both large and small. While definitions vary somewhat, it boils down to this: breaking down silos between developers and operations.

This seems like a common sense approach to running a business, right?

While many organizations do have a DevOps mindset, I find myself regularly talking to IT staff where there is near-zero collaboration between the applications, network and security teams. In highly siloed organizations these teams can actually work against each other and foster significant animosity. Not my idea of an efficient and agile organization!

Industry reports suggest that organizations with a DevOps mindset deploy applications and capabilities significantly faster and with fewer operational issues. According to Puppet Labs:

High performing organizations deploy code 30 times more often, and 8000 times faster than their peers, deploying multiple times a day, versus an average of once a month.

It is extremely important that applications teams are creating code and applications in a way that can be properly supported, managed and operationalized by the business. Here are some tips to best leverage this type of approach in any organization:

1. It’s not (entirely) about tools

Everyone loves to buy new technology and tools. The problem is that products are often only partially deployed, and capabilities go unused and sit on the shelf. And if you think starting to use some new products and tools will make your organization DevOps enabled, think again.

Building a DevOps culture is much more about taking two parts of the organization whose roots are quite different and bringing them together with a shared vision and goal. Think about it: operations looks at change as the reason the last downtime occurred and App-Dev is constantly trying to evolve and elicit disruptive change. No product or tool is going to make this all happen for you. So start with this in mind.

2. Communication and goals are absolutely critical

This is going to sound really obvious and boring, but if your ops and apps teams are not communicating – not working towards a shared set of goals – you have a problem.

Defining the organizational goals in terms of real, concrete objectives that meet the SMART criteria is the right place to start. I’ll bet most organizations do not have goals that meet this level of specificity, so I’ll provide a good and a bad example:

  • Bad goal: “We want to be the leader in mobile code management”
  • Good goal: “We will be the leader in mobile code management by June 30th of 2015 as measured by Gartner’s magic quadrant, with revenues exceeding $25m in 2Q 2015”

See the difference?  Even the casual observer (who doesn’t even know what this fictitious space of mobile code management is) could tell if you met the second goal. Great. Now that we have a real concrete goal the organization can put an action plan in place to achieve those goals.

Communication can be a real challenge when teams have different reporting structures and are in different physical locations. Even if folks are in the same building, regular face-to-face human interaction is really important. It’s certainly easier to send an email or text, but nothing beats in-person interaction at a regular cadence. Collaboration tools will certainly come into play as well – likely what you already have in place, though new DevOps communication tools are coming to market too. But first, start with team meetings and breaking down barriers.

3. Practice makes perfect: continuous integration, testing and monitoring

DevOps is about short-circuiting traditional feedback control mechanisms to speed up all aspects of an application roll-out. This is exactly the opposite of what we typically see in many large software programs – a problem that has been particularly acute, or at least more visible, within large government programs.

Striving for perfection is certainly a worthy goal, but we should really be striving for better.  This means along the way risks will need to be taken, failures will happen and course corrections put in place.  It is important to realize that this whole DevOps change will be uncomfortable at first, but taking the initial steps and perfecting those steps will help build momentum behind the initiative.

Instead of trying to do every possible piece of DevOps all at once, start with one component such as Git and learn how to really manage versioning well. Then start working with cookbooks and even use Chef to deploy Jenkins – cool, eh?

It’s probably also worth noting that training and even hiring new talent could be a key driving factor in how quickly you implement this methodology.

4. Having the right tools helps

Like I said earlier, everyone loves new tools – I love new tools! Since this whole DevOps movement is quite new, you should realize that the marketplace is evolving rapidly. What is hot and useful today may not be what you need tomorrow.

If you already have strong relationships with certain vendors and VAR partners this would be a great time to leverage their expertise in this area (assuming they have it) to look at where gaps exist and where the quick wins are.  If platform automation and consistency of configuration is the right place for the organization to start then going with Chef or Puppet could make sense.

I think the important factors here are:

      • What are your requirements?
      • What do you have the budget to acquire and manage?
      • Do you have partners who can help you with requirements and matching up different vendors or service offerings?

Since this could easily turn into a whole series of blog posts on DevOps tools, I’m not going to go through all the different products out there. But if you can quickly answer the questions above, then get moving and don’t allow the DevOps journey to stall at this phase.

If it’s difficult to figure out exactly what requirements are important or you don’t have good partners to work with, then go partner with some of the best out there or copy what they are doing.

5. Security at the pace of DevOps

What about security? Building in security as part of the development process is critical to ensuring fatal flaws do not permeate a development program. Unfortunately, it is often an afterthought.

Security hasn’t kept pace with software development by any metric, so a fresh look at techniques and tools has to be taken.

Static analysis tools and scanners aren’t terribly effective anymore (if they ever were to begin with). According to Contrast Security’s CTO and founder, Jeff Williams, we should be driving towards continuous application security (a.k.a. Rugged DevOps):

“Traditional application security works like waterfall software development – you perform a full security review at each stage before proceeding. That’s just incompatible with modern software development. Continuous application security (also known as Rugged DevOps) is an emerging practice that revolves around using automation and creating tests that verify security in real time as software is built, integrated, and operated. Not only does this eliminate traditional appsec bottlenecks, but it also enables projects to innovate more easily with confidence that they didn’t introduce a devastating vulnerability.”  – Jeff Williams

While DevOps is all about streamlining IT and bringing new applications to market faster, if you don’t ensure that the application can perform under realistic load, in the way real-world users interact with it, there will be problems.

Likewise if an application is rolled out with security flaws that are overlooked or ignored, it could be game over for not only the business but quite possibly the CEO as well. Just look to Target as a very recent example.

It is clear that an integrated approach to developing applications is valuable to organizations, but if you don’t look at the whole picture – operational issues, performance under load and security – you could find out that DevOps was a fast track to disaster. And obviously no one wants that.

 

—————

This post was written by Peter Cannell. Peter has been a sales and engineering professional in the IT industry for over 15 years. His experience spans multiple disciplines including Networking, Security, Virtualization and Applications. He enjoys writing about technology and offering a practical perspective on new technologies and how they can be deployed. Follow Peter on his blog or connect with him on LinkedIn.

Scenario Testing: Four Tips on How to Manage Effectively

This article was originally written for Software Testing Professionals.

————

Testing software has always been complex. The minute you add more than a handful of features to any system, theoretical complexity skyrockets.

All the buttons to click, links to follow, client browser versions, client bandwidth and what have you, will soon add up to a near infinite number of things you’d need to test. At the same time, actual users will only engage with a small fraction of those features.

But how does one bring some order to this complexity when, barring a few exceptions, 100% test coverage is basically impossible to achieve?

One approach is to turn to scenario testing. This is where you use a real or hypothetical story that describes how a user actually uses the application. It may sound very similar to a test case, but a test case typically covers a single step, whereas scenario tests cover a number of interconnected steps.

A good scenario is one that is based on a credible story of a user performing a complex task. The scenario should be:

  • It should be critical to all stakeholders (e.g. sales, marketing, management, customer support)
  • It should be obvious that the scenario must work as expected
  • It should be easy to evaluate

Scenario testing is primarily thought of as a tool to facilitate feature testing, but performance is sometimes part of that.

Since application performance is very often a non-functional requirement, satisfactory performance is often assumed, and the lack thereof is considered a bug – even if it was never mentioned in the requirements.

Therefore, scenario testing can be used to uncover important performance bottlenecks as well as test features.

Consider this test case for an e-commerce site.

Test case #1: Add valid coupon worth X%
Steps:

  1. Add one or more products to the cart
  2. Go to the checkout page
  3. Add coupon code ‘TEST123’ and click ‘Add coupon’

Expected result:

  • Page refreshes. Message “coupon successfully applied” is visible
  • Discount clearly indicated in the cart summary section
  • Cart total is reduced by X%
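As an aside, here is a minimal sketch of what this test case might look like as an automated check, written in Python with the third-party requests library. The shop URL, endpoints and form fields are all hypothetical, and the 3-second budget is an assumption – tight enough that the 4-5 second refresh mentioned below would fail it.

```python
import time
import requests

BASE = "https://shop.example.com"  # hypothetical e-commerce site
MAX_SECONDS = 3.0                  # assumed budget for the coupon step

session = requests.Session()
session.post(f"{BASE}/cart/add", data={"product_id": 42, "qty": 1})  # step 1
session.get(f"{BASE}/checkout")                                      # step 2

start = time.monotonic()                                             # step 3
resp = session.post(f"{BASE}/checkout/coupon", data={"code": "TEST123"})
elapsed = time.monotonic() - start

assert resp.ok, f"coupon request failed: {resp.status_code}"
assert "coupon successfully applied" in resp.text.lower()
assert elapsed < MAX_SECONDS, f"coupon step took {elapsed:.1f}s"
print("coupon test case passed")
```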

Now, imagine that the test case is performed and confirmed to work during development. One of the testers makes a note saying that sometimes the page refresh takes 4-5 seconds when there are more than 10 products in the cart, but it’s not considered a major issue since it affects a very small number of users.

Now, consider an actual user, Lisa, as she uses the e-commerce site:

Lisa gets a 15% coupon code for an eCommerce site she’s used before and really likes. She decides to order a few things she needs and also asks her mother and sister if they need anything. While she’s shopping, she talks once with her mother and three times with her sister to double-check that she gets the correct amounts, sizes and colors of all items.

After about 20 minutes, Lisa has 20 items worth $900 in her cart. She hits the checkout page, where she enters the discount code. The page seems to be doing ‘something’, but after 5 seconds with no visual feedback, Lisa decides that it’s most likely expected behavior and hits ‘Pay now’ to proceed. She’s a little worried that she can’t see her discount on the screen, but assumes that it will be presented on the emailed receipt.

Five minutes after completing checkout, she receives the receipt and realizes that she didn’t get the discount. At this point, Lisa feels let down and decides to try to cancel the order. Maybe she will try again later, maybe not.

The story of Lisa’s real world shopping experience makes a great base for a test scenario. A credible story of a user performing a complex task. It highlights to relevant stakeholders – like sales, marketing, management, customer support – that it’s important functionality that really needs to work.

It is, of course, possible to write a few test cases that would capture the same performance issue, but by putting the steps into a realistic and credible context, the coupon code response time suddenly stands out as an important issue.

It suddenly becomes easier to spot, and it becomes apparent that, even if it’s a small fraction of all HTTP requests to the server, it will likely seriously affect a customer who wishes to make a rather large transaction. Which, I would like to point out, was the main reason the marketing/sales team wanted to distribute the coupon code in the first place.

Finally, since scenarios are much easier to understand for people outside R&D, it’s easier to involve everyone with an interest in the project. In most organizations, stakeholders such as sales and marketing, customer support and management will find scenarios much easier to grasp than a long (and sometimes boring) list of small test cases.

The challenge is, of course, to find correct and credible stories that both motivate the important stakeholders to participate and, at the same time, cover as much of the application as possible.

Performance testing can benefit from a scenario approach in many ways. One of the most obvious benefits is that creating scenarios helps to highlight the important application flows that must perform well – just as the coupon code scenario above shows.

Test configurations can then be more focused when we know what the most important areas are. And since scenarios are stories that are easier to understand, it’s also easier for non-technical people to be part of the prioritization work, making sure that first things come first.

Another great benefit that can come specifically from performance testing multiple complex scenarios at the same time is that it can unveil dependencies.

Let’s say that one problem area with an e-commerce web application is slow internal search. While that’s a problem on its own, it’s not unlikely that it affects overall database performance. That in turn can affect more important functionality that also uses the database – like registration or checkout.

When applying the concept of scenario testing to your performance testing efforts, here are a few things to keep in mind:

  1. Consider using scenarios in your performance testing. Use tools such as Google Analytics to analyze what paths users take through your site to help you come up with credible and critical scenarios.
  2. Prioritize possible scenarios by thinking how valuable each scenario is. A user browsing your products is good, a user that checks out and pays is better. Make sure you cover the most critical scenarios first by ordering them according to how valuable they are to you.
  3. Consider using Continuous Integration tools such as Jenkins or TeamCity to automate performance scenario testing. An automated test that gives you pass/fail results based on response time is very easy to evaluate (see the sketch after this list).
  4. When the number of scenarios grows, group different ones together based on what part of the system they test. Or group them based on complexity, making sure that all low-complexity tests pass before you run the high-complexity ones.
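Here is a minimal sketch of the pass/fail idea from tip 3: compute a percentile over measured response times and exit non-zero so the CI server (Jenkins, TeamCity, etc.) marks the build as failed. The measure_scenario() helper and the 800 ms budget are placeholders – wire in whatever actually runs your scenario.

```python
import statistics
import sys

BUDGET_P95_MS = 800  # assumed response-time budget

def measure_scenario() -> list[float]:
    """Placeholder: run your scenario and return per-request times in ms.
    Returns canned demo numbers here so the sketch runs end to end."""
    return [320.0, 410.0, 380.0, 950.0, 290.0, 610.0, 450.0, 500.0,
            330.0, 720.0, 400.0, 360.0, 540.0, 470.0, 390.0, 310.0,
            830.0, 440.0, 350.0, 560.0]

timings = measure_scenario()
p95 = statistics.quantiles(timings, n=20)[18]  # 95th percentile

if p95 > BUDGET_P95_MS:
    print(f"FAIL: p95 {p95:.0f} ms exceeds budget of {BUDGET_P95_MS} ms")
    sys.exit(1)  # non-zero exit fails the CI build step
print(f"PASS: p95 {p95:.0f} ms")
```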

———

This post was written by Robin Gustafsson. Robin is currently CTO at Load Impact. Prior to his role as CTO, he held positions as Solutions Architect, Consultant and lead developer for numerous other tech startups and tech firms, including Ericsson. He also owned and operated his own web development company, Popmint Media, from 2002-2005. Robin specializes in performance testing, software architecture, cloud and distributed computing, as well as Continuous Delivery software development.

Mobile Network Emulation – The Key to Realistic Mobile Performance Testing


When was the last time you looked at your website’s browser statistics? If you have, you’ve likely noticed a trend that’s pretty hard to ignore – your users are browsing from mobile devices more than ever before. What was once a small sub-segment of your audience is now growing toward representing the majority of your traffic. This may not be so surprising, since mobile usage today makes up about 15 percent of all Internet traffic. Basically, if you don’t already have a mobile development strategy, you may already be losing sales and users due to poor mobile performance.

Responsive design takes care of your website’s layout and interface, but performance testing for mobile devices makes sure your app can handle hundreds (even thousands) of concurrent users. A small delay in load-time might seem like a minor issue, but slow mobile apps kill sales and user retention. Users expect your apps to perform at the same speed as a desktop app. It seems like a ridiculous expectation, but here are some statistics:

  • If your mobile app fails, 48% of users are less likely to ever use the app again. 34% of users will just switch to a competitor’s app, and 31% of users will tell friends about their poor experience, which eliminates those friends as potential customers. [1]
  • Mobile app development is expected to outpace PC projects by 400% in the next several years. [2]
  • By 2017, over 20,000 petabytes (that’s over 20 million gigabytes!) will be sent using mobile devices. Streaming is the expected primary driver for growth.[3]
  • 60% of mobile failures are due to performance issues and not functional errors. [4]
  • 70% of the performance of a mobile app is dependent on the network. [5]
  • A change in latency from 2ms (broadband) to 400ms (3G network) can cause a page load to go from 1 second to 30 seconds. [6]
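To see how the last statistic is plausible, here is a back-of-envelope sketch. The figure of roughly 75 sequential round trips per page load is an assumption for illustration (covering DNS lookups, TCP handshakes and dependent resources); plug in the two latencies and the 1-second and 30-second numbers fall out.

```python
# Rough model: page load ~= (sequential round trips x latency) + transfer time.
ROUND_TRIPS = 75   # assumed round trips for DNS, handshakes, dependent resources
TRANSFER_S = 0.85  # assumed time spent actually transferring bytes

for label, latency in [("broadband, 2 ms", 0.002), ("3G, 400 ms", 0.400)]:
    total = ROUND_TRIPS * latency + TRANSFER_S
    print(f"{label}: ~{total:.0f} s page load")
# prints roughly 1 s for broadband and around 31 s for 3G
```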

These statistics indicate that jumping into the mobile market is not an option but a necessity for any business that plans to thrive in the digital age. You need more than just a fancy site, though. You need a fast fancy site. And the surefire way to guarantee your mobile site or app can scale and deliver great performance, regardless of the level of stress on the system, is to load test early and continuously throughout the development process.

Most developers use some kind of performance testing tools during the development process. However, mobile users are different than broadband users and therefore require a different set of testing tools to make sure they are represented realistically in the test environment. Mobile connections are less reliable; each geographic area has different speeds; latency is higher for mobile clients; and older phones won’t load newer website code. Therefore, you need real-world mobile network emulation and traffic simulation.

Prior to the availability of good cloud performance testing tools, most people thought the solution to performance problems was “more bandwidth” or “more server hardware”. But those days are long over. If you are to stay competitive today, you need to know how to optimize your mobile code. Good performance testing and traffic simulation take more than just bandwidth into account. Network delays, packet loss, jitter, device hardware and browser behavior are also factors that affect your mobile website’s or app’s performance. To properly test your app or site, you need to simulate all of these various situations – simultaneously and from different geographic locations (i.e. not only is traffic more mobile, it’s also more global).

You not only want to simulate thousands of calls to your system, you also want to simulate realistic traffic behavior. In reality, users don’t all access your site or app with the same browser, device and location. That’s why you need to simulate traffic from all over the globe, with several different browsers and devices, to identify real performance issues. For instance, it’s not unlikely to have a situation where an iPhone 5 on a 4G network will run your software fine, but drop down to 3G and the software fails. Only realistic network emulation covers this type of testing environment.

Finally, simulating real user scenarios is probably the most important testing requirement. Your platform’s user experience affects how many people will continue using your service and how many will pass on their positive experience to others. Real network emulation performs the same clicks and page views as real users. It will help find any hidden bugs that your testing team didn’t find earlier and will help you guarantee that the user experience delivered to the person sitting on a bus using a 3G network is the same as the individual accessing your service seated at their desktop connected through DSL.  

Several years ago, mobile traffic was negligible, but it’s now too prominent to ignore. Simply put, don’t deploy without testing your mobile code!

Check out Load Impact’s new mobile testing functionality. We can simulate traffic generated from a variety of mobile operating systems, popular browsers, and mobile networks – including 3G, GSM and LTE. Test your mobile code now!

What to Look for in Load Test Reporting: Six Tips for Getting the Data you Need

Looking at graphs and test reports can be a befuddling and daunting task – Where should I begin? What should I be looking out for? How is this data useful or meaningful? Hence, here are some tips to steer you in the right direction when it comes to load testing result management.

For example, the graph (above) shows how the load times (blue) increase [1] as the service reaches its maximum bandwidth (red) limit [2], and subsequently how the load time increases even more as bandwidth drops [3]. The latter phenomenon occurs due to 100% CPU usage on the app servers.

When analyzing a load test report, here are the types of data to look for:

  • What’s the user scenario design like? How much time should be allocated within the user scenario? Are the users geographically spread?

  • Test configuration settings: is it ramp-up only or are there different steps in the configuration?

  • While looking at the test results, do you get an exponentially growing (x²) curve? Or an initial downward trend that plateaus (a linear/straight line) before diving downwards drastically?

  • What does the bandwidth/requests-per-second curve look like?

  • For custom reporting and post-test management, can you export your test results to CSV format for further data extraction and analysis?

Depending on the layout of your user scenarios, how much time is spent within a particular user scenario across all actions (calculated from the total amount of sleep time), and how the users are geographically spread, you will likely end up looking at different metrics. However, below are some general tips to ensure you’re getting and interpreting the data you need.

Tip #1: In cases of very long user scenarios, it would be better to look at a single page or object rather than the “user load time” (i.e. the time it takes to load all pages within a user scenario excluding sleep times).

Tip #2: Even though “User Load Time” is a good indicator for identifying problems, it is better to dig in deeper by looking at individual pages or objects (URL) to get a more precise indication of where things have gone wrong. It may also be helpful to filter by geographic location as load times may vary depending on where the traffic is generated from.

Tip #3: If you have a test-configuration with a constant ramp-up and during that test the load time suddenly shoots through the roof, this is a likely sign that the system got overloaded a bit earlier than the results show. In order to gain a better understanding of how your system behaves under a certain amount of load, apply different steps in the test configuration to allow the system to calm down for approximately 15 minutes. By doing so, you will be able to obtain more and higher quality samples for your statistics.
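As an illustration of the stepped approach (the numbers here are purely illustrative, not a recommendation), a stepped test configuration might look like this:

```python
# Illustrative stepped load profile: hold each level steady so the system
# settles and you collect clean samples at a known, constant load.
STEPS = [
    (100, 15),  # (concurrent users, minutes to hold)
    (250, 15),
    (500, 15),
]

for users, minutes in STEPS:
    print(f"ramp to {users} users, hold for {minutes} minutes")
```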

Tip #4: If you notice load times are increasing and then suddenly starting to drop, then your service might be delivering errors with “200-OK” responses, which would indicate that something may have crashed in your system.

Tip #5: If you get an exponential (x²) curve, you might want to check on the bandwidth or requests-per-second. If it’s decreasing or not increasing as quickly as expected, this would indicate that there are issues on the server side (e.g. front end/app servers are overloaded). Or if it’s increasing to a certain point and then plateaus, you probably ran out of bandwidth.

Tip #6: To easily identify the limiting factor(s) in your system, you can add a Server Metrics Agent, which reports performance metrics from your servers. Furthermore, you can export or download the whole test data set, containing all the requests made during the tests as well as the aggregated data, and then import it into MySQL – or whichever database you prefer – for further querying.
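If you take the export route, here is a minimal sketch of the kind of post-test analysis tip #6 describes, done in Python rather than SQL. The file name and its "url"/"load_time_ms" columns are assumptions – adapt them to whatever your export actually contains.

```python
import csv
from collections import defaultdict

by_url = defaultdict(list)
with open("test_results.csv", newline="") as f:
    for row in csv.DictReader(f):
        by_url[row["url"]].append(float(row["load_time_ms"]))

# Rank pages/objects by their worst observed load time.
slowest = sorted(by_url.items(), key=lambda kv: max(kv[1]), reverse=True)
for url, times in slowest[:10]:
    print(f"{url}: avg {sum(times)/len(times):.0f} ms, "
          f"worst {max(times):.0f} ms, samples {len(times)}")
```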

In a nutshell, the ability to extrapolate information from load test reports allows you to understand and appreciate what is happening within your system. To reiterate, here are some key factors to bear in mind when analyzing load test results:

  • Check Bandwidth

  • Check load time for a single page rather than user load time

  • Check load times for static objects vs. dynamic objects

  • Check the failure rate

  • For Server Metrics – check CPU and Memory usage status

———

 

This article was written by Alex Bergvall, Performance Tester and Consultant at Load Impact. Alex is a professional tester with extensive experience in performance testing and load testing. His specialties include automated testing, technical function testing, functional testing, creating test cases, accessibility testing, benchmark testing, manual testing, etc.

Twitter: @AlexBergvall

Automated Acceptance Testing with Load Impact and TeamCity (New Plugin)


As you know, Continuous Integration (CI) is used by software engineers to merge multiple developers’ work several times a day. And load testing is how companies make sure that code performs well under normal or heavy use.

So, naturally, we thought it wise to develop a plugin for one of the most widely used CI servers out there – TeamCity by JetBrains. TeamCity is used by developers at a diverse set of industry leaders around the world – from Apple, Twitter and Intel, to Boeing, Volkswagen and Bank of America. It’s pretty awesome!

The new plugin gives TeamCity users access to multi-source load testing from up to 12 geographically distributed locations worldwide, advanced scripting, a Chrome Extension to easily create scenarios simulating multiple typical users, and Load Impact’s Server Metrics Agent (SMA) for correlating the server-side impact of testing – like CPU, memory, disk space and network usage.

Using our plugin for TeamCity makes it incredibly easy for companies to add regular, automated load tests to their nightly test suites and, as a result, get continuous feedback on how their evolving code base is performing. Any performance degradation or improvement is detected immediately when the code that causes it is checked in, which means developers always know if their recent changes were good or bad for performance – they’re guided towards writing code that performs well.

 

Here’s how Load Impact fits in the TeamCity CI workflow:

 


 

To get started, follow this guide to install and configure the Load Impact plugin for TeamCity.

Countdown of the Seven Most Memorable Website Crashes of 2013

Let this be a lesson to all of us in 2014. 

Just like every other year, 2013 had its fair share of website crashes. While there are many reasons why a website might fail, the most likely issue is the site’s inability to handle incoming traffic (i.e. load).

Let’s look at some of the most memorable website crashes of 2013 that were caused by traffic overload.

#7. My Bloody Valentine

On February 2nd, shoegaze legends My Bloody Valentine decided to release their first album since 1991 – and, in an obviously not-so-alternative move, they decided to do so online. They crashed within 30 minutes.

In the end, most of their fans likely got hold of the new album within a day or two, and the band, which clearly has a loyal fanbase, probably didn’t end up losing any sales due to the crash.

#6. Mercedes F1 Team 

The Mercedes F1 team came up with a fairly clever plan to promote their web content. In February, they told fans on Twitter that the faster they retweeted a certain message, the faster the team would reveal sneak preview images of their 2013 Formula One race car.

It worked a little too well. While waiting for the magic number of retweets to happen, F1 fans all over the world kept accessing the Mercedes F1 web page in hopes of being the first to see the new car. Naturally, they brought the website down.

“You guys are LITERALLY killing our website!” Mercedes F1 said via Twitter.

#5. NatWest / Royal Bank of Scotland

Mercedes F1 and My Bloody Valentine likely benefited from the PR created by their respective crashes, but there was certainly nothing positive to come out of the NatWest/RBS bank website crash – a crash which left customers without access to their money!

In December, NatWest/RBS saw their second website crash in a week when a DDoS attack took them down.

It’s not the first DDoS attack aimed at a bank, and it’s probably not the last one either.

#4. Sachin Tendulkar crash

One of India’s most popular cricketers, Sachin Tendulkar, also known as the “God of Cricket”, retired in 2013 with a bang! He did so by crashing local ticketing site kyazoonga.com.

When tickets for his farewell game at Wankhede in Mumbai became available, kyazoonga.com saw a record breaking 19.7 million hits in the first hour, after which the website was promptly brought down.

Fans were screaming in rage on Twitter, and the hashtag #KyaZoonga made it to the top of the Twitter trending list.

#3. UN Women – White Ribbon campaign 


It may be unfair to say that this website crash could have been avoided, but it’s definitely memorable.

On November 25th – the International Day for the Elimination of Violence against Women – Google wanted to acknowledge the occasion by linking to the UN Women website from the search giant’s own front page.

As a result, the website started to see far more traffic than it had been designed for and began to load slowly, even crashing entirely.

Google had given the webmasters at unwomen.org a heads up and the webmasters did take action to beef up their capacity, but it was just too difficult to estimate how much traffic they would actually get.

In the end, the do-no-evil web giant and unwomen.org worked together and managed through the day, partly by redirecting the link to other UN Websites.

Jaya Jiwatram, the web officer for UN Women, called it a win. And frankly, that’s all that really matters when it comes to raising awareness for important matters.

#2. The 13 victims of Super Bowl XLVII

Coca-Cola, Axe, SodaStream and Calvin Klein had their hands full during Super Bowl XLVII – not so much serving online visitors as running around looking for quick fixes for their crashed websites.

As reported by Yottaa.com, no fewer than 13 of the companies that ran ads during the Super Bowl saw their websites crash just as they needed them the most.

If anything in this world is ever going to be predictable, a large spike of traffic when you show your ad to a Super Bowl audience must be one of those things.

#1. healthcare.gov

The winner of this countdown shouldn’t come as a surprise to anyone. Healthcare.gov came crashing down before it was even launched.

It did recover quite nicely in the last weeks of 2013 and is now actually serving customers. If not exactly as intended, at least well enough for a total of 2 million Americans to enroll.

But without hesitation, the technical and political debacle surrounding healthcare.gov makes it the most talked about and memorable website crash in 2013.

Our friends over at PointClick did a great summary of the Healthcare.gov crash. Download their ebook for the full recap: The Six Critical Mistakes Made with Healthcare.gov

There’s really nothing new or surprising about the website crashes of 2013. Websites have been developed this way for years – often with the same results. But there are now new methodologies and tools changing all that.

It isn’t like it used to be; performance testing isn’t hard, time-consuming or expensive anymore. One just needs to recognize that load testing is something that needs to be done early and continuously throughout the development process. It’s not optional anymore. Unfortunately, it seems these sites found that out the hard way. A few of them will likely learn the lesson again in 2014.

Our prediction for 2014 is more of the same. However, mainstream adoption of development methodologies such as Continuous Integration and Delivery, which advocate early and continuous performance testing, is quickly gaining speed.

A Google search trend report for the term “DevOps” clearly shows this. If the search trends are any indication of the importance being given to proactive performance testing by major brands, app makers and SaaS companies, we might see only half as many Super Bowl advertiser site crashes in 2014 as we did last year.

DevOps Trend

Update following Super Bowl XLVIII: According to GeekBeat, the Maserati website crashed after their ad featured the new Maserati Ghibli. And monitoring firm OMREX found that two of the advertiser websites had uptime performance issues during the game – Coca-Cola and Dannon Oikos.

How did the Obama Administration blow $400M making a website?

By doing software development and testing the way it’s always been done.

There is nothing new in the failure of the Obamacare site. Silicon Valley has been doing it that way for years. However new methodologies and tools are changing all that.

There has been a huge amount of press over the past several weeks about the epic failure of the Obamacare website. The magnitude of this failure is nearly as vast as the righteous indignation laid at the feet of the administration about how this could have been avoided if only they had done this or that. The subtext is that this was some sort of huge deviation from the norm. The fact is, nothing could be further from the truth. In fact, there should be a sense of déjà-vu-all-over-again around this.

The record of large public sector websites is one long case study in epic IT train wrecks.

In 2012 the London Olympic ticket website crashed repeatedly, and just this year the California Franchise Tax Board’s new online tax payment system went down and stayed down – for all of April 15th.

So, this is nothing new.

As the Monday morning quarterbacking continues in the media, one of my favorite items was a CNN segment declaring that had this project been done in the lean, mean tech mecca that is Silicon Valley, it all would have turned out differently because of the efficiency that we who work here are famous for. As someone who has been making online software platforms in the Bay Area for the past decade, I found that an interesting argument, and one worth considering and examining.

Local civic pride in my community and industry generates a sort of knee-jerk reaction: of course we would do it better/faster/cheaper here. However, if you take a step back and really look honestly at how online Software as a Service (SaaS) has been done here over most of the past 20 or so years that people have been making websites, you reach a different conclusion. Namely, it’s hard to fault the Obama Administration. They built a website in a way that is completely in accordance with the established ways that people have built and tested online software platforms for most of the past decade in The Valley.

The only problem is it doesn’t work.  Never has.

The problem then isn’t that they did anything out of the ordinary. On the contrary: they walked a well-worn path right off a cliff very familiar to the people I work with. However, new methodologies and tools are changing that. So the fault is that they didn’t see the new path and take that instead.

I’d like to point out from the start that I’ve got no special knowledge about the specifics of HealthCare.gov. I didn’t work on this project.  All of what I know is what I’ve read in the newspapers. So starting with that premise I took a dive into a recent New York Times article with the goal of comparing how companies in The Valley have faced similar challenges, and how that would be dealt with using the path not taken, of modern flexible — Agile in industry parlance — software development.

Fact Set:

  • $400 million
  • 55 contractors
  • 500 million lines of code 

$400 million — Let’s consider what that much money might buy you in Silicon Valley. By December of 2007, Facebook had taken in just under $300 million in investment and had over 50 million registered users — around the upper end of the number of users that the HealthCare.gov site would be expected to handle. That’s big. Comparisons between the complexity of a social media site and a site designed to compare and buy health insurance are imperfect at best; Facebook is a going concern and arguably a much more complex bit of technology. But it gives you the sense that spending that much to create a very large scale networking site may not be that extravagant. Similarly, Twitter had raised approximately $400 million by 2010 to handle a similar number of users. On the other hand, eBay, a much bigger marketplace than HealthCare.gov will ever be, only ever asked investors for $7 million in funding before it went public in 1998.

55 contractors — If you assume that each contractor has 1,000 technical people on the project, you are talking about a combined development organization about the size of Google (54,000 employees according to their 2013 Q3 statement) working on HealthCare.gov. To paraphrase the late Sen. Lloyd Bentsen: ‘I know Google, Google is a friend of mine, and let me tell you… you are no Google.’

500 million lines of code – That is a number of astronomical proportions. It’s like trying to imagine how many matches laid end to end would reach the moon (that number is closer to 15 billion, but 500 million matchsticks will take you around the earth once). Of all the numbers in here, that is the one that is truly mind-boggling. So much to do something relatively simple. As one source in the article points out, “A large bank’s computer system is typically about one-fifth that size.” Apple’s latest version of the OS X operating system has approximately 80 million lines of code. Looking at it another way, that is a pretty good code-to-dollar ratio. The investors in Facebook probably didn’t get 500 million lines of code for their $400 million. Though, one suspects, they might have been pretty appalled if they had.

So if the numbers are hard to mesh with Silicon Valley, what about the process — the way in which they went about doing this, and the resulting outcome?  Was the experience of those developing this similar, with similar outcomes, to what might have taken place in Silicon Valley over the past decade or so? And, how does the new path compare with this traditional approach?

The platform was “70 percent of the way toward operating properly.”

Then – In old school Silicon Valley there was, among a slew of companies, the sense that you should release early, test the market, and let the customers find the bugs.

Now – It’s still the case that companies are encouraged to release early; indeed, if your product is perfect, you waited too long to release. The difference is that the last part — let the customers find the bugs — is simply not acceptable, except for the very earliest beta-test software. The mantra with modern developers is: fail early and fail often. Early means while the code is still in the hands of developers, as opposed to the customers. And often means testing repeatedly — ideally using automated testing, as opposed to manual tests, which were done reluctantly, if at all.

“Officials modified hardware and software requirements for the exchange seven times… As late as the last week of September, officials were still changing features of the Web site.” 

Then — Nothing new here. Once upon a time there was a thing called the Waterfall Development Method. Imagine a waterfall with different levels, each pouring over into the next; each level of this cascade represented a different set of requirements, each dependent on the level above it, and the end of the process was a torrent of code and software that would rush out to the customer in all its complex, feature-rich glory, called The Release. The problem was that all these features and all this complexity took time — often many months for a major release, if not longer. And over time the requirements changed. Typically the VP of Sales or Business Development would stand up in a meeting and declare that without some new feature that was not on the Product Requirement Document, some million-dollar deal would be lost. The developers, not wanting to be seen as standing in the way of progress, or being ordered to get out of the way of progress, would dutifully add the feature or change a requirement, thereby making an already long development process even longer.

Now — The flood of code that was Waterfall has been replaced by something called Agile, which, as the name implies, allows developers to be flexible and to expect that the VP of Sales will rush in and say, “Stop the presses! Change the headline!” The Release is now broken down into discrete and manageable chunks of code, delivered in stages on a regular weekly, if not daily, schedule. Software delivery is now designed to accommodate the frequent and inherently unpredictable demands of markets and customers. More importantly, a problem with software can be limited in scope to a relatively small bit of code, where it can be quickly found and fixed.

“It went live on Oct. 1 before the government and contractors had fully tested the complete system. Delays by the government in issuing specifications for the system reduced the time available for testing.”

Then — Testing was handled by the Quality Assurance (QA) team. These were often unfairly seen as the least talented of developers, viewed much like the Internal Affairs cops in a police drama: on your team in name only, and out to get you. The QA team’s job was to find mistakes in the code, point them out publicly, and make sure they got fixed. Not surprisingly, many developers saw little value in this. As I heard one typically humble developer say, “Why do you need to test my code? It’s correct.” The result of this mindset was that as the number of features increased, and time to release remained unchanged, testing got cut. Quality was seen as somebody else’s problem. Developers got paid to write code and push features.

Now — Testing for quality is everybody’s job. Silos of development, operations and QA are being combined into integrated DevOps organizations in which software is continuously delivered and new features and fixes are continuously integrated into live websites. The key to this process — known by the refreshingly straightforward name of Continuous Delivery — is automated testing that frees highly skilled staff from the rote mechanics of testing and allows them to focus on making a better product, all the while ensuring the product is tested early, often and continuously. A Continuous Delivery product named Jenkins is currently one of the most popular and fastest growing open source software packages.

“The response was huge. Insurance companies report much higher traffic on their Web sites and many more callers to their phone lines than predicted.”

Then — The term in The Valley was being a victim of your own success. This was shorthand for not anticipating rapid growth or positive response, and not testing the software to ensure it had the capacity and performance to handle the projected load and stress that a high volume of users places on software and the underlying systems. The reason for this was most often not ignorance or apathy, but that the software available at the time was expensive and complicated, and the hardware needed to do these performance tests was similarly expensive and hard to spare. Servers dedicated solely to testing were a luxury that was hard to justify, and they were often appropriated for other needs.

Now — Testing software is now often cloud-based, running on leased hardware, which means that anybody with a modicum of technical skill and a modest amount of money can access tools that would have been out of reach of all but the largest, most sophisticated software engineering and testing teams with extravagant budgets. Not only is there now no excuse for not doing it; not doing it is in fact inexcusable. Software is no longer sold as licensed code that comes on a CD. It is now a service that is available on demand — there when you need it. Elastic: as much as you need, and only what you need. And with a low entry barrier: you shouldn’t have to battle your way through a bunch of paperwork and salespeople to get what you need. As one Chief Technical Officer at a well-known Bay Area start-up told me, “If I proposed to our CEO that I spend $50,000 on any software, he’d shoot me in the head.” Software is now bought as a service.

It’s far from clear at this point in the saga what, how and how much it will take to fix the HealthCare.gov site. What is clear is that while the failure should come as no surprise given the history of government, and of software development in general, that doesn’t mean the status quo need prevail forever. It’s a fitting corollary to the ineffective processes and systems in the medical industry that HealthCare.gov itself is trying to fix. If an entrenched industry like software development in Silicon Valley can change the way it does business and produce its services faster, better and at a lower cost, then maybe there is hope for the US health care industry doing the same.

By: Charles Stewart (@Stewart_Chas)

Performance Testing Versus Performance Tuning

Performance testing is often mistaken for performance tuning. The two are related, but they are certainly not the same thing. To see what these differences are, let’s look at a quick analogy.

Most governments mandate that you bring your vehicle to the workshop for an inspection once a year. This is to ensure that your car meets the minimum safety standards for road use. A website performance test can be likened to a yearly inspection – it ensures that your website isn’t performing terribly and should perform reasonably well under most circumstances.

When the inspection shows that the vehicle isn’t performing up to par, we run through a small series of checks to see how to get the problem solved in order to pass the inspection. This is similar to performance tuning, where we shift our focus to discovering what is necessary to make the application perform acceptably.

Looking in depth at the performance test results helps you narrow down the problematic spots so you can identify your bottlenecks more quickly. This in turn helps you make optimization adjustments cost- and time-efficient.

Then we have the car enthusiasts. This group constantly works toward tuning their vehicles for great performance. Their vehicles have met the minimum performance criteria, but their goal now is probably to make the car more energy-efficient, or perhaps to run faster. Performance tuning goals are simply that – you might be aiming to reduce the amount of resources consumed to decrease the volume of hardware needed, and/or to get your website to load resources quicker.

Next, we will talk about the importance of establishing a baseline when doing performance tuning.

Tuning your website for consistent performance

Now that we know the difference between performance testing and performance tuning, let’s talk about why you will need a controlled environment and an established baseline prior to tuning web applications.

The importance of a baseline: Tuning your web application is an iterative process. There might be several factors contributing to poor website performance, and it is recommended to make optimization adjustments in small steps in a controlled test environment. Baselines help determine whether an adjustment to your build or version improves or degrades performance. If the conditions of your environment are constantly changing, or too many large changes are made at once, it will be difficult to see where the impact of your optimization efforts comes from.

To establish a baseline, try tracking specific criteria such as page load times, bandwidth, requests per second, and memory and CPU usage. Load Impact’s server metrics help combine all these areas in a single graph from the time you run your first performance test. Take note of how these metrics improve or degrade when you make optimization changes (e.g. hardware upgrades).
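As a minimal sketch of that bookkeeping, assume the baseline lives in a JSON file with one entry per metric; the 10% tolerance and the metric names are illustrative.

```python
import json

TOLERANCE = 0.10  # flag anything more than 10% worse than baseline

def compare_to_baseline(baseline_path: str, current: dict) -> None:
    with open(baseline_path) as f:
        baseline = json.load(f)  # e.g. {"page_load_ms": 850, "req_per_s": 120}
    for metric, base in baseline.items():
        now = current[metric]
        # Lower is better for *_ms metrics; higher is better for throughput.
        worse = (now - base) / base if metric.endswith("_ms") else (base - now) / base
        status = "REGRESSED" if worse > TOLERANCE else "ok"
        print(f"{metric}: baseline={base} current={now} [{status}]")

compare_to_baseline("baseline.json", {"page_load_ms": 990, "req_per_s": 118})
```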

Remember that baselines can evolve over time, and might need to be redefined if changes to the system have been made since the baseline was initially recorded. If your web application is constantly undergoing changes and development work, you might want to consider doing small but frequent tests prior to, for instance, a new fix being integrated or a new version launch.

As your product development lifecycle changes, so will your baseline. Hence, doing consistent testing prior to a release helps save plenty of time and money by catching performance degradation issues early.

There is an increasing number of companies adopting a practice known as Continuous Integration. This practice helps to identify integration difficulties and errors through a series of automated checks, to ensure that code deployment is as smooth and rapid as possible.

If this is something that your company already practices, then integrating performance tuning into your product delivery pipeline might be as simple as using Load Impact’s Continuous Delivery Jenkins plugin. A plugin like this allows you to quickly integrate Jenkins with our API to allow for automated testing with a few simple clicks.

By Chris Chong (@chrishweehwee)

About Load Impact

Load Impact is the leading cloud-based load testing software trusted by over 123,000 website, mobile app and API developers worldwide.

Companies like JWT, NASDAQ, The European Space Agency and ServiceNow have used Load Impact to detect, predict, and analyze performance problems.
 
Load Impact requires no download or installation, is completely free to try, and users can start a test with just one click.
 
Test your website, app or API at loadimpact.com
