Big data

Big data – what is that?

You might have heard the term, but its actual meaning depends a bit on who you are. Big data is essentially a name for data sets that are much larger than they used to be. The term describes how modern applications often generate enormous amounts of data compared with their fairly recent predecessors. Whether the source is the Large Hadron Collider, a weather satellite or a popular social networking site, they all tend to generate huge amounts of data, and that data needs to be stored somewhere.

The rapid development of hard drive technology and network communications has enabled data sets to grow very rapidly as well. Transferring the bits and bytes, and storing them on some kind of physical media, is therefore not a big problem. What is a bit of a problem, on the other hand, is the software used to store and retrieve the data, such as SQL database applications. These were excellent when all you wanted to do was store the names and addresses of your company’s 1,000 customers in an organized manner, so that you could then search for all customers located in a certain city, and so on. Today, however, companies like Facebook have to store huge amounts of data, and they want fast access to it: they need to quickly dig up the 14 pictures that are yours from a collection of over 40 billion uploaded pictures. This is not what the old SQL databases were designed for and, in general, many of the older software tools and applications for handling data are simply not up to working with these new, huge data sets.
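The kind of lookup that traditional SQL databases handle well, finding all customers located in a certain city, can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the table layout, the customer rows and the function name customers_in_city are made up for the example.

```python
import sqlite3


def customers_in_city(conn, city):
    """Return (name, address) rows for all customers located in `city`."""
    cur = conn.execute(
        "SELECT name, address FROM customers WHERE city = ? ORDER BY name",
        (city,),
    )
    return cur.fetchall()


# Hypothetical in-memory database with a few made-up customers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, address TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Alice", "1 Main Street", "Stockholm"),
        ("Bob", "2 Harbour Road", "Gothenburg"),
        ("Carol", "3 Park Lane", "Stockholm"),
    ],
)
conn.commit()

print(customers_in_city(conn, "Stockholm"))
```

With a thousand rows this kind of query is trivial. The difficulty described above only appears when the same pattern has to serve billions of rows with fast response times.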

Enter technologies such as NoSQL databases: Cassandra, Voldemort, CouchDB, MongoDB, Neo4j, Dynamo, Redis, Memcached and others. There is now a wide range of systems that can store large amounts of data in an efficient manner. You always get some kind of trade-off, of course, and the end result is always a more complex design of your application, but it allows you to scale the size of your data sets to previously unimaginable levels. The development of these systems is progressing very rapidly, and data sets are growing at the same furious pace as a result. Load Impact uses Apache Cassandra to store load test results, which gives us a flexible way to scale our system as the number of users and the amount of test data we store increase. Currently, our test result database is growing at a rate of several gigabytes per day.

Cloud computing is another enabler of big data. Previously, it was difficult to scale your infrastructure to handle large data sets without making huge upfront investments. Today you can rent the infrastructure as and when you need it. An application that only occasionally has to perform a complex calculation on a large set of data, but has to do it quickly, would previously have been too costly to run because of the infrastructure cost. Today you can rent a thousand Amazon EC2 servers for an hour and pay only a few tens of dollars to do so.
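The arithmetic behind renting on demand is simple enough to sketch. The hourly price below is an assumption chosen so that the result matches the "few tens of dollars" figure above; it is not an actual Amazon price, and the function name rental_cost_cents is hypothetical. Amounts are kept in whole cents to avoid rounding surprises.

```python
def rental_cost_cents(servers, hours, cents_per_server_hour):
    """Total rental cost in cents: servers x hours x hourly price."""
    return servers * hours * cents_per_server_hour


# Assumed price per server-hour, in cents (hypothetical, for illustration only).
ASSUMED_CENTS_PER_SERVER_HOUR = 2

# A thousand servers rented for a single hour.
one_hour = rental_cost_cents(1000, 1, ASSUMED_CENTS_PER_SERVER_HOUR)

# The same thousand servers kept running for 30 days, for comparison.
thirty_days = rental_cost_cents(1000, 24 * 30, ASSUMED_CENTS_PER_SERVER_HOUR)

print(f"One hour: ${one_hour / 100:.2f}")
print(f"30 days:  ${thirty_days / 100:.2f}")
```

Under this assumed price, the occasional one-hour job costs a tiny fraction of keeping the same capacity available around the clock, which is the point of renting only when you need it.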

For us, big data is a positive development as it increases the demand for large-scale load testing. Online services are getting larger and more resource-intensive, and there is money to be saved by optimizing your solution to use as few resources as possible. Even more important, in many cases, is optimizing for speed so that the people (or machines!) using your online application, site or service will not choose a competitor over you. Speed is becoming a critical competitive advantage, and load testing is an important test method for those who want to ensure that their site or application is fast under all circumstances. Traditional load testing solutions are often unable to scale up to the load levels required to properly stress a large site or application, so we see a clear trend of people becoming more and more interested in cloud-based, online load testing.

If you want to know more, a good start is the Wikipedia article on big data:

http://en.wikipedia.org/wiki/Big_data

About Load Impact

Load Impact is the leading cloud-based load testing software trusted by over 123,000 website, mobile app and API developers worldwide.

Companies like JWT, NASDAQ, The European Space Agency and ServiceNow have used Load Impact to detect, predict, and analyze performance problems.
 
Load Impact requires no download or installation, is completely free to try, and users can start a test with just one click.
 
Test your website, app or API at loadimpact.com
