

Why do so many websites crash on Black Friday?

[Image: Tesco website crash]

Yes, retailers will say that they have invested in extra servers, or even cloud capacity, to deal with the extra demand that Black Friday will bring. They will proudly say they've invested in new software stacks, caching, and various other technologies to handle capacity. And they are correct, they have invested; but crucially, they haven't invested wisely.

[Image: Very website crash]

Modern e-commerce websites are increasingly complex beasts. Typically they have firewalls for security, load-balancers to spread demand, web-servers to handle online traffic, application servers to run the software and finally a large database server. Often there are other specialist systems such as search engines and stock management systems (e.g. Hybris). There is also connectivity to other systems or web services such as credit checking (e.g. Experian), postcode lookups, historical orders and, most importantly, payment systems (e.g. Worldpay).

All of these components together make for a complex system. This complexity means it is difficult to accurately predict if a website can handle Black Friday peaks.

Furthermore, a lack of insight into the performance limitations of complex systems leads to bad investment decisions. The usual response is to try to mitigate risk by throwing more servers at a complex system which simply cannot be scaled up in this way. This wastes investment in hardware and mitigates very few performance risks. The result is a huge investment in a system that still cannot cope.
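To see why extra servers can fail to help, consider a minimal sketch in Python. It assumes a hypothetical two-tier system in which the web tier scales linearly with the number of servers, but every request also passes through one shared database with a fixed ceiling. The function name and all capacity figures are illustrative assumptions, not measurements from a real site.

```python
def max_throughput(web_servers: int,
                   per_web_server_rps: float,
                   database_rps: float) -> float:
    """Requests per second the whole system can sustain.

    Every request needs both tiers, so the system is limited by
    whichever tier saturates first.
    """
    web_tier_rps = web_servers * per_web_server_rps
    return min(web_tier_rps, database_rps)


# Illustrative figures only (assumed, not measured).
PER_WEB_SERVER_RPS = 200.0
DATABASE_RPS = 1500.0

for servers in (4, 8, 16, 32):
    rps = max_throughput(servers, PER_WEB_SERVER_RPS, DATABASE_RPS)
    print(f"{servers:>2} web servers -> {rps:.0f} requests/s")
```

Under these assumptions, going from 8 to 32 web servers buys no extra throughput, because the database was already the limit at 8.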

In our experience nearly all bottlenecks are in the software, rather than the hardware. Without information about the internal performance characteristics of each component within a complex system, and how the components work together, there cannot be an understanding of how the holistic system operates at scale. Gaining that understanding requires much greater insight.

[Image: tweet about the Game website crash]

Large volumes of data need to be analysed to find distinct patterns which indicate software bottlenecks. From this a holistic picture can be created of the overall system performance under different scenarios. Uncovering these bottlenecks is the challenge.
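One simple pattern of this kind is a throughput plateau: more load is offered, but the system delivers almost no extra throughput while response times climb. The Python sketch below looks for that pattern in the results of a stepped load test. The `LoadStep` fields, the 5% threshold, and the sample figures are all assumptions made for illustration; real analysis would use many more metrics per component.

```python
from typing import NamedTuple, Optional


class LoadStep(NamedTuple):
    offered_rps: float      # load the test tool tried to apply
    achieved_rps: float     # throughput the system actually delivered
    p95_latency_ms: float   # 95th-percentile response time at this step


def find_saturation_step(steps: list[LoadStep],
                         min_gain: float = 0.05) -> Optional[LoadStep]:
    """Return the first step where throughput stalls while latency rises.

    A step counts as saturated when achieved throughput grew by less than
    `min_gain` (as a fraction) over the previous step, even though more
    load was offered and p95 latency increased.
    """
    for prev, cur in zip(steps, steps[1:]):
        if prev.achieved_rps <= 0:
            continue  # cannot compute a relative gain from zero
        gain = (cur.achieved_rps - prev.achieved_rps) / prev.achieved_rps
        if (cur.offered_rps > prev.offered_rps
                and gain < min_gain
                and cur.p95_latency_ms > prev.p95_latency_ms):
            return cur
    return None


# Made-up results from a stepped load test.
results = [
    LoadStep(100, 100, 120),
    LoadStep(200, 198, 130),
    LoadStep(400, 390, 160),
    LoadStep(800, 400, 900),
    LoadStep(1600, 402, 4200),
]

saturated = find_saturation_step(results)
print(saturated)
```

With this sample data the function flags the step at 800 offered requests per second, where delivered throughput barely moves beyond the previous step and latency jumps. Finding which component causes the plateau still requires the per-component metrics.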

If the answer isn’t just ‘more hardware’, what is it?

At Capacitas we recommend a three-pronged iterative approach:

  • Measure key metrics in each component system
  • Load test the system to highlight specific risk areas
  • Model the holistic system so different scenarios can be evaluated
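The first of these steps, measuring each component, can be sketched in Python as follows. This is a minimal illustration using only the standard library; the `timed` helper, the component names, and the `time.sleep` calls standing in for real work are all hypothetical. A production system would more likely rely on an existing monitoring or tracing tool.

```python
import statistics
import time
from collections import defaultdict
from contextlib import contextmanager

# component name -> list of observed durations in milliseconds
timings: dict[str, list[float]] = defaultdict(list)


@contextmanager
def timed(component: str):
    """Record how long the wrapped block takes, filed under `component`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        timings[component].append(elapsed_ms)


def summarise() -> dict[str, dict[str, float]]:
    """Return call count, median and worst-case duration per component."""
    return {
        name: {
            "calls": len(samples),
            "median_ms": statistics.median(samples),
            "max_ms": max(samples),
        }
        for name, samples in timings.items()
    }


# Stand-ins for real calls; the sleeps are placeholders for actual work.
for _ in range(5):
    with timed("database"):
        time.sleep(0.002)
    with timed("payment_gateway"):
        time.sleep(0.005)

for name, stats in summarise().items():
    print(name, stats)
```

Collecting timings per component, rather than one end-to-end figure, is what makes it possible to say which part of the system is the risk area when a load test slows down.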

By repeatedly load testing the live system, and analysing each set of results, improvements can be identified and implemented. This new knowledge of the system and its weaknesses allows investment to be made wisely.

This isn’t easy and it takes several months. But it is necessary if you want to avoid a website crash on Black Friday. 

