## Executive Summary
On December 23rd, 2017, between 12:02 and 12:06 PM UTC, and on January 6th, 2018, between 08:32 and 08:52 AM UTC, the Accedo One API experienced elevated response times and in some cases returned errors. The root cause was sustained high traffic (a 150% increase over baseline traffic levels) combined with an additional burst on top of the already high levels (approximately an 800% increase over baseline traffic levels). The system is designed to sustain both heavy load and burst traffic, but this combination pushed the caching layer to a limit on its network interface, which transmits and receives all data going in and out of the cache. This in turn increased the load on the database layer, so database-intensive requests experienced longer response times and in some cases returned errors due to timeouts.
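To put the percentages above in perspective, the short sketch below converts a "percent increase over baseline" into a traffic multiplier. It assumes both figures are measured against the same normal baseline, as the summary states; the function and variable names are illustrative only.

```python
def multiplier_from_increase(percent_increase: float) -> float:
    """Convert a percent increase over baseline into a multiple of baseline."""
    return 1 + percent_increase / 100


# Figures from the summary, both relative to normal baseline traffic.
sustained = multiplier_from_increase(150)  # sustained high load
burst = multiplier_from_increase(800)      # approximate burst peak

# How much larger the burst was than the already elevated sustained load.
burst_over_sustained = burst / sustained

print(f"sustained load: {sustained:.1f}x baseline")
print(f"burst load:     {burst:.1f}x baseline")
print(f"burst relative to sustained load: {burst_over_sustained:.1f}x")
```

Under that reading, the sustained load was about 2.5 times normal traffic and the burst about 9 times normal traffic, so the burst was roughly 3.6 times the already elevated level.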
## Preventive Steps
We have identified, and are working on, a number of improvements that will allow us to serve traffic as usual even in extreme scenarios like these. Several of these measures will be rolled out in the next few days, including improved cache cluster sharding and enhanced connection management in the API under adverse conditions such as elevated response times. We are also working on protecting the shared service through multi-region availability, where the load is shared across several clusters in different regions. We will communicate more about this later this year.
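As a general illustration of why sharding relieves a network-bound cache, the sketch below spreads keys across several cache nodes with a stable hash, so each node carries only a fraction of the total traffic. This is a minimal sketch and not a description of the actual Accedo One implementation: the shard names, key format, and hashing scheme are all assumptions made for the example.

```python
import hashlib
from collections import Counter


def shard_for_key(key: str, shards: list[str]) -> str:
    """Pick a cache shard for a key using a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then map it to a shard.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical shard names and cache keys, for illustration only.
SHARDS = ["cache-a", "cache-b", "cache-c", "cache-d"]
keys = [f"config:{i}" for i in range(10_000)]

# Count how many keys land on each shard.
load = Counter(shard_for_key(k, SHARDS) for k in keys)

for shard in SHARDS:
    print(f"{shard}: {load[shard]} keys")
```

With four shards, each node receives roughly a quarter of the keys, and therefore roughly a quarter of the network traffic that a single node would otherwise carry. Simple modulo sharding like this remaps most keys when the number of shards changes; consistent hashing is a common alternative when shards are added or removed often.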
We apologize for these service degradations and will continue to work tirelessly to keep service scalability, stability, and durability at the top of our priorities throughout 2018.