Perfecting Autoscaling & Sizing: Part 2

Goal: 100% 200 OK

How to scope...

Load Balancing Density

* What is it? Spreading requests to the application horizontally across many micro instances, versus across fewer, larger instances.

* Why is it useful? It is more cost-effective to run the calculation on many horizontally scaled micro instances than on fewer, vertically scaled, larger instances, so always aim for horizontal.

Hence, the segue into...

Warming

* Warming the application cluster via autoscaling ahead of the computation.

In my case, I use a Lambda function triggered by an SNS topic alarm. An SQS queue is subscribed to the topic and picks up the JSON scaling message, which the Lambda uses to set the Elastic Beanstalk minimum and maximum Autoscaling values ahead of the computation.

* Scaling back down after the calc, and relying on CDN caching to take the burden off the origin.

* Least amount of scaling fluctuation during the calculation.

* Allows for better control of the minimum compute power required for the calc.

* Reduces errors caused by "time to scale".
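
The Lambda warming step described above could look roughly like the sketch below. This is not my exact function: the message fields ("environment", "min", "max") and the function names are made up for illustration, and it assumes the event arrives either straight from SNS or as an SQS record wrapping the SNS envelope. The Elastic Beanstalk call is boto3's update_environment with the aws:autoscaling:asg MinSize/MaxSize options.

```python
import json


def parse_scale_message(event):
    """Return (environment, min_size, max_size) from the first event record.

    Assumed (hypothetical) message body:
    {"environment": "calc-env", "min": 25, "max": 27}
    """
    record = event["Records"][0]
    if "Sns" in record:
        # Lambda invoked directly by the SNS topic.
        message = json.loads(record["Sns"]["Message"])
    else:
        # Lambda invoked from SQS: the body is the SNS envelope,
        # or the bare message if raw message delivery is enabled.
        body = json.loads(record["body"])
        message = json.loads(body["Message"]) if "Message" in body else body

    min_size, max_size = int(message["min"]), int(message["max"])
    if not 0 < min_size <= max_size:
        raise ValueError(f"bad scaling range: min={min_size}, max={max_size}")
    return message["environment"], min_size, max_size


def build_option_settings(min_size, max_size):
    """Build the Elastic Beanstalk option settings for the Auto Scaling group."""
    return [
        {"Namespace": "aws:autoscaling:asg", "OptionName": "MinSize",
         "Value": str(min_size)},
        {"Namespace": "aws:autoscaling:asg", "OptionName": "MaxSize",
         "Value": str(max_size)},
    ]


def handler(event, context):
    """Lambda entry point: warm the cluster ahead of the calc."""
    import boto3  # imported here so the helpers above can be tested without AWS

    environment, min_size, max_size = parse_scale_message(event)
    boto3.client("elasticbeanstalk").update_environment(
        EnvironmentName=environment,
        OptionSettings=build_option_settings(min_size, max_size),
    )
    return {"environment": environment, "min": min_size, "max": max_size}
```

Sending the same kind of message again after the calc, with the smaller values, takes care of scaling back down. Keeping min and max close together is what keeps the fluctuation low while the calc runs.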

Ex: say you normally run on 3 micro instances and handle 30 requests per minute. In the next minute, when the calc starts, requests per minute jump to 250, and it takes Autoscaling 5 minutes to initialize and bootstrap a web application server (aka "warming"). That is 5 minutes where those poor 3 micro instances are taking 8+ times the load they can normally handle, resulting in 5xx errors (aka service not available) because the application server is choking on the sudden flood...
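
To put rough numbers on that example, here is a back-of-the-envelope sketch. It assumes the 30 requests per minute is the real capacity of the 3 instances and that capacity scales linearly with instance count, which is a simplification; the function name is just for illustration.

```python
import math


def warmup_gap(instances, baseline_rpm, peak_rpm, warmup_minutes):
    """Estimate the shortfall while Autoscaling is still warming new instances.

    Assumes baseline_rpm is what `instances` can comfortably serve and that
    capacity grows linearly with the number of instances.
    """
    per_instance_rpm = baseline_rpm / instances
    load_factor = peak_rpm / baseline_rpm
    instances_needed = math.ceil(peak_rpm / per_instance_rpm)
    # Requests arriving above capacity during the warm-up window.
    excess_requests = max(0, peak_rpm - baseline_rpm) * warmup_minutes
    return load_factor, instances_needed, excess_requests


factor, needed, excess = warmup_gap(
    instances=3, baseline_rpm=30, peak_rpm=250, warmup_minutes=5
)
print(f"{factor:.1f}x load, {needed} instances needed, "
      f"{excess} requests over capacity during warm-up")
```

Under those assumptions the 3 instances would need to be about 25, and roughly 1,100 requests land on a cluster that cannot serve them during the 5-minute warm-up. That is the gap pre-warming is meant to close.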

Reducing the Autoscaling range results in less scaling fluctuation time.

TBC... work in progress...