AWS Auto Scaling Terms

Load Balancing

Response Timeout:

  • The amount of time the load balancer waits for the instance to respond to a health check, for example 5 seconds
  • If the load balancer fails to connect with the instance at the specified port within the configured response timeout period, the instance is considered unhealthy.
  • Make sure this value is not too low. When the server is under heavy load, it may take longer to respond. A low response timeout can cause the instance to be marked unhealthy and terminated unexpectedly.

Health Check Interval:

  • Amount of time between health checks
  • Should be greater than Response Timeout

    Healthy Threshold:

    • Make sure this value is not too high
    • When an instance is registered with Elastic Load Balancing, it is not considered healthy until it has passed the number of successful health checks set by the healthy threshold
    • If you set a long interval for health checks and/or a high healthy threshold, it will take more time for instances to start receiving traffic from Elastic Load Balancing

    Unhealthy Threshold:

    • Make sure this value is not too low. Otherwise, the instance may be marked unhealthy and stopped unexpectedly after only a few failed health checks.

    Response Timeout: 14 seconds
    Health Check Interval: 15 seconds
    Unhealthy Threshold: 10
    Healthy Threshold: 2
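
As a rough check of the values above, the following Python sketch validates that the interval is greater than the response timeout and estimates how long the load balancer needs to change an instance's state. The class and method names are hypothetical. The estimate assumes one health check per interval and that every check in the run has the same result, so the numbers are approximations rather than guarantees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthCheckSettings:
    """Hypothetical container for the four load balancer settings above."""
    response_timeout: int     # seconds to wait for a response
    interval: int             # seconds between health checks
    unhealthy_threshold: int  # failed checks before "unhealthy"
    healthy_threshold: int    # successful checks before "healthy"

    def validate(self) -> None:
        # The interval should be greater than the response timeout.
        if self.interval <= self.response_timeout:
            raise ValueError("interval must be greater than response timeout")

    def seconds_to_unhealthy(self) -> int:
        # Assumption: one failed check per interval.
        return self.interval * self.unhealthy_threshold

    def seconds_to_healthy(self) -> int:
        # Assumption: one successful check per interval.
        return self.interval * self.healthy_threshold


settings = HealthCheckSettings(
    response_timeout=14,
    interval=15,
    unhealthy_threshold=10,
    healthy_threshold=2,
)
settings.validate()
print(settings.seconds_to_unhealthy())  # 150
print(settings.seconds_to_healthy())    # 30
```

With these settings a failing instance keeps receiving traffic for roughly two and a half minutes, while a new instance starts receiving traffic after about 30 seconds.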


    Auto Scaling

    Health Check Grace Period:

    • The amount of time, in seconds, that Auto Scaling waits after an EC2 instance comes into service before it starts checking the instance's health. During this time, any health check failures for the instance are ignored.
    • This parameter is required if you are adding an ELB health check. Frequently, new instances need to warm up, briefly, before they can pass a health check. To provide ample warm-up time, set the health check grace period of the group to match the expected startup period of your application.
    • This is an important consideration that prevents AWS from adding too many servers too quickly
    • Make sure the health check grace period is longer than the instance startup time. Otherwise, the new instance can be terminated before it finishes starting up.
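
The grace period rule can be illustrated with a minimal Python sketch. The function names are hypothetical, the 240 second startup time is an assumed example value, and treating the end of the grace period as an exclusive boundary is also an assumption.

```python
def failure_is_ignored(seconds_in_service: float, grace_period: float) -> bool:
    """Return True while health check failures are still being ignored."""
    return seconds_in_service < grace_period


def grace_period_is_safe(grace_period: float, startup_time: float) -> bool:
    """The grace period should be longer than the instance startup time."""
    return grace_period > startup_time


STARTUP_TIME = 240  # seconds; assumed example, measure your own application

print(grace_period_is_safe(300, STARTUP_TIME))  # True
print(grace_period_is_safe(120, STARTUP_TIME))  # False

# With a 120 second grace period, a failed check 200 seconds after launch
# counts against the instance even though it is still starting up.
print(failure_is_ignored(200, 120))  # False
print(failure_is_ignored(200, 300))  # True
```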


    CPUUtilization:

    • The average CPU usage of all the instances in the Auto Scaling group

    Default cooldown period:

    • Applies to any scaling activity that occurs within the Auto Scaling group.

    Scaling-Specific Cooldown:

    • Applies to a specific scaling policy. Overrides the default cooldown period.
    • An instance usually takes a couple of minutes to launch. During that time, the CloudWatch alarm could continue to fire, causing Auto Scaling to launch another instance each time the alarm goes off. This is where the cooldown period comes into effect. With a cooldown period in place, Auto Scaling launches an instance and then suspends any scaling activities until a specific amount of time elapses.
    • Because a scale-down policy terminates instances, less time is needed to determine whether to terminate additional instances in the Auto Scaling group. The default cooldown period of 300 seconds is longer than necessary here; costs can be reduced by applying a scaling-specific cooldown period of 180 seconds.
    • When multiple instances are launched at the same time, the cooldown period (either the default cooldown or the scaling-specific cooldown) takes effect starting when the last instance launches.
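
The effect of a cooldown period on repeated alarms can be sketched in Python as follows. This is a simplified model, not how Auto Scaling is implemented: the function name is hypothetical, the alarm is assumed to fire once every 60 seconds, and the cooldown is assumed to start at the moment of the scaling activity rather than when the last instance launches.

```python
def count_scaling_activities(alarm_times, cooldown):
    """Count how many alarm firings lead to a scaling activity.

    alarm_times: sorted times (in seconds) at which the alarm fires.
    cooldown: seconds during which scaling is suspended after an activity.
    """
    activities = 0
    cooldown_ends = float("-inf")
    for t in alarm_times:
        if t >= cooldown_ends:
            # Not in cooldown: scale, then start a new cooldown period.
            activities += 1
            cooldown_ends = t + cooldown
        # Otherwise the alarm fired during the cooldown and is ignored.
    return activities


# The alarm keeps firing while the first instances are still launching.
alarms = [0, 60, 120, 180, 240]

print(count_scaling_activities(alarms, cooldown=0))    # 5
print(count_scaling_activities(alarms, cooldown=180))  # 2
print(count_scaling_activities(alarms, cooldown=300))  # 1
```

Without a cooldown, every firing triggers another launch; with a 300 second cooldown, only the first firing does.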

    Health Check Grace Period: 300 seconds
    Default cooldown period: 300 seconds
    Scale up policy: CPUUtilization >= 50% for 60 seconds, Add 4 instances, 300 seconds cooldown time
    Scale down policy: CPUUtilization <= 10% for 60 seconds, Remove 1 instance, 180 seconds cooldown time
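
The two policies above can be expressed as a small decision function. This is a hedged Python sketch: the dictionary layout and function name are hypothetical, the input is assumed to be one CPU percentage per instance for a single 60 second evaluation period, and cooldown tracking is left out.

```python
from statistics import mean

# Policy values taken from the configuration above.
SCALE_UP = {"threshold": 50, "adjustment": 4, "cooldown": 300}
SCALE_DOWN = {"threshold": 10, "adjustment": -1, "cooldown": 180}


def choose_adjustment(cpu_per_instance):
    """Return (instance_adjustment, cooldown_seconds) for one evaluation period."""
    average = mean(cpu_per_instance)  # average CPU usage of all instances
    if average >= SCALE_UP["threshold"]:
        return SCALE_UP["adjustment"], SCALE_UP["cooldown"]
    if average <= SCALE_DOWN["threshold"]:
        return SCALE_DOWN["adjustment"], SCALE_DOWN["cooldown"]
    return 0, 0  # between the thresholds: no scaling activity


print(choose_adjustment([80, 40, 45]))  # average 55 -> (4, 300)
print(choose_adjustment([5, 8, 11]))    # average 8  -> (-1, 180)
print(choose_adjustment([30, 20]))      # average 25 -> (0, 0)
```

Note that the decision uses the group average, so one busy instance does not trigger a scale up if the others are idle.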