Scaling Policies and Decision Strategies

Auto Scaling behavior is defined by scaling policies.

There are three main strategies:

1. Target Tracking Scaling

You define a metric target (e.g., CPU at 50%). AWS automatically adjusts capacity to maintain that target.

This is the recommended approach for most systems.

Architecturally, this creates a self-balancing system that maintains steady performance.
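As a mental model (simplified; the real service works through CloudWatch alarms), target tracking behaves roughly like scaling capacity in proportion to how far the observed metric sits from the target. The function name and the minimum-capacity floor below are illustrative assumptions, not AWS API names:

```python
import math

def target_tracking_desired(current_capacity: int, actual_metric: float,
                            target_metric: float) -> int:
    # Simplified model: scale capacity proportionally to the ratio of
    # observed metric to target metric, rounding up and never below 1.
    return max(1, math.ceil(current_capacity * actual_metric / target_metric))

# 4 instances running at 75% average CPU, targeting 50% -> scale out to 6
print(target_tracking_desired(4, 75.0, 50.0))
# 4 instances at 25% average CPU, targeting 50% -> scale in to 2
print(target_tracking_desired(4, 25.0, 50.0))
```

This proportionality is what makes the system self-balancing: overshoot raises the metric reading less per instance, so the next adjustment shrinks.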

2. Step Scaling

Scaling actions occur in steps whose size depends on how far the metric breaches its threshold.

Example:

  • CPU > 60% → add 1 instance
  • CPU > 80% → add 3 instances

This gives fine-grained control but requires careful tuning.
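The step table above can be sketched as a plain decision function (the thresholds mirror the example and are illustrative, not defaults):

```python
def step_scaling_adjustment(cpu: float) -> int:
    # Check the highest breach level first so the largest step wins.
    if cpu > 80:
        return 3  # severe breach: add 3 instances
    if cpu > 60:
        return 1  # mild breach: add 1 instance
    return 0      # within bounds: no action

print(step_scaling_adjustment(72))  # 1
print(step_scaling_adjustment(91))  # 3
print(step_scaling_adjustment(40))  # 0
```

Ordering the checks from the most severe breach downward is the part that needs care in real policies: overlapping step bounds are a common tuning mistake.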

3. Scheduled Scaling

Used for predictable traffic patterns.

Example:

  • Scale to 10 instances at 8 AM
  • Scale to 3 instances at midnight

Useful for business-hour workloads.
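The schedule above amounts to a lookup from time of day to desired capacity. A minimal sketch (the hour boundaries mirror the example; real scheduled actions use cron-style recurrence expressions):

```python
def scheduled_capacity(hour: int) -> int:
    # 10 instances from 8 AM until midnight, 3 overnight (hours 0-7).
    return 10 if 8 <= hour < 24 else 3

print(scheduled_capacity(9))   # 10 (business hours)
print(scheduled_capacity(2))   # 3  (overnight)
print(scheduled_capacity(0))   # 3  (midnight scale-down)
```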


Cooldown and Warmup

Scaling is not instant. New instances need:

  • Boot time
  • Application startup
  • Health check validation

If scaling decisions ignore warmup time, you risk oscillation (rapid scaling in and out).

Production insight: Always configure instance warmup and use load balancer health checks.
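The oscillation risk can be seen in a tiny guard function: suppress new scaling decisions until the previous batch of instances has had time to warm up. The 300-second default below is an illustrative assumption, not an AWS default:

```python
def should_scale(now: float, last_scaling_time: float,
                 warmup_seconds: float = 300.0) -> bool:
    # Ignore metric breaches while recently launched instances are still
    # booting and passing health checks; acting on them causes thrashing.
    return (now - last_scaling_time) >= warmup_seconds

print(should_scale(now=1000.0, last_scaling_time=900.0))  # False: still warming up
print(should_scale(now=1300.0, last_scaling_time=900.0))  # True: warmup elapsed
```

Without this gate, the metric still reflects the pre-scaling fleet, so the policy fires again, overshoots, then scales back in: exactly the oscillation described above.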

Architectural Implications

Improper scaling policies cause:

  • Thrashing
  • Increased costs
  • Latency spikes

Well-designed policies:

  • Protect user experience
  • Maintain steady state
  • Optimize cost

Scaling policy design is part of system reliability engineering.
