Downscaling

Intelligent downscaling is essential for optimizing Kubernetes costs without compromising application performance. This guide explains Stackbooster.io's approach to reducing cluster size when resources are underutilized.

How Stackbooster.io Downscaling Works

Stackbooster.io employs sophisticated algorithms to safely reduce cluster size when extra capacity is no longer needed:

Smart Capacity Reduction

Our platform goes beyond simple utilization thresholds:

  • Analyzes sustained underutilization patterns across multiple metrics
  • Considers pod distribution and resource requirements
  • Identifies nodes that can be safely drained and removed
  • Avoids disruptive scaling that could impact application performance

Workload Consolidation

Before removing nodes, Stackbooster.io optimizes pod placement:

  • Identifies pods that can be relocated to increase node efficiency
  • Uses bin-packing algorithms to maximize resource utilization
  • Respects pod affinity/anti-affinity rules and node selectors (see the example after this list)
  • Considers performance implications of pod migrations
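
For example, consolidation will never co-locate replicas that carry a required anti-affinity rule. The excerpt below is standard Kubernetes; the `app: web` label is illustrative:

```yaml
# Deployment pod-template excerpt: required anti-affinity keeps replicas on
# separate nodes, so consolidation cannot pack them onto one node.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web            # illustrative label
        topologyKey: kubernetes.io/hostname
```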

Graceful Node Decommissioning

When removing nodes, our system ensures minimal disruption:

  • Cordons nodes to prevent new pod scheduling
  • Gradually drains pods, honoring each pod's termination grace period (see the example after this list)
  • Monitors pod migrations to ensure successful rescheduling
  • Aborts the process if any critical issues are detected
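
How gracefully a drain proceeds ultimately depends on each pod's own termination settings. The fields below are standard Kubernetes; the names and the sleep-based preStop hook are illustrative placeholders:

```yaml
# Pod template excerpt: give in-flight work time to finish before the
# drain force-kills the container.
spec:
  terminationGracePeriodSeconds: 60   # time allowed after SIGTERM
  containers:
    - name: api                       # illustrative name
      image: example/api:1.0          # illustrative image
      lifecycle:
        preStop:
          exec:
            # Illustrative hook: pause so load balancers stop routing here
            command: ["sh", "-c", "sleep 10"]
```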

Configuring Downscaling

Basic Configuration

To set up basic downscaling parameters:

  1. Navigate to your cluster in the Stackbooster.io dashboard
  2. Select "Scaling Configuration" > "Downscaling"
  3. Configure the following settings:
    • Underutilization Threshold: Resource level that triggers downscaling (default: 40%)
    • Scale-down Delay: Time a node must be underutilized before removal (default: 10 minutes)
    • Max Scale-down Rate: Maximum nodes to remove in a single scaling action
    • Workload Respect Level: How strictly to honor workload constraints
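
Taken together, these settings can be pictured as a single configuration block. The sketch below is hypothetical YAML that only mirrors the four dashboard settings above; the field names are illustrative, not a documented Stackbooster.io format:

```yaml
# Hypothetical downscaling configuration mirroring the dashboard settings;
# field names are illustrative only.
downscaling:
  underutilizationThreshold: 40   # percent (dashboard default)
  scaleDownDelay: 10m             # sustained underutilization before removal
  maxScaleDownRate: 2             # nodes removed per scaling action
  workloadRespectLevel: strict    # how strictly to honor workload constraints
```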

Advanced Settings

For more granular control, configure:

Node Protection Rules

Protect specific nodes from downscaling:

  1. Navigate to "Node Management" > "Protection Rules"
  2. Create rules based on:
    • Node labels or names
    • Time-based protection windows
    • Importance of the workloads running on the node (see the sketch after this list)
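
As a sketch, such a rule might look like the following hypothetical YAML (field names are illustrative; the dashboard exposes the equivalent controls):

```yaml
# Hypothetical protection rule; names and fields are illustrative.
protectionRules:
  - name: protect-stateful-tier
    nodeLabels:
      workload-tier: stateful     # never remove nodes carrying this label
    window: "Mon-Fri 08:00-20:00" # optional time-based protection window
```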

Pod Disruption Budgets

Honor Kubernetes PodDisruptionBudgets for controlled downscaling:

  1. Navigate to "Workload Settings" > "Disruption Controls"
  2. Configure how strictly to adhere to PDBs:
    • Strict: Never violate PDBs, even if it means delaying downscaling
    • Balanced: Respect PDBs but proceed after a reasonable waiting period
    • Relaxed: Consider PDBs as guidelines only
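
PDBs themselves are standard Kubernetes objects. For example, the following PodDisruptionBudget keeps at least two `app: payments` pods available during voluntary evictions such as node drains (the name and label are illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payments-pdb       # illustrative name
spec:
  minAvailable: 2          # evictions pause once only two pods remain
  selector:
    matchLabels:
      app: payments        # illustrative label
```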

Time-Based Rules

Create rules that modify downscaling behavior based on time:

  1. Go to "Scheduling Rules"
  2. Create rules with:
    • Time windows for aggressive downscaling (e.g., nights, weekends)
    • Special handling for maintenance windows
    • Freeze periods when downscaling should be avoided
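
A hypothetical rendering of such rules in YAML (field names are illustrative only, not a documented Stackbooster.io format):

```yaml
# Hypothetical time-based rules; field names are illustrative.
schedulingRules:
  - name: weekend-savings
    window: "Sat-Sun 00:00-24:00"
    scaleDownDelay: 3m            # reclaim idle nodes faster off-hours
  - name: maintenance-freeze
    window: "Tue 02:00-04:00"
    action: freeze                # no downscaling during maintenance
```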

Downscaling Strategies

Stackbooster.io offers several downscaling strategies to match your operational needs:

Balanced (Default)

  • Moderate approach to node removal
  • Waits for sustained underutilization before taking action
  • Considers both resource efficiency and operational stability

Aggressive Cost Optimization

  • Prioritizes cost savings with more rapid downscaling
  • Removes nodes more quickly when underutilized
  • Maintains minimal excess capacity
  • Best for non-critical environments or dev/test clusters

Conservative

  • Takes a cautious approach to node removal
  • Requires longer periods of underutilization before downscaling
  • Maintains higher buffer capacity
  • Ideal for production environments with stringent reliability requirements

Custom

  • Define your own parameters for underutilization thresholds and timing
  • Create different strategies for different environments
  • Implement special handling for specific use cases
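
For illustration, a custom strategy set split by environment might be sketched like this (hypothetical YAML; field names are illustrative):

```yaml
# Hypothetical per-environment strategies; illustrative only.
strategies:
  - name: prod
    scaleDownDelay: 30m     # long evaluation window for stability
    maxScaleDownRate: 1     # remove at most one node per action
  - name: dev
    scaleDownDelay: 5m      # reclaim idle capacity quickly
    maxScaleDownRate: 5
```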

Best Practices

Determining Optimal Underutilization Thresholds

The ideal underutilization threshold depends on your workload characteristics:

  • Stable workloads: 30-40% is typically appropriate
  • Variable workloads: 40-50% provides a better buffer for fluctuations
  • Critical applications: 50-60% leaves headroom for unexpected demand

Scheduling Downscaling Windows

For predictable cost optimization:

  1. Identify low-usage periods in your application traffic patterns
  2. Create scheduled downscaling windows during these periods
  3. Configure more aggressive thresholds during these times
  4. Return to normal settings when usage typically increases

Handling Stateful Workloads

When your cluster runs stateful applications:

  1. Create node protection rules for nodes running stateful workloads
  2. Configure longer grace periods for pods with persistent storage
  3. Set stricter PDB adherence for database or caching systems
  4. Consider manual approval for downscaling actions affecting critical stateful services
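
For step 2, the grace period lives on the workload itself. A StatefulSet excerpt might look like this (the 300-second value is only an example):

```yaml
# StatefulSet pod-template excerpt: databases flushing state to persistent
# volumes often need far more than the default 30s to shut down cleanly.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 300
```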

Monitoring Downscaling Performance

To ensure your downscaling is effective:

  1. Monitor the "Scaling Performance" dashboard
  2. Review metrics such as:
    • Cost savings from downscaling actions
    • Application performance impact during node removals
    • Failed pod migrations during node draining
    • Frequency of scaling reversals (down then up quickly)
  3. Adjust configuration based on observations:
    • Increase the buffer if applications experience resource pressure
    • Decrease the scale-down delay if nodes remain idle too long
    • Modify protection rules if certain workloads are affected

Troubleshooting

Common Downscaling Issues

Nodes Not Scaling Down Despite Low Utilization

Potential causes and solutions:

  • DaemonSets: Check if DaemonSets are blocking node removal
  • PodDisruptionBudgets: Review if strict PDBs are preventing pod evictions
  • Node Selectors/Taints: Verify if pods require specific nodes
  • Protection Rules: Check for active protection rules blocking downscaling
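
Two of these blockers are visible directly in workload manifests. A PDB that permits zero disruptions makes every eviction fail, and an overly narrow nodeSelector can leave a pod with nowhere to reschedule (names and labels below are illustrative):

```yaml
# A PDB like this blocks all voluntary evictions, so its nodes never drain:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cache-pdb
spec:
  maxUnavailable: 0        # zero disruptions allowed; scale-down stalls
  selector:
    matchLabels:
      app: cache
---
# Pod spec excerpt: pinning to a single hostname prevents rescheduling
# when that node is cordoned for removal.
nodeSelector:
  kubernetes.io/hostname: node-7
```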

Workload Disruption During Downscaling

If applications are negatively impacted:

  • Increase the scale-down delay to ensure stability before node removal
  • Configure stricter PDB adherence in the downscaling settings
  • Add protection for sensitive workload nodes
  • Increase the pod eviction timeout to allow proper termination

Rapid Scale Down/Up Cycles

If your cluster shows "thrashing" between scaling down and up:

  • Increase the buffer threshold to maintain more spare capacity
  • Lengthen the scale-down evaluation period
  • Implement cooldown periods between scaling actions
  • Review your application's resource requests for accuracy
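
On the last point: schedulers and autoscalers typically weigh pod resource requests when judging how full a node is, so inflated requests make nodes look fuller than they are and missing requests make them look idle. A container excerpt with requests sized from observed usage (values are illustrative):

```yaml
# Container excerpt: requests close to steady-state usage keep node
# utilization accounting honest; limits cap worst-case spikes.
resources:
  requests:
    cpu: "250m"
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "1Gi"
```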

Advanced Topics

Cost-Aware Node Selection

When determining which nodes to remove, Stackbooster.io considers:

  • Instance pricing and billing cycle position
  • Reserved instance coverage and commitment
  • Spot instance interruption probability
  • Node age and maintenance status

To optimize this selection:

  1. Navigate to "Cost Settings" > "Node Removal Priorities"
  2. Configure priorities based on your cost structure
  3. Adjust weighting between different cost factors
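
A hypothetical weighting across these factors (illustrative only; the dashboard exposes the equivalent controls):

```yaml
# Hypothetical removal-priority weights; a higher weight means the factor
# counts more when ranking nodes for removal. Illustrative only.
nodeRemovalPriorities:
  spotInterruptionRisk: 0.4
  billingCyclePosition: 0.3
  reservedInstanceCoverage: 0.2
  nodeAge: 0.1
```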

Integration with Cluster Autoscaler

If you're also using Kubernetes Cluster Autoscaler:

  1. Navigate to "Integration Settings" > "Kubernetes Autoscaler"
  2. Configure coordination mode:
    • Replace: Let Stackbooster.io handle all scaling (recommended)
    • Complement: Define separate responsibilities
    • Observe: Run alongside the existing autoscaler without interfering with it
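
In Complement or Observe mode it can help to mark pods that Cluster Autoscaler must never evict, using its standard pod annotation (whether Stackbooster.io honors the same annotation is not covered here):

```yaml
# Pod metadata excerpt: Cluster Autoscaler will not remove a node that
# runs a pod carrying this annotation.
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
```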

Downscaling with Spot Instances

For clusters using Spot instances:

  1. Configure "Spot Management" settings
  2. Set preferences for:
    • Spot vs On-Demand priority in downscaling
    • Handling of Spot termination notices
    • Replacement strategies when Spot availability changes
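
A hypothetical sketch of such preferences, assuming AWS-style Spot interruption notices (field names are illustrative):

```yaml
# Hypothetical Spot management settings; field names are illustrative.
spotManagement:
  downscalePriority: spot-first    # prefer removing Spot nodes first
  onTerminationNotice: drain       # start draining on the interruption
                                   # notice (two minutes on AWS EC2 Spot)
  replacement: on-demand-fallback  # backfill with On-Demand if Spot
                                   # capacity disappears
```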

By implementing these downscaling strategies, your Kubernetes clusters can stay right-sized, reducing costs while preserving application performance and reliability.
