Up-scaling

Effective up-scaling is crucial for maintaining application performance while controlling costs. This guide explains how Stackbooster.io's intelligent up-scaling works to ensure your Kubernetes clusters have sufficient resources when demand increases.

How Stackbooster.io Up-scaling Works

Stackbooster.io uses a multi-faceted approach to determine when and how to scale up your Kubernetes clusters:

Predictive Scaling

Unlike standard Kubernetes autoscaling, which reacts only after resources are saturated, our predictive scaling:

  • Analyzes historical usage patterns to anticipate needs
  • Identifies cyclic patterns (daily, weekly, monthly)
  • Preemptively adds capacity before it's critically needed
  • Reduces application lag caused by reactive scaling
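
To make the idea concrete, here is a minimal sketch (not Stackbooster.io's actual model) of cyclic prediction: estimate the next hour's demand from the same hour-of-week in past weeks, and add capacity preemptively when the prediction plus headroom would exceed current capacity. All names and numbers are illustrative.

```python
# Illustrative sketch of predictive scaling from weekly cyclic patterns.
HOURS_PER_WEEK = 168

def predict_next_hour(history, next_hour_index):
    """Average the usage seen at the same hour-of-week in past weeks."""
    slot = next_hour_index % HOURS_PER_WEEK
    samples = [history[i] for i in range(slot, len(history), HOURS_PER_WEEK)]
    return sum(samples) / len(samples)

def should_preempt_scale(history, next_hour_index, current_capacity, headroom=0.15):
    """Scale before the spike: predicted demand plus headroom exceeds capacity."""
    predicted = predict_next_hour(history, next_hour_index)
    return predicted * (1 + headroom) > current_capacity

# Two weeks of flat load (40 cores) except a spike every Monday at hour 9.
history = [40.0] * (2 * HOURS_PER_WEEK)
history[9] = history[9 + HOURS_PER_WEEK] = 90.0

print(should_preempt_scale(history, 2 * HOURS_PER_WEEK + 9, current_capacity=100.0))   # True
print(should_preempt_scale(history, 2 * HOURS_PER_WEEK + 10, current_capacity=100.0))  # False
```

A purely reactive autoscaler would only act at hour 9 once pods fail to schedule; the cyclic baseline lets capacity arrive ahead of the recurring spike.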

Multi-Signal Analysis

Our platform considers multiple signals when making scaling decisions:

  • CPU Utilization: Beyond simple thresholds, we analyze trends and rate of change
  • Memory Usage: Both at node and pod levels, including projected memory growth
  • Pod Scheduling Events: Failed scheduling due to resource constraints
  • Application Metrics: Custom metrics from your applications when configured
  • External Factors: Scheduled events, deployments, and known traffic patterns
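
One common way to combine such signals is a weighted score; the sketch below is a hypothetical illustration (the weights and signal names are assumptions, not Stackbooster.io internals) of how several normalized signals could feed a single scale-up decision.

```python
# Hypothetical multi-signal scoring: each signal is normalized to [0, 1],
# then combined with weights into one scale-up score.
def scale_up_score(signals, weights=None):
    weights = weights or {
        "cpu_trend": 0.3,        # rate of change, not just the raw level
        "memory_pressure": 0.3,  # node- and pod-level memory usage
        "pending_pods": 0.3,     # failed scheduling due to resource limits
        "custom_metric": 0.1,    # application-provided signal, if configured
    }
    return sum(
        weights[k] * min(max(signals.get(k, 0.0), 0.0), 1.0)
        for k in weights
    )

signals = {"cpu_trend": 0.8, "memory_pressure": 0.5, "pending_pods": 1.0}
print(round(scale_up_score(signals), 2))  # 0.69
```

The benefit over single-threshold triggers is that no one signal has to cross its limit alone: several moderately elevated signals can together justify scaling.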

Smart Instance Selection

When scaling up, our algorithm selects the optimal instance types:

  • Analyzes the specific workload resource needs
  • Considers price-performance ratio across instance families
  • Evaluates availability of Spot instances for non-critical workloads
  • Makes placement decisions based on workload affinity and anti-affinity
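
A simplified version of "cheapest instance that fits" can be sketched as follows; the instance catalog here is made-up example data, not live pricing, and the real selection also weighs Spot availability and affinity rules.

```python
# Illustrative instance picker: choose the cheapest type that satisfies
# the workload's CPU and memory requests. Catalog data is example-only.
def pick_instance(cpu_req, mem_req_gib, catalog):
    candidates = [
        i for i in catalog
        if i["vcpu"] >= cpu_req and i["mem_gib"] >= mem_req_gib
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda i: i["hourly_usd"])["name"]

catalog = [
    {"name": "m5.large",  "vcpu": 2, "mem_gib": 8,  "hourly_usd": 0.096},
    {"name": "c5.xlarge", "vcpu": 4, "mem_gib": 8,  "hourly_usd": 0.170},
    {"name": "m5.xlarge", "vcpu": 4, "mem_gib": 16, "hourly_usd": 0.192},
]
print(pick_instance(cpu_req=3, mem_req_gib=8, catalog=catalog))  # c5.xlarge
```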

Configuring Up-scaling

Basic Configuration

To set up basic up-scaling parameters:

  1. Navigate to your cluster in the Stackbooster.io dashboard
  2. Select "Scaling Configuration" > "Up-scaling"
  3. Configure the following settings:
    • Headroom Percentage: Buffer capacity to maintain (default: 15%)
    • Max Scale-Up Rate: Maximum nodes to add in a single scaling action
    • Predictive Scaling: Enable/disable predictive scaling
    • Response Sensitivity: How quickly to respond to increased resource demands
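
The settings above can be expressed as a small config sketch with basic validation; the field names mirror the dashboard labels, but any programmatic representation shown here is an assumption, not Stackbooster.io's API.

```python
# Assumed config shape mirroring the dashboard settings; values are examples.
DEFAULTS = {
    "headroom_percentage": 15,        # buffer capacity to maintain
    "max_scale_up_rate": 5,           # nodes added per scaling action (example)
    "predictive_scaling": True,
    "response_sensitivity": "medium", # how quickly to react to demand
}

def validate(config):
    """Merge user settings over defaults and sanity-check the result."""
    merged = {**DEFAULTS, **config}
    if not 0 <= merged["headroom_percentage"] <= 100:
        raise ValueError("headroom_percentage must be between 0 and 100")
    if merged["max_scale_up_rate"] < 1:
        raise ValueError("max_scale_up_rate must allow at least one node")
    if merged["response_sensitivity"] not in ("low", "medium", "high"):
        raise ValueError("unknown response_sensitivity")
    return merged

print(validate({"headroom_percentage": 25})["headroom_percentage"])  # 25
```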

Advanced Settings

For more granular control, you can configure:

Workload-Specific Settings

Configure different scaling parameters for various workloads:

  1. Navigate to "Workload Settings"
  2. Select a workload or namespace
  3. Configure custom scaling parameters:
    • Priority level for resource allocation
    • Minimum guaranteed resources
    • Maximum scale factor

Time-Based Rules

Create rules that modify scaling behavior based on time:

  1. Go to "Scheduling Rules"
  2. Create a new rule with:
    • Time window (e.g., business hours, weekends)
    • Custom headroom settings for the defined period
    • Special handling for known high-traffic events

Custom Metrics

Integrate application-specific metrics for scaling decisions:

  1. Navigate to "Custom Metrics"
  2. Configure your Prometheus or CloudWatch metrics
  3. Set thresholds and scaling factors based on these metrics
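
A custom-metric rule of this shape can be sketched as below: when an application metric (say, request queue depth pulled from Prometheus) overshoots its threshold, the overshoot is translated into a bounded node count. The metric name, threshold, and scaling factor are illustrative.

```python
# Illustrative threshold rule: convert metric overshoot into nodes to add,
# capped at a maximum step size. All values are example assumptions.
def nodes_to_add(metric_value, threshold, scaling_factor=1.0, max_step=5):
    if metric_value <= threshold:
        return 0
    overshoot = (metric_value - threshold) / threshold  # fractional excess
    return min(max(1, round(overshoot * scaling_factor * 10)), max_step)

print(nodes_to_add(metric_value=1500, threshold=1000))  # 5
print(nodes_to_add(metric_value=900, threshold=1000))   # 0
```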

Up-scaling Strategies

Stackbooster.io offers several up-scaling strategies to match different needs:

Balanced (Default)

  • Maintains moderate headroom to handle normal traffic fluctuations
  • Scales gradually to avoid over-provisioning
  • Balances cost and performance considerations

Performance-Focused

  • Maintains higher headroom for rapid response to traffic spikes
  • Scales more aggressively when demand increases
  • Prioritizes application performance over cost optimization

Cost-Focused

  • Maintains minimal headroom to maximize resource efficiency
  • Scales more conservatively, potentially accepting some scheduling delays
  • Prioritizes cost savings over immediate scaling

Custom

  • Define your own parameters for headroom, scaling speed, and instance selection
  • Create different strategies for different environments or workloads
  • Implement special handling for specific use cases

Best Practices

Determining Optimal Headroom

The ideal headroom percentage depends on your workload characteristics:

  • Stable, predictable traffic: 10-15% headroom is typically sufficient
  • Variable traffic with occasional spikes: 20-25% provides better responsiveness
  • Highly unpredictable workloads: 30%+ may be necessary to prevent performance issues
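
The arithmetic behind these percentages is straightforward: provision expected demand scaled by (1 + headroom), rounded up to whole nodes. A worked example, with node sizes chosen for illustration:

```python
# Headroom math: capacity to provision = expected demand * (1 + headroom),
# rounded up to whole nodes. Node size here is an example (8-core nodes).
import math

def nodes_needed(expected_demand_cores, headroom, node_cores):
    target = expected_demand_cores * (1 + headroom)
    return math.ceil(target / node_cores)

# 60 cores of expected demand:
print(nodes_needed(60, 0.15, 8))  # 9  (targets 69 cores)
print(nodes_needed(60, 0.30, 8))  # 10 (targets 78 cores)
```

The jump from 15% to 30% headroom costs one extra node in this example; weighing that cost against the risk of scheduling delays is exactly the trade-off the strategy settings expose.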

Handling Planned Traffic Increases

For known traffic events (marketing campaigns, product launches, etc.):

  1. Create an "Event Schedule" in Stackbooster.io
  2. Define the expected traffic increase and duration
  3. The platform will automatically adjust scaling parameters during the event

Monitoring Up-scaling Performance

To ensure your up-scaling is working effectively:

  1. Monitor the "Scaling Performance" dashboard
  2. Pay attention to metrics such as:
    • Average time to scale up when needed
    • Resource utilization after scaling events
    • Application performance during scaling transitions
  3. Adjust your configuration based on observed patterns

Troubleshooting

Common Up-scaling Issues

Slow Response to Traffic Spikes

If your cluster isn't scaling up quickly enough:

  • Increase the headroom percentage
  • Check for AWS service quotas limiting instance launches
  • Verify the instance types selected are readily available in your region
  • Consider enabling predictive scaling if not already active

Excessive Resources After Scaling

If scaling up results in too much unused capacity:

  • Reduce the headroom percentage
  • Decrease the maximum scale-up rate
  • Review instance selection to choose more appropriate sizes
  • Consider more granular workload-specific scaling settings

Failed Node Launches

If nodes fail to launch during scaling:

  • Check AWS service health in your region
  • Verify IAM permissions for the node instance profile
  • Review node launch templates for configuration errors
  • Ensure you haven't reached AWS account limits

Advanced Topics

Combining with Horizontal Pod Autoscaling (HPA)

Stackbooster.io works in harmony with Kubernetes HPA:

  • HPA adjusts pod counts based on metrics
  • Our platform ensures nodes are available to accommodate those pods
  • Configure both systems with compatible thresholds for optimal results
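
One way to reason about "compatible thresholds" is to check that reserved node headroom can absorb the worst-case HPA scale-out; the sketch below is an illustrative back-of-envelope check, with all numbers assumed.

```python
# Illustrative compatibility check: can reserved node headroom absorb the
# extra pods HPA may create in a burst? All figures are example values.
def headroom_covers_hpa(current_pods, hpa_max_pods, pod_cpu_cores,
                        node_cores, node_count, headroom):
    spare = node_count * node_cores * headroom             # capacity held in reserve
    extra = (hpa_max_pods - current_pods) * pod_cpu_cores  # worst-case HPA burst
    return spare >= extra

# 5 nodes of 8 cores with 15% headroom = 6 spare cores;
# HPA bursting from 20 to 24 pods at 0.5 core each needs only 2.
print(headroom_covers_hpa(current_pods=20, hpa_max_pods=24, pod_cpu_cores=0.5,
                          node_cores=8, node_count=5, headroom=0.15))  # True
```

If this check fails, either raise the headroom percentage or lower the HPA's maximum replica count so pod bursts never outrun node capacity.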

Custom Resource-Based Scaling

For specialized workloads with unique resource requirements:

  1. Define custom resource metrics in the dashboard
  2. Set thresholds for these metrics
  3. Configure scaling actions based on these thresholds

Multi-Cluster Scaling Coordination

If you operate multiple clusters with interdependencies:

  1. Use the "Cluster Groups" feature to define relationships
  2. Configure coordinated scaling policies
  3. Balance resource allocation across the entire application landscape

By implementing these up-scaling strategies, your Kubernetes clusters can handle rising demand efficiently while keeping resource utilization high and costs under control.

Released under the MIT License. Contact us at [email protected]