Up-scaling
Effective up-scaling is crucial for maintaining application performance while controlling costs. This guide explains how Stackbooster.io's intelligent up-scaling works to ensure your Kubernetes clusters have sufficient resources when demand increases.
How Stackbooster.io Up-scaling Works
Stackbooster.io uses a multi-faceted approach to determine when and how to scale up your Kubernetes clusters:
Predictive Scaling
Unlike standard Kubernetes autoscaling that reacts to resource saturation, our predictive scaling:
- Analyzes historical usage patterns to anticipate needs
- Identifies cyclic patterns (daily, weekly, monthly)
- Preemptively adds capacity before it's critically needed
- Reduces application lag caused by reactive scaling
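As a sketch of how cyclic-pattern forecasting can work, here is a naive seasonal forecast that predicts the next hour's load from the same hour in previous cycles. The function, data, and period are illustrative only, not Stackbooster.io's actual model:

```python
from statistics import mean

def seasonal_forecast(history, period):
    """Naive seasonal forecast: predict the next value as the average of
    the observations one and two full periods back."""
    if len(history) < 2 * period:
        raise ValueError("need at least two full periods of history")
    return mean([history[-period], history[-2 * period]])

# Two days of a 24-hour CPU-utilization cycle: 70% during business hours,
# 30% otherwise. Forecast the first hour of the next day.
daily = [70 if 9 <= h < 18 else 30 for h in range(24)]
forecast = seasonal_forecast(daily * 2, period=24)
```

A real predictive scaler would weigh more history and handle trend and noise, but the core idea is the same: capacity is requested from the forecast, before saturation occurs.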
Multi-Signal Analysis
Our platform considers multiple signals when making scaling decisions:
- CPU Utilization: Beyond simple thresholds, we analyze trends and rate of change
- Memory Usage: Both at node and pod levels, including projected memory growth
- Pod Scheduling Events: Failed scheduling due to resource constraints
- Application Metrics: Custom metrics from your applications when configured
- External Factors: Scheduled events, deployments, and known traffic patterns
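One simple way to picture multi-signal analysis is a weighted score over normalized signals. The signal names, weights, and threshold below are illustrative assumptions, not Stackbooster.io's API:

```python
def should_scale_up(signals, weights, threshold=0.7):
    """Combine normalized scaling signals (each 0..1) into a weighted
    score and compare it against a decision threshold."""
    score = sum(weights[name] * value for name, value in signals.items())
    return score >= threshold

signals = {"cpu_trend": 0.9, "memory_growth": 0.6, "pending_pods": 1.0}
weights = {"cpu_trend": 0.4, "memory_growth": 0.3, "pending_pods": 0.3}
decision = should_scale_up(signals, weights)
```

The benefit of combining signals is that no single noisy metric triggers scaling on its own; sustained pressure across several signals does.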
Smart Instance Selection
When scaling up, our algorithm selects the optimal instance types:
- Analyzes the specific workload resource needs
- Considers price-performance ratio across instance families
- Evaluates availability of Spot instances for non-critical workloads
- Makes placement decisions based on workload affinity and anti-affinity
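A minimal sketch of price-aware instance selection: filter to instances that fit the workload, then take the cheapest. The candidate list uses sample us-east-1 style pricing for illustration; a real selector would also factor in Spot availability and affinity constraints:

```python
def pick_instance(candidates, cpu_needed, mem_needed_gib):
    """Return the cheapest instance type that satisfies the workload's
    CPU and memory requirements, or None if nothing fits."""
    fitting = [c for c in candidates
               if c["vcpu"] >= cpu_needed and c["mem_gib"] >= mem_needed_gib]
    return min(fitting, key=lambda c: c["hourly_usd"]) if fitting else None

candidates = [
    {"name": "m5.large",  "vcpu": 2, "mem_gib": 8,  "hourly_usd": 0.096},
    {"name": "c5.xlarge", "vcpu": 4, "mem_gib": 8,  "hourly_usd": 0.170},
    {"name": "m5.xlarge", "vcpu": 4, "mem_gib": 16, "hourly_usd": 0.192},
]
choice = pick_instance(candidates, cpu_needed=4, mem_needed_gib=8)
```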
Configuring Up-scaling
Basic Configuration
To set up basic up-scaling parameters:
- Navigate to your cluster in the Stackbooster.io dashboard
- Select "Scaling Configuration" > "Up-scaling"
- Configure the following settings:
  - Headroom Percentage: Buffer capacity to maintain (default: 15%)
  - Max Scale-Up Rate: Maximum nodes to add in a single scaling action
  - Predictive Scaling: Enable/disable predictive scaling
  - Response Sensitivity: How quickly to respond to increased resource demands
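To see how the Headroom Percentage setting translates into capacity, here is a back-of-the-envelope node count calculation. The formula is an illustrative interpretation of the setting, not Stackbooster.io's exact computation:

```python
import math

def nodes_required(demand_cores, node_cores, headroom_pct):
    """Nodes needed to serve current demand plus a headroom buffer:
    target capacity = demand * (1 + headroom%), rounded up to whole nodes."""
    target_capacity = demand_cores * (1 + headroom_pct / 100)
    return math.ceil(target_capacity / node_cores)

# 40 cores of demand on 8-core nodes with the default 15% headroom
n = nodes_required(demand_cores=40, node_cores=8, headroom_pct=15)
```

Raising the headroom from 15% to 25% in this example would push the target from 46 to 50 cores, i.e. one extra node held in reserve.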
Advanced Settings
For more granular control, you can configure:
Workload-Specific Settings
Configure different scaling parameters for various workloads:
- Navigate to "Workload Settings"
- Select a workload or namespace
- Configure custom scaling parameters:
  - Priority level for resource allocation
  - Minimum guaranteed resources
  - Maximum scale factor
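As an illustration of how a per-workload maximum scale factor might apply, the sketch below caps a requested scale-out at a multiple of current size. The semantics shown are an assumption for illustration, not Stackbooster.io's documented behavior:

```python
def cap_replicas(current, desired, max_scale_factor):
    """Apply a workload's maximum scale factor: never grow beyond
    current * factor in a single scaling action."""
    return min(desired, current * max_scale_factor)

# A workload at 4 replicas asks for 20, but its scale factor is 3
capped = cap_replicas(current=4, desired=20, max_scale_factor=3)
```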
Time-Based Rules
Create rules that modify scaling behavior based on time:
- Go to "Scheduling Rules"
- Create a new rule with:
  - Time window (e.g., business hours, weekends)
  - Custom headroom settings for the defined period
  - Special handling for known high-traffic events
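The core of a time-based rule is matching the current time against a window and swapping in that window's headroom. The rule shape below is illustrative, not the dashboard's exact schema:

```python
from datetime import time

def headroom_for(now, rules, default_pct=15):
    """Return the headroom percentage of the first rule whose time
    window contains `now`, falling back to the cluster default."""
    for start, end, pct in rules:
        if start <= now < end:
            return pct
    return default_pct

# Business hours get extra headroom; nights use the default
rules = [(time(8, 0), time(18, 0), 25)]
midday_pct = headroom_for(time(12, 30), rules)
night_pct = headroom_for(time(22, 0), rules)
```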
Custom Metrics
Integrate application-specific metrics for scaling decisions:
- Navigate to "Custom Metrics"
- Configure your Prometheus or CloudWatch metrics
- Set thresholds and scaling factors based on these metrics
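A sketch of how a custom metric can drive a scaling factor: below its threshold the metric has no effect, and above it each unit of excess adds to a capped capacity multiplier. The metric, formula, and parameters are illustrative assumptions:

```python
def scale_factor_from_metric(value, threshold, factor_per_unit, cap=2.0):
    """Map a custom application metric (e.g. queue depth) above its
    threshold to a capacity multiplier, capped to avoid runaway scaling."""
    if value <= threshold:
        return 1.0
    return min(cap, 1.0 + (value - threshold) * factor_per_unit)

# Queue depth of 150 against a threshold of 100
f = scale_factor_from_metric(value=150, threshold=100, factor_per_unit=0.01)
```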
Up-scaling Strategies
Stackbooster.io offers several up-scaling strategies; choose one based on your workload's cost and performance priorities:
Balanced (Default)
- Maintains moderate headroom to handle normal traffic fluctuations
- Scales gradually to avoid over-provisioning
- Balances cost and performance considerations
Performance-Focused
- Maintains higher headroom for rapid response to traffic spikes
- Scales more aggressively when demand increases
- Prioritizes application performance over cost optimization
Cost-Focused
- Maintains minimal headroom to maximize resource efficiency
- Scales more conservatively, potentially accepting some scheduling delays
- Prioritizes cost savings over immediate scaling
Custom
- Define your own parameters for headroom, scaling speed, and instance selection
- Create different strategies for different environments or workloads
- Implement special handling for specific use cases
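In practice, the strategies above amount to different parameter presets. A hypothetical mapping (the numbers are illustrative, not Stackbooster.io's actual defaults):

```python
# Illustrative strategy presets: higher headroom and faster scale-out for
# performance, leaner settings for cost. Values are assumptions.
STRATEGIES = {
    "balanced":    {"headroom_pct": 15, "max_nodes_per_action": 3},
    "performance": {"headroom_pct": 30, "max_nodes_per_action": 6},
    "cost":        {"headroom_pct": 5,  "max_nodes_per_action": 1},
}

preset = STRATEGIES["performance"]
```

A Custom strategy is simply the same set of knobs with values you choose yourself, possibly differing per environment or workload.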
Best Practices
Determining Optimal Headroom
The ideal headroom percentage depends on your workload characteristics:
- Stable, predictable traffic: 10-15% headroom is typically sufficient
- Variable traffic with occasional spikes: 20-25% provides better responsiveness
- Highly unpredictable workloads: 30%+ may be necessary to prevent performance issues
Handling Planned Traffic Increases
For known traffic events (marketing campaigns, product launches, etc.):
- Create an "Event Schedule" in Stackbooster.io
- Define the expected traffic increase and duration
- The platform will automatically adjust scaling parameters during the event
Monitoring Up-scaling Performance
To ensure your up-scaling is working effectively:
- Monitor the "Scaling Performance" dashboard
- Pay attention to metrics such as:
  - Average time to scale up when needed
  - Resource utilization after scaling events
  - Application performance during scaling transitions
- Adjust your configuration based on observed patterns
Troubleshooting
Common Up-scaling Issues
Slow Response to Traffic Spikes
If your cluster isn't scaling up quickly enough:
- Increase the headroom percentage
- Check for AWS service quotas limiting instance launches
- Verify the instance types selected are readily available in your region
- Consider enabling predictive scaling if not already active
Excessive Resources After Scaling
If scaling up results in too much unused capacity:
- Reduce the headroom percentage
- Decrease the maximum scale-up rate
- Review instance selection to choose more appropriate sizes
- Consider more granular workload-specific scaling settings
Failed Node Launches
If nodes fail to launch during scaling:
- Check AWS service health in your region
- Verify IAM permissions for the node instance profile
- Review node launch templates for configuration errors
- Ensure you haven't reached AWS account limits
Advanced Topics
Combining with Horizontal Pod Autoscaling (HPA)
Stackbooster.io works in harmony with Kubernetes HPA:
- HPA adjusts pod counts based on metrics
- Our platform ensures nodes are available to accommodate those pods
- Configure both systems with compatible thresholds for optimal results
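For reference, the replica count HPA requests follows the formula documented upstream by Kubernetes: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The node-level autoscaler must then provide capacity for the extra pods. A worked example:

```python
import math

def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """Kubernetes HPA's core formula:
    desired = ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# 5 pods averaging 90% CPU against a 60% target: HPA asks for 8 pods,
# so the cluster needs node capacity for 3 additional pods.
desired = hpa_desired_replicas(current_replicas=5, current_metric=90,
                               target_metric=60)
```

This is why compatible thresholds matter: if HPA's target utilization is far below the node autoscaler's headroom assumptions, pod counts and node capacity can oscillate against each other.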
Custom Resource-Based Scaling
For specialized workloads with unique resource requirements:
- Define custom resource metrics in the dashboard
- Set thresholds for these metrics
- Configure scaling actions based on these thresholds
Multi-Cluster Scaling Coordination
If you operate multiple clusters with interdependencies:
- Use the "Cluster Groups" feature to define relationships
- Configure coordinated scaling policies
- Balance resource allocation across the entire application landscape
By implementing these up-scaling strategies, your Kubernetes clusters will be able to handle increasing demand efficiently while maintaining optimal resource utilization and cost management.
