Up-scaling
Effective up-scaling is crucial for maintaining application performance while controlling costs. This guide explains how Stackbooster.io's intelligent up-scaling works to ensure your Kubernetes clusters have sufficient resources when demand increases.
How Stackbooster.io Up-scaling Works
Stackbooster.io uses a multi-faceted approach to determine when and how to scale up your Kubernetes clusters:
Predictive Scaling
Unlike standard Kubernetes autoscaling that reacts to resource saturation, our predictive scaling:
- Analyzes historical usage patterns to anticipate needs
- Identifies cyclic patterns (daily, weekly, monthly)
- Preemptively adds capacity before it's critically needed
- Reduces application lag caused by reactive scaling
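As a sketch of how cyclic-pattern forecasting can work, here is a naive seasonal forecast that predicts the next hour's load from the same hour in previous cycles. The function, data, and period are illustrative only, not Stackbooster.io's actual model:

```python
from statistics import mean

def seasonal_forecast(history, period):
    """Naive seasonal forecast: predict the next value as the average of
    the observations one and two full periods back."""
    if len(history) < 2 * period:
        raise ValueError("need at least two full periods of history")
    return mean([history[-period], history[-2 * period]])

# Two days of a 24-hour CPU-utilization cycle: 70% during business hours,
# 30% otherwise. Forecast the first hour of the next day.
daily = [70 if 9 <= h < 18 else 30 for h in range(24)]
forecast = seasonal_forecast(daily * 2, period=24)
```

A real predictive scaler would weigh more history and handle trend and noise, but the core idea is the same: capacity is requested from the forecast, before saturation occurs.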
Multi-Signal Analysis
Our platform considers multiple signals when making scaling decisions:
- CPU Utilization: Beyond simple thresholds, we analyze trends and rate of change
- Memory Usage: Both at node and pod levels, including projected memory growth
- Pod Scheduling Events: Failed scheduling due to resource constraints
- Application Metrics: Custom metrics from your applications when configured
- External Factors: Scheduled events, deployments, and known traffic patterns
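One simple way to picture multi-signal analysis is a weighted score over normalized signals. The signal names, weights, and threshold below are illustrative assumptions, not Stackbooster.io's API:

```python
def should_scale_up(signals, weights, threshold=0.7):
    """Combine normalized scaling signals (each 0..1) into a weighted
    score and compare it against a decision threshold."""
    score = sum(weights[name] * value for name, value in signals.items())
    return score >= threshold

signals = {"cpu_trend": 0.9, "memory_growth": 0.6, "pending_pods": 1.0}
weights = {"cpu_trend": 0.4, "memory_growth": 0.3, "pending_pods": 0.3}
decision = should_scale_up(signals, weights)
```

The benefit of combining signals is that no single noisy metric triggers scaling on its own; sustained pressure across several signals does.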
Smart Instance Selection
When scaling up, our algorithm selects the optimal instance types:
- Analyzes the specific workload resource needs
- Considers price-performance ratio across instance families
- Evaluates availability of Spot instances for non-critical workloads
- Makes placement decisions based on workload affinity and anti-affinity
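A minimal sketch of price-aware instance selection: filter to instances that fit the workload, then take the cheapest. The candidate list uses sample us-east-1 style pricing for illustration; a real selector would also factor in Spot availability and affinity constraints:

```python
def pick_instance(candidates, cpu_needed, mem_needed_gib):
    """Return the cheapest instance type that satisfies the workload's
    CPU and memory requirements, or None if nothing fits."""
    fitting = [c for c in candidates
               if c["vcpu"] >= cpu_needed and c["mem_gib"] >= mem_needed_gib]
    return min(fitting, key=lambda c: c["hourly_usd"]) if fitting else None

candidates = [
    {"name": "m5.large",  "vcpu": 2, "mem_gib": 8,  "hourly_usd": 0.096},
    {"name": "c5.xlarge", "vcpu": 4, "mem_gib": 8,  "hourly_usd": 0.170},
    {"name": "m5.xlarge", "vcpu": 4, "mem_gib": 16, "hourly_usd": 0.192},
]
choice = pick_instance(candidates, cpu_needed=4, mem_needed_gib=8)
```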
Configuring Up-scaling
Basic Configuration
To set up basic up-scaling parameters:
- Navigate to your cluster in the Stackbooster.io dashboard
- Select "Scaling Configuration" > "Up-scaling"
- Configure the following settings:
  - Headroom Percentage: Buffer capacity to maintain (default: 15%)
  - Max Scale-Up Rate: Maximum nodes to add in a single scaling action
  - Predictive Scaling: Enable/disable predictive scaling
  - Response Sensitivity: How quickly to respond to increased resource demands
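To see how the Headroom Percentage setting translates into capacity, here is a back-of-the-envelope node count calculation. The formula is an illustrative interpretation of the setting, not Stackbooster.io's exact computation:

```python
import math

def nodes_required(demand_cores, node_cores, headroom_pct):
    """Nodes needed to serve current demand plus a headroom buffer:
    target capacity = demand * (1 + headroom%), rounded up to whole nodes."""
    target_capacity = demand_cores * (1 + headroom_pct / 100)
    return math.ceil(target_capacity / node_cores)

# 40 cores of demand on 8-core nodes with the default 15% headroom
n = nodes_required(demand_cores=40, node_cores=8, headroom_pct=15)
```

Raising the headroom from 15% to 25% in this example would push the target from 46 to 50 cores, i.e. one extra node held in reserve.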
Advanced Settings
For more granular control, you can configure:
Workload-Specific Settings
Configure different scaling parameters for various workloads:
- Navigate to "Workload Settings"
- Select a workload or namespace
- Configure custom scaling parameters:
  - Priority level for resource allocation
  - Minimum guaranteed resources
  - Maximum scale factor
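As an illustration of how a per-workload maximum scale factor might apply, the sketch below caps a requested scale-out at a multiple of current size. The semantics shown are an assumption for illustration, not Stackbooster.io's documented behavior:

```python
def cap_replicas(current, desired, max_scale_factor):
    """Apply a workload's maximum scale factor: never grow beyond
    current * factor in a single scaling action."""
    return min(desired, current * max_scale_factor)

# A workload at 4 replicas asks for 20, but its scale factor is 3
capped = cap_replicas(current=4, desired=20, max_scale_factor=3)
```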
Time-Based Rules
Create rules that modify scaling behavior based on time:
- Go to "Scheduling Rules"
- Create a new rule with:
  - Time window (e.g., business hours, weekends)
  - Custom headroom settings for the defined period
  - Special handling for known high-traffic events
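The core of a time-based rule is matching the current time against a window and swapping in that window's headroom. The rule shape below is illustrative, not the dashboard's exact schema:

```python
from datetime import time

def headroom_for(now, rules, default_pct=15):
    """Return the headroom percentage of the first rule whose time
    window contains `now`, falling back to the cluster default."""
    for start, end, pct in rules:
        if start <= now < end:
            return pct
    return default_pct

# Business hours get extra headroom; nights use the default
rules = [(time(8, 0), time(18, 0), 25)]
midday_pct = headroom_for(time(12, 30), rules)
night_pct = headroom_for(time(22, 0), rules)
```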
Custom Metrics
Integrate application-specific metrics for scaling decisions:
- Navigate to "Custom Metrics"
- Configure your Prometheus or CloudWatch metrics
- Set thresholds and scaling factors based on these metrics
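A sketch of how a custom metric can drive a scaling factor: below its threshold the metric has no effect, and above it each unit of excess adds to a capped capacity multiplier. The metric, formula, and parameters are illustrative assumptions:

```python
def scale_factor_from_metric(value, threshold, factor_per_unit, cap=2.0):
    """Map a custom application metric (e.g. queue depth) above its
    threshold to a capacity multiplier, capped to avoid runaway scaling."""
    if value <= threshold:
        return 1.0
    return min(cap, 1.0 + (value - threshold) * factor_per_unit)

# Queue depth of 150 against a threshold of 100
f = scale_factor_from_metric(value=150, threshold=100, factor_per_unit=0.01)
```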
Up-scaling Strategies
Stackbooster.io offers several up-scaling strategies; choose one based on your workload's cost and performance priorities:
Balanced (Default)
- Maintains moderate headroom to handle normal traffic fluctuations
- Scales gradually to avoid over-provisioning
- Balances cost and performance considerations
Performance-Focused
- Maintains higher headroom for rapid response to traffic spikes
- Scales more aggressively when demand increases
- Prioritizes application performance over cost optimization
Cost-Focused
- Maintains minimal headroom to maximize resource efficiency
- Scales more conservatively, potentially accepting some scheduling delays
- Prioritizes cost savings over immediate scaling
Custom
- Define your own parameters for headroom, scaling speed, and instance selection
- Create different strategies for different environments or workloads
- Implement special handling for specific use cases
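In practice, the strategies above amount to different parameter presets. A hypothetical mapping (the numbers are illustrative, not Stackbooster.io's actual defaults):

```python
# Illustrative strategy presets: higher headroom and faster scale-out for
# performance, leaner settings for cost. Values are assumptions.
STRATEGIES = {
    "balanced":    {"headroom_pct": 15, "max_nodes_per_action": 3},
    "performance": {"headroom_pct": 30, "max_nodes_per_action": 6},
    "cost":        {"headroom_pct": 5,  "max_nodes_per_action": 1},
}

preset = STRATEGIES["performance"]
```

A Custom strategy is simply the same set of knobs with values you choose yourself, possibly differing per environment or workload.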
Best Practices
Determining Optimal Headroom
The ideal headroom percentage depends on your workload characteristics:
- Stable, predictable traffic: 10-15% headroom is typically sufficient
- Variable traffic with occasional spikes: 20-25% provides better responsiveness
- Highly unpredictable workloads: 30%+ may be necessary to prevent performance issues
Handling Planned Traffic Increases
For known traffic events (marketing campaigns, product launches, etc.):
- Create an "Event Schedule" in Stackbooster.io
- Define the expected traffic increase and duration
- The platform will automatically adjust scaling parameters during the event
Monitoring Up-scaling Performance
To ensure your up-scaling is working effectively:
- Monitor the "Scaling Performance" dashboard
- Pay attention to metrics such as:
  - Average time to scale up when needed
  - Resource utilization after scaling events
  - Application performance during scaling transitions
- Adjust your configuration based on observed patterns
Troubleshooting
Common Up-scaling Issues
Slow Response to Traffic Spikes
If your cluster isn't scaling up quickly enough:
- Increase the headroom percentage
- Check for AWS service quotas limiting instance launches
- Verify the instance types selected are readily available in your region
- Consider enabling predictive scaling if not already active
Excessive Resources After Scaling
If scaling up results in too much unused capacity:
- Reduce the headroom percentage
- Decrease the maximum scale-up rate
- Review instance selection to choose more appropriate sizes
- Consider more granular workload-specific scaling settings
Failed Node Launches
If nodes fail to launch during scaling:
- Check AWS service health in your region
- Verify IAM permissions for the node instance profile
- Review node launch templates for configuration errors
- Ensure you haven't reached AWS account limits
Advanced Topics
Combining with Horizontal Pod Autoscaling (HPA)
Stackbooster.io works in harmony with Kubernetes HPA:
- HPA adjusts pod counts based on metrics
- Our platform ensures nodes are available to accommodate those pods
- Configure both systems with compatible thresholds for optimal results
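For reference, the replica count HPA requests follows the formula documented upstream by Kubernetes: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The node-level autoscaler must then provide capacity for the extra pods. A worked example:

```python
import math

def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """Kubernetes HPA's core formula:
    desired = ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# 5 pods averaging 90% CPU against a 60% target: HPA asks for 8 pods,
# so the cluster needs node capacity for 3 additional pods.
desired = hpa_desired_replicas(current_replicas=5, current_metric=90,
                               target_metric=60)
```

This is why compatible thresholds matter: if HPA's target utilization is far below the node autoscaler's headroom assumptions, pod counts and node capacity can oscillate against each other.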
Custom Resource-Based Scaling
For specialized workloads with unique resource requirements:
- Define custom resource metrics in the dashboard
- Set thresholds for these metrics
- Configure scaling actions based on these thresholds
Multi-Cluster Scaling Coordination
If you operate multiple clusters with interdependencies:
- Use the "Cluster Groups" feature to define relationships
- Configure coordinated scaling policies
- Balance resource allocation across the entire application landscape
By implementing these up-scaling strategies, your Kubernetes clusters will be able to handle increasing demand efficiently while maintaining optimal resource utilization and cost management.
