Last modified August 29, 2023

Overview of spot instances on AWS

Platform support
AWS
  • GA in v11.2.0

Introduction

As of workload cluster release v11.2.0 for AWS, you can use spot instances in your clusters to reduce cost.

Spot instances differ from on-demand instances in two main ways: AWS can terminate them at any time, and they are more frequently unavailable.

The hourly price for a spot instance is determined by AWS through a bidding system. The resulting price varies over time and is usually much lower than the cost of the same instance type when booked as an on-demand instance. To maximize the likelihood of getting a spot instance when needed, the Giant Swarm configuration bids up to, but never above, the price of an on-demand instance of the same type.

There are two parameters on the node pool level that will allow you to configure which instances are going to be used:

  • On-demand base capacity: controls how much of the initial capacity is made up of on-demand instances. Note that this capacity is static and does not automatically replace any unavailable spot instances.

  • Spot instance percentage above base capacity: controls the percentage of spot instances among the worker nodes that exceed the on-demand base capacity.

Notes on using spot instances

Since the availability of spot instances is volatile, there are a few things you can consider:

  • The more availability zones you cover with your node pool, the higher the likelihood that spot instances are available when required.
  • Activating the use of similar instance types also increases the likelihood of getting spot instances when using common instance types. Read more about this below.
  • When no spot instances are available, missing spot instances are not replaced by on-demand instances. The affected node pool will instead have fewer nodes than desired, probably leaving some pods unscheduled. For a solution to this, check our guide on using on-demand instances as fall-back when spot instances are unavailable.

Examples

The following table shows four examples to illustrate how different settings of spot instance percentage and on-demand base capacity influence the outcome.

On-demand base capacity | Spot instance percentage | Total instances | On-demand instances | Spot instances
0                       | 0 %                      | 50              | 50                  | 0
0                       | 100 %                    | 50              | 0                   | 50
10                      | 50 %                     | 50              | 30                  | 20
10                      | 100 %                    | 50              | 10                  | 40
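
The arithmetic behind the table can be sketched in a few lines of Python. This is an illustration only, not Giant Swarm or AWS code: the function name `split_instances` and its parameters are hypothetical, and rounding fractional results down in favor of on-demand instances is an assumption (the table rows all divide evenly, so they do not depend on it).

```python
def split_instances(total, on_demand_base, spot_percentage):
    """Return (on_demand, spot) counts for a node pool with `total` nodes.

    Hypothetical helper for illustration. Assumption: a fractional spot
    count is rounded down, so the remainder goes to on-demand instances.
    """
    if total < 0 or on_demand_base < 0:
        raise ValueError("total and on_demand_base must not be negative")
    if not 0 <= spot_percentage <= 100:
        raise ValueError("spot_percentage must be between 0 and 100")

    # The base capacity is always on-demand; it cannot exceed the pool size.
    base = min(on_demand_base, total)
    # Only the nodes above the base are split by the spot percentage.
    above_base = total - base
    spot = above_base * spot_percentage // 100
    on_demand = total - spot
    return on_demand, spot


# Reproduce the four rows of the table (total of 50 instances each).
for base, percentage in [(0, 0), (0, 100), (10, 50), (10, 100)]:
    on_demand, spot = split_instances(50, base, percentage)
    print(f"base={base} spot%={percentage} on-demand={on_demand} spot={spot}")
```

Note that this computes the desired split only. As described above, spot instances that are unavailable are not replaced by on-demand instances, so the actual node count can be lower.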