A deep dive into WaiterOverrideConfiguration
✦ 2025-10-18
This post is 4ᵗʰ in a series about AWS SDK Waiters. You definitely don’t need to have read all the others before reading this one, though “Use AWS SDK Waiters” perhaps deserves a little skim to get up to speed.
💃 This post is dynamic 💃
Each configuration value uses a simulator and timeline for visualizations.
Many of the retry behaviours depend on random values for things like jitter.
Reload this page to see how randomisation affects them.
🎮 This post includes a simulator 🎮
Scroll to the bottom of this page to use the complete simulator.
It lets you configure every parameter in WaiterOverrideConfiguration and see how it affects the retry behaviour.
Introduction
WaiterOverrideConfiguration (docs, archive) is how you configure the timeout and retry behaviour for AWS SDK Waiters on the JVM.
Each AWS service defines its own default WaiterOverrideConfiguration values.
These are configured per waiter method.
That is, the waiter for CloudFront’s DistributionDeployed is going to look quite different to the waiter for ECS’s TasksRunning.
For example, I dug into the EC2 waitUntilInstanceRunning method and pulled out the following WaiterOverrideConfiguration:
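import kotlin.time.Duration.Companion.seconds
import kotlin.time.toJavaDuration
import software.amazon.awssdk.core.waiters.WaiterOverrideConfiguration
import software.amazon.awssdk.retries.api.BackoffStrategy

WaiterOverrideConfiguration.builder()
    .maxAttempts(40)
    .waitTimeout(null) // no overall timeout by default
    .backoffStrategyV2(
        BackoffStrategy.fixedDelayWithoutJitter(
            /* delay */ 15.seconds.toJavaDuration()
        )
    )
    .build()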
In the rest of this post, I dig in to what each of those configuration values means and how best to pick between them.
Just before I start, I’m going to show the behaviour of waiters using different configuration values. It helps to have a certain resource in mind while going through these examples. I won’t use any resource in particular, but maybe you could think of:
- Waiting for an SSM Command Invocation to complete
- Waiting for a CloudWatch Insights Query to complete, or
- Waiting for a Kinesis Stream to be created.
In the following examples the resource is, by default, set to reach its desired state after 10 seconds. If the waiter hasn’t timed out or reached its maximum attempt threshold by then, it will transition to success.
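To make that setup concrete, here is a minimal sketch of the kind of loop the examples model. It is not the SDK's implementation: `simulateWaiter`, `WaiterRun` and the `delayFor` callback are hypothetical names, API call latency is ignored, and the timeout check (give up when the next pause would overshoot `waitTimeout`) is an assumption about how the limits interact.

```kotlin
import kotlin.time.Duration
import kotlin.time.Duration.Companion.seconds

/** Outcome of one simulated waiter run (hypothetical type, not part of the SDK). */
data class WaiterRun(val succeeded: Boolean, val checkTimes: List<Duration>)

/**
 * Simulates a waiter polling a resource that reaches its desired state at [readyAfter].
 * [delayFor] stands in for a backoff strategy: it returns the pause after a given attempt.
 */
fun simulateWaiter(
    maxAttempts: Int,
    waitTimeout: Duration?,
    readyAfter: Duration = 10.seconds,
    delayFor: (attempt: Int) -> Duration,
): WaiterRun {
    val checkTimes = mutableListOf<Duration>()
    var now = Duration.ZERO
    for (attempt in 1..maxAttempts) {
        checkTimes += now
        // The resource is in the desired state: this check completes the waiter.
        if (now >= readyAfter) return WaiterRun(true, checkTimes)
        val next = now + delayFor(attempt)
        // Assumption: stop as soon as the next pause would run past the timeout.
        if (waitTimeout != null && next > waitTimeout) break
        now = next
    }
    // Either the attempts or the timeout ran out first.
    return WaiterRun(false, checkTimes)
}
```

With a 3-second pause, no timeout and 40 attempts, this sketch checks at 0, 3, 6, 9 and 12 seconds and succeeds on the fifth check. Drop maxAttempts to 2 and it gives up after the checks at 0 and 3 seconds.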
maxAttempts
The maxAttempts value sets an upper bound on the number of times the waiter will check the state of the resource.
In the example below, I’ve set maxAttempts low enough that the waiter will never finish.
[Interactive simulator and timeline: maxAttempts]
After two checks of the resource, the waiter gives up.
With the rest of the configuration parameters available for a waiter, I don’t actually think it’s necessary to set an upper bound on attempts.
Doing so means you have to do some complicated maths (or use my simulator below) to figure out the right upper value to pick.
Instead, trusting waitTimeout and picking a sensible delay between API calls should be enough.
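For a sense of the maths involved, here is a small sketch for the simplest case, a fixed delay with no jitter. Both helpers are hypothetical and assume the API calls themselves take no time, so treat the results as rough estimates.

```kotlin
import kotlin.math.ceil
import kotlin.time.Duration
import kotlin.time.Duration.Companion.seconds

/** Longest time a waiter spends pausing: one pause between each pair of consecutive attempts. */
fun totalWaitFor(maxAttempts: Int, fixedDelay: Duration): Duration {
    require(maxAttempts > 0) { "maxAttempts must be a positive integer" }
    return fixedDelay * (maxAttempts - 1)
}

/** Smallest maxAttempts whose final check lands at or after [target]. */
fun attemptsToCover(target: Duration, fixedDelay: Duration): Int {
    require(fixedDelay > Duration.ZERO) { "fixedDelay must be positive" }
    // ceil(target / delay) pauses are needed, plus the very first check at time zero.
    return ceil(target / fixedDelay).toInt() + 1
}

// The EC2 values quoted earlier: 40 attempts, 15 seconds apart.
val ec2Budget: Duration = totalWaitFor(maxAttempts = 40, fixedDelay = 15.seconds)
```

Here `ec2Budget` comes out at 585 seconds (9 minutes 45 seconds), and `attemptsToCover(10.seconds, 3.seconds)` returns 5. Once jitter or exponential growth is involved, the sum no longer has such a tidy closed form, which is where a timeout becomes the easier knob.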
waitTimeout
The waitTimeout value configures how long you’re willing to wait for the resource to transition into the desired state.
This is probably the most important value to configure in your waiter.
Some resources, like the SSM Command Invocations in my previous post, should have a timeout value that depends on what the command is doing. If your command includes a 5-minute sleep then your timeout should probably be configured for something a little bit more than 5 minutes. SSM has some delay in delivering the command to the target instances, so you’ll need to factor that in too.
Other resources, like an EC2 instance going into running, mostly depend on the AWS service and so can be generically defined. It’s probably fine to leave it up to whatever default AWS picked.
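As a sketch of the SSM case, the timeout could be built from the command's expected runtime plus an allowance for delivery. The 30-second allowance is a made-up number for illustration, not a measured SSM figure, and the value names are hypothetical.

```kotlin
import java.time.Duration
import software.amazon.awssdk.core.waiters.WaiterOverrideConfiguration

// Hypothetical numbers: a command that sleeps for 5 minutes, plus a guessed
// 30-second allowance for SSM to deliver the command to the instance.
val commandRuntime: Duration = Duration.ofMinutes(5)
val deliveryAllowance: Duration = Duration.ofSeconds(30)

// Only the timeout is overridden here; nothing else is set on the builder.
val overrides: WaiterOverrideConfiguration = WaiterOverrideConfiguration.builder()
    .waitTimeout(commandRuntime.plus(deliveryAllowance))
    .build()
```

One thing to watch: if the waiter's default maxAttempts is low, it could still end the waiter before this timeout is reached, so it may need raising as well.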
[Interactive simulator and timeline: waitTimeout]
backoffStrategyV2
retryImmediately
The simplest backoff strategy is retryImmediately.
It’s the for loop of backoff strategies.
The waiter iterates, checking the state of the resource, over and over again, until, at some point, the resource reaches the desired state. The next check completes the waiter.
In the following example, I’ve shrunk the resource state change delay to 1 second. This keeps it within bounds for the simulator.
[Interactive simulator and timeline: retryImmediately]
fixedDelayWithoutJitter
The first real “strategy” is fixedDelayWithoutJitter.
This does exactly what you’d think.
It waits for the specified delay between checks of the resource.
[Interactive simulator and timeline: fixedDelayWithoutJitter]
This seems to be the default backoff strategy in the SDKs. At least, I’ve yet to see one using something else.
fixedDelay
Fixed delay has a little secret. It includes jitter!
With jitter, fixed delay waits a random amount of time, somewhere between 0 seconds and the delay amount, before re-checking the resource.
This means that you always get at least the same number of checks of the resource as you would with fixedDelayWithoutJitter.
Often you get more.
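A minimal sketch of why the check count can only go up. It assumes the jitter is a uniformly random pause between zero and the configured delay, which may not match the SDK's implementation exactly; `jitteredFixedDelay` and `checkTimes` are hypothetical helpers.

```kotlin
import kotlin.random.Random
import kotlin.time.Duration
import kotlin.time.Duration.Companion.milliseconds

/** Picks a uniformly random pause between zero and [delay] inclusive (assumed jitter model). */
fun jitteredFixedDelay(delay: Duration, random: Random = Random.Default): Duration =
    random.nextLong(delay.inWholeMilliseconds + 1).milliseconds

/**
 * Times at which the resource is checked, up to and including the first check
 * at or after [readyAfter]. [maxAttempts] only guards against an endless loop.
 */
fun checkTimes(
    readyAfter: Duration,
    maxAttempts: Int = 1_000,
    nextDelay: () -> Duration,
): List<Duration> {
    val times = mutableListOf(Duration.ZERO)
    while (times.last() < readyAfter && times.size < maxAttempts) {
        times += times.last() + nextDelay()
    }
    return times
}
```

With a 3-second delay and a resource that is ready after 10 seconds, the jitter-free version always produces 5 checks. Every jittered pause is at most 3 seconds, so the jittered version needs at least as many pauses to cover the same 10 seconds.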
[Interactive simulator and timeline: fixedDelay]
exponentialDelayWithoutJitter
Exponential delay, unlike fixed delay, introduces more and more wait time between checks of the resource. The algorithm looks like this:
delay = min(baseDelay × 2 ^ (attempt - 1), maxDelay)
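Written out as code, the formula might look like the sketch below. The function name is hypothetical and the exponent clamp is my own guard against overflow, not something taken from the SDK.

```kotlin
import kotlin.time.Duration
import kotlin.time.Duration.Companion.seconds

/** Pause after the given 1-based attempt: baseDelay × 2^(attempt − 1), capped at maxDelay. */
fun exponentialDelayWithoutJitter(attempt: Int, baseDelay: Duration, maxDelay: Duration): Duration {
    require(attempt >= 1) { "attempt is 1-based" }
    // Clamp the exponent so the shift stays within Int range; the cap applies long before that.
    val factor = 1 shl (attempt - 1).coerceAtMost(30)
    return minOf(baseDelay * factor, maxDelay)
}

// First six pauses with a 1-second base and a 20-second cap.
val firstPauses: List<Duration> =
    (1..6).map { exponentialDelayWithoutJitter(it, baseDelay = 1.seconds, maxDelay = 20.seconds) }
```

With those values, `firstPauses` is 1, 2, 4, 8, 16 and 20 seconds: the delay doubles each time until the cap takes over on the sixth attempt.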
With a large enough maxDelay, and a resource that transitions state towards the second half of the waitTimeout, you can expect substantial periods of waiting.
In the following example, the last wait took 8 seconds, which is more than half of the entire waiter period.
[Interactive simulator and timeline: exponentialDelayWithoutJitter]
Since the algorithm caps the computed delay at maxDelay, you can actually reproduce the fixedDelayWithoutJitter output by setting maxDelay equal to baseDelay:
[Interactive simulator and timeline: exponentialDelayWithoutJitter with maxDelay equal to baseDelay]
exponentialDelay
exponentialDelay uses full jitter: it combines the jitter seen in fixedDelay with an exponential delay, so each wait lands somewhere between 0 seconds and the computed exponential delay.
[Interactive simulator and timeline: exponentialDelay]
exponentialDelayHalfJitter
And finally, half jitter gives you a delay somewhere between half the computed value and the full computed value. This gives you something closer to the behaviour of not having jitter while still keeping the safety that jitter provides.
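The difference between the two jittered variants comes down to the range the random pause is drawn from. The sketch below assumes uniform sampling, takes the already-computed exponential delay as input, and uses hypothetical function names; the SDK's own sampling may differ in detail.

```kotlin
import kotlin.random.Random
import kotlin.time.Duration
import kotlin.time.Duration.Companion.milliseconds

/** Full jitter: anywhere from zero up to the computed delay. */
fun fullJitter(computed: Duration, random: Random = Random.Default): Duration =
    random.nextLong(computed.inWholeMilliseconds + 1).milliseconds

/** Half jitter: anywhere from half the computed delay up to the computed delay. */
fun halfJitter(computed: Duration, random: Random = Random.Default): Duration {
    val half = computed / 2
    // Keep the first half fixed and randomise only the remainder.
    return half + fullJitter(computed - half, random)
}
```

For a computed delay of 8 seconds, `fullJitter` returns anything from 0 to 8 seconds, while `halfJitter` stays between 4 and 8 seconds.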
[Interactive simulator and timeline: exponentialDelayHalfJitter]
Simulator
Play around with the simulator by tweaking any of its values below.
[Interactive simulator: configure every WaiterOverrideConfiguration parameter, which state the resource transitions to once it changes state, and how long it takes to reach that state]