How I built that DDB TTL graph
✦ 2024-12-05
In my last post I wrote about what latency to expect when using DDB’s TTL feature. That post features a live graph showing the measured latency. I’ll post it here again:
It made the rounds on Reddit (archive) and had some good discussion. I thought I’d use this post to talk about how I built it.
The broad idea was to build a system that:
- periodically inserts an item into a DynamoDB table,
- watches for deletions by TTL, and
- compares the TTL timestamp with the actual removal time.
The difference between those two timestamps is more or less DDB’s TTL latency.
I built the app using CDK (archive). In the following sections I’ll go through each part of the system, give the CDK code I use to build it, the business logic code (if any), and a peek at what the data looks like. Find a link to a GitHub repo with the complete CDK code right at the end of this post.
But first, here’s a look at the whole system:
A: EventBridge Scheduler
I wanted a constant source of work to trigger inserts into the DynamoDB table. EventBridge is a great fit here and can trigger minutely. That’s the fastest it can trigger (as of 2024-12-06), but that’s good enough for me.
const putItemRule = new Rule(this, "EventBridgeRule", {
schedule: Schedule.rate(Duration.minutes(1)),
});
See: aws-cdk-lib.aws_events module · AWS CDK
1: EventBridgeEvent
I don’t care about the contents of this event. Just that the event fires mostly on time. But here’s an example in case you find it useful:
{
"version": "0",
"id": "f76ab64d-ad69-d13c-3cff-b170e13869e0",
"detail-type": "Scheduled Event",
"source": "aws.events",
"account": "364265685121",
"time": "2024-12-05T21:27:00Z",
"region": "us-east-1",
"resources": [
"arn:aws:events:us-east-1:364265685121:rule/DdbTtlSlaStack-EventBridgeRule15224D08-99OfrtxmsKLs"
],
"detail": {}
}
B: ItemPutter Lambda Function
I chose to use CDK’s `NodejsFunction` construct. I really like using TypeScript in CDK and then TypeScript again in my Lambda functions. It keeps everything neatly in a single repository and allows me to share type information. `NodejsFunction` nicely encapsulates building a TS-based Lambda function. It bundles using esbuild (archive) and handles basically everything for you.
const itemPutter = new NodejsFunction(this, "ItemPutterFunction", {
entry: join(__dirname, "put-item.function.ts"),
handler: "handler",
});
putItemRule.addTarget(new LambdaFunction(itemPutter));
See: aws-cdk-lib.aws_lambda_nodejs module · AWS CDK
2: PutItem Request
This Lambda function just needs to get data into the DynamoDB table.
It is probably a good time to talk about the table schema. It’s dead simple: a partition key and a sort key, named `pk` and `sk` respectively, plus an attribute called `ttl`. The current time is placed into all three. The `pk` and `sk` get it in ISO 8601 format, and `ttl` gets it in seconds since the epoch.
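As a quick sketch, here’s the item shape expressed as a TypeScript type. The type name and comments are mine, not something from the actual codebase:

// Hypothetical type describing one probe item in the table.
type TtlProbeItem = {
  pk: string; // insert time, ISO 8601, e.g. "2024-12-04T19:48:28.281Z"
  sk: string; // the same value as pk
  ttl: number; // the same instant, as whole seconds since the epoch
};

And here’s the function that writes it: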
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";
import { type EventBridgeEvent } from "aws-lambda";
const ddbClient = new DynamoDBClient({
region: process.env.AWS_REGION,
});
export const handler = async (
event: EventBridgeEvent<any, any>
): Promise<void> => {
console.log("Processing EventBrideEvent:", JSON.stringify(event, null, 2));
const now = new Date();
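// DynamoDB's TTL attribute is interpreted as whole seconds since the epoch.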
const ttl = Math.floor(now.getTime() / 1000);
await ddbClient.send(
new PutItemCommand({
TableName: process.env.TABLE,
Item: {
pk: { S: now.toISOString() },
sk: { S: now.toISOString() },
ttl: { N: ttl.toString() },
},
})
);
};
C: DynamoDB Table
You can probably guess at the configuration of the DynamoDB Table by now.
Of course, I have to configure the `timeToLiveAttribute`, since that’s the point of this whole exercise. The stream is configured here, too.
const table = new Table(this, "Table", {
billingMode: BillingMode.PAY_PER_REQUEST,
partitionKey: {
name: "pk",
type: AttributeType.STRING,
},
sortKey: {
name: "sk",
type: AttributeType.STRING,
},
timeToLiveAttribute: "ttl",
stream: StreamViewType.NEW_AND_OLD_IMAGES,
});
table.grantWriteData(itemPutter);
itemPutter.addEnvironment("TABLE", table.tableName);
Note the `grantWriteData`. Most of CDK’s L2 constructs have these `grant*` methods. They do a good job of making it easy to grant additional permissions while still keeping the policies least-privilege.
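To give a feel for what that grant turns into, the synthesized policy looks roughly like the following. This is an abbreviated sketch; the exact action list comes from CDK and may differ between versions:

{
  "Effect": "Allow",
  "Action": [
    "dynamodb:BatchWriteItem",
    "dynamodb:PutItem",
    "dynamodb:UpdateItem",
    "dynamodb:DeleteItem"
  ],
  "Resource": "<table ARN>"
}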
And here’s what the Lambda function puts into the table:
[
{
"pk": "2024-12-04T19:48:28.281Z",
"sk": "2024-12-04T19:48:28.281Z",
"ttl": 1733341708
},
{
"pk": "2024-12-04T19:45:28.319Z",
"sk": "2024-12-04T19:45:28.319Z",
"ttl": 1733341528
},
{
"pk": "2024-12-04T19:46:28.370Z",
"sk": "2024-12-04T19:46:28.370Z",
"ttl": 1733341588
}
]
3: DynamoDB Stream Event
As DynamoDB’s TTL feature deletes items, it places the removal events onto the stream. Here’s a sample event:
{
"Records": [
{
"eventID": "d245400f09fe95354fca023ac597d736",
"eventName": "REMOVE",
"eventVersion": "1.1",
"eventSource": "aws:dynamodb",
"awsRegion": "us-east-1",
"dynamodb": {
"ApproximateCreationDateTime": 1732815376,
"Keys": {
"sk": {
"S": "2024-11-28T17:27:37.645631Z"
},
"pk": {
"S": "2024-11-28T17:27:37.645631Z"
}
},
"OldImage": {
"sk": {
"S": "2024-11-28T17:27:37.645631Z"
},
"pk": {
"S": "2024-11-28T17:27:37.645631Z"
},
"ttl": {
"N": "1732814857"
}
},
"SequenceNumber": "3832500000000056612526359",
"SizeBytes": 125,
"StreamViewType": "NEW_AND_OLD_IMAGES"
},
"userIdentity": {
"principalId": "dynamodb.amazonaws.com",
"type": "Service"
},
"eventSourceARN": "arn:aws:dynamodb:us-east-1:750010179392:table/DdbTtlSlaStack-DDBTTL0622B9A2-X0AOESJQCXKS/stream/2024-11-27T21:35:33.881"
}
]
}
See: Tutorial #2: Using filters to process some events with DynamoDB and Lambda - Amazon DynamoDB
D: StreamProcessor Lambda Function
Like the ItemPutter above, I used `NodejsFunction` to capture the events from the stream.
const streamProcessor = new NodejsFunction(this, "StreamProcessorFunction", {
entry: join(__dirname, "process-stream.function.ts"),
handler: "handler",
});
table.grantStreamRead(streamProcessor);
streamProcessor.addEventSource(
new DynamoEventSource(table, {
startingPosition: StartingPosition.LATEST,
batchSize: 1,
retryAttempts: 1,
filters: [
FilterCriteria.filter({ eventName: FilterRule.isEqual("REMOVE") }),
],
})
);
streamProcessor.role!.addToPrincipalPolicy(
new PolicyStatement({
effect: Effect.ALLOW,
actions: ["cloudwatch:GetMetricWidgetImage"],
resources: ["*"],
})
);
Also note that I’ve configured the DDB stream event source to filter to just `REMOVE` events.
Then I needed to do a little setup in the Lambda function.
import { type DynamoDBStreamHandler } from "aws-lambda";
import { metricScope, Unit, StorageResolution } from "aws-embedded-metrics";
import { z } from "zod";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import {
CloudWatchClient,
GetMetricWidgetImageCommand,
} from "@aws-sdk/client-cloudwatch";
// Used further down to pull the account ID out of the invoked function's ARN.
import { parse as parseArn } from "@aws-sdk/util-arn-parser";
const s3Client = new S3Client({
region: process.env.AWS_REGION,
});
const cloudwatchClient = new CloudWatchClient({
region: process.env.AWS_REGION,
});
export const handler: DynamoDBStreamHandler = metricScope(
(metrics) => async (event, context) => {
console.log(
"Processing DynamoDB Stream Event:",
JSON.stringify(event, null, 2),
JSON.stringify(context, null, 2)
);
const parsedEvent = z
.object({
Records: z
.array(
z.object({
eventName: z.enum(["REMOVE"]),
dynamodb: z.object({
Keys: z.object({
pk: z.object({
S: z.string(),
}),
}),
}),
})
)
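// batchSize is 1 on the event source, so there is exactly one record per invocation.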
.length(1),
})
.parse(event);
const record = parsedEvent.Records[0]!;
// The rest goes here
}
);
Importantly, I’ve used Zod to make sure the event I’m handling looks the way I expect it to. Read more about how you shouldn’t trust runtime input in TypeScript in my post How I use io-ts to guarantee runtime type safety in my TypeScript.
You may also notice that my function is wrapped in a `metricScope` function call. That makes emitting metrics from Lambda a breeze. More on that later.
See: Zod Documentation
4: PutMetric by EMF
Emitting metrics is fairly simple in Lambda these days. The CloudWatch Embedded Metric Format (EMF) makes it as easy as logging.
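// The item's pk is both its insert time and its TTL timestamp, so now minus pk is the TTL latency.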
const differenceFromNow =
new Date().getTime() - new Date(record.dynamodb.Keys.pk.S).getTime();
metrics.putMetric(
"ttl-latency",
differenceFromNow,
Unit.Milliseconds,
StorageResolution.Standard
);
The `metrics` variable there is the same one vended by the `metricScope` wrapper above.
See: awslabs/aws-embedded-metrics-node: Amazon CloudWatch Embedded Metric Format Client Library
E: CloudWatch Metrics
There’s not much to say here. Emitting the metrics from the function handles all of the “resource creation” on the CloudWatch side. So there’s nothing else I needed to set up.
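For the curious, this works because the metric definition travels inside the log line itself. Each `putMetric` call ends up logging a JSON blob shaped like the one below (the values are illustrative; the namespace and dimensions are the library’s defaults, which is exactly what the graph definition in the next section queries):

{
  "_aws": {
    "Timestamp": 1733433620000,
    "CloudWatchMetrics": [
      {
        "Namespace": "aws-embedded-metrics",
        "Dimensions": [["LogGroup", "ServiceName", "ServiceType"]],
        "Metrics": [{ "Name": "ttl-latency", "Unit": "Milliseconds" }]
      }
    ]
  },
  "LogGroup": "<log group name>",
  "ServiceName": "<function name>",
  "ServiceType": "AWS::Lambda::Function",
  "ttl-latency": 64000
}

CloudWatch extracts the metric from the log group automatically, which is why no metric resources appear anywhere in the CDK code.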
5: GetMetricWidgetImage
const accountId = parseArn(context.invokedFunctionArn).accountId;
const getLatencyMetricWidgetImageOutput = await cloudwatchClient.send(
new GetMetricWidgetImageCommand({
MetricWidget: JSON.stringify({
metrics: [
[
{
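// m1 arrives in milliseconds; this expression converts it to minutes for the y-axis.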
expression: "(m1 / 1000) / 60",
label: "Time elapsed between TTL and removal",
id: "e1",
region: process.env.AWS_REGION,
accountId,
},
],
[
"aws-embedded-metrics",
"ttl-latency",
"LogGroup",
context.functionName,
"ServiceName",
context.functionName,
"ServiceType",
"AWS::Lambda::Function",
{
id: "m1",
visible: false,
period: 300,
region: process.env.AWS_REGION,
accountId,
},
],
],
sparkline: false,
view: "timeSeries",
stacked: false,
region: process.env.AWS_REGION,
stat: "Maximum",
period: 60,
start: "-PT24H",
yAxis: {
left: {
min: 0,
showUnits: false,
label: "Minutes",
},
},
liveData: false,
setPeriodToTimeRange: true,
title: "DynamoDB TTL Latency",
width: 768,
height: 384,
theme: "dark",
}),
OutputFormat: "png",
})
);
These can be a bit hairy to construct. I prefer manually designing the graphs in the CloudWatch console and then copying the source out into my codebase. I did that here and then replaced hardcoded account IDs, regions and function names.
F: S3 Bucket
const bucket = new Bucket(this, "Bucket", {
publicReadAccess: true,
blockPublicAccess: {
blockPublicPolicy: false,
blockPublicAcls: false,
ignorePublicAcls: false,
restrictPublicBuckets: false,
},
enforceSSL: true,
});
bucket.grantReadWrite(streamProcessor);
streamProcessor.addEnvironment("BUCKET", bucket.bucketName);
The bucket needs to allow public read access. That’s what makes the graph above viewable.
I also needed to pass the name of the bucket into the Lambda Function.
6: PutObject Request
await s3Client.send(
new PutObjectCommand({
Bucket: process.env.BUCKET,
Key: "ttl-latency.png",
Body: getLatencyMetricWidgetImageOutput.MetricWidgetImage,
ContentEncoding: "base64",
ContentType: "image/png",
})
);
It took me a little time to work out that I needed to set the content encoding to base64. But that made the upload stick.
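Once deployed, the graph is just a publicly readable object, so embedding it anywhere is a plain URL along these lines (the bucket name is a placeholder here, since CDK generates it):

https://<generated-bucket-name>.s3.us-east-1.amazonaws.com/ttl-latency.png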
Thanks for taking the time to go through that. If you’d like a more cohesive view, the complete codebase is available on GitHub at KieranHunt/ddb-ttl.