DEV Community

Joseph Hoppe 🇺🇦

AWS — Properly delete messages between Lambdas and SQS Queues

A lot of people may already know this, but I am posting it for anyone who needs clarity after sifting through the conflicting information that I found on the net.

Quick Summary (TLDR)

When a lambda is invoked through an SQS Event Trigger, the default behavior is that the SQS messages are deleted automatically. This only happens when certain conditions are met, chiefly that the lambda completes without throwing an uncaught error.

Prior to the release of SQS Event Triggers in 2018, the opposite was true: messages had to be deleted explicitly, or they would become visible again after the visibility timeout.

Back then, a common pattern was to poll an SQS queue for messages, possibly from a lambda running on a cron schedule. A received message would become invisible to other consumers, and once it was successfully processed, the lambda had to explicitly delete it from the queue.

Because of this paradigm shift in 2018, there is information on the internet stating that lambdas "automatically delete the message after processing", and also information that they "do need to manually delete SQS messages", without clearly specifying which setup is meant. The truth is that it depends on the lambda trigger.

SQS Events are the de facto trigger for many use cases. To reiterate, for this popular use case today, if specific conditions are met, the lambda will automatically delete the messages on completion.

The Backstory (2017 and earlier)

Way back when, developers had to manually wire up the code to poll an SQS queue. One approach would be to:

  1. Have a lambda execute on a cron schedule. Use the ReceiveMessage API to manually poll and retrieve messages.
  2. After processing the message, the lambda would then be responsible for explicitly deleting the message from the SQS queue using the DeleteMessage API.

If the message was not deleted and its visibility timeout elapsed, the message would become visible to consumers again.
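
As a rough sketch of that older pattern, the loop below receives messages, processes them, and deletes only the ones that succeed. The QueueClient interface, its method names, and pollOnce are hypothetical stand-ins for the ReceiveMessage and DeleteMessage API calls, not the real AWS SDK.

```typescript
// Hypothetical message shape; a real SQS message carries more fields.
interface QueueMessage {
  messageId: string;
  receiptHandle: string;
  body: string;
}

// Hypothetical minimal client standing in for ReceiveMessage / DeleteMessage.
interface QueueClient {
  receiveMessages(maxMessages: number): Promise<QueueMessage[]>;
  deleteMessage(receiptHandle: string): Promise<void>;
}

interface PollResult {
  deleted: string[];     // message IDs processed and explicitly deleted
  leftOnQueue: string[]; // message IDs that failed and were not deleted
}

// Poll once, process each message, and delete only the ones that succeed.
async function pollOnce(
  client: QueueClient,
  processMessage: (body: string) => Promise<void>
): Promise<PollResult> {
  const result: PollResult = { deleted: [], leftOnQueue: [] };
  const messages = await client.receiveMessages(10);
  for (const message of messages) {
    try {
      await processMessage(message.body);
      // Explicit delete: without it the message reappears after the visibility timeout.
      await client.deleteMessage(message.receiptHandle);
      result.deleted.push(message.messageId);
    } catch {
      // No delete call, so the message becomes visible again later.
      result.leftOnQueue.push(message.messageId);
    }
  }
  return result;
}
```

The point of the sketch is the delete call: forgetting it was the classic bug in this pattern, because every message would come back after the visibility timeout.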

SQS Event Triggers released in 2018

AWS greatly simplified this in 2018 by adding SQS as a new Lambda event source. The new process became:

  1. Create an Event Source Mapping (ESM) connecting the Lambda function to the SQS queue. Lambda then polls the queue on your behalf and invokes the function when messages arrive, passing them as records in the event, with each record carrying the SQS message body.
  2. The lambda does not need to explicitly delete the message from the queue.

No code is needed to explicitly poll the queue or delete the message, so the lambda can focus on the core processing logic.
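
A minimal sketch of such a handler is below. The SqsRecord and SqsEvent types are trimmed-down local stand-ins for the real event shape, and processOrder is a hypothetical piece of business logic invented for illustration.

```typescript
// Minimal local types covering only the event fields used here.
interface SqsRecord {
  messageId: string;
  body: string;
}
interface SqsEvent {
  Records: SqsRecord[];
}

// Hypothetical business logic; throws on input it cannot handle.
async function processOrder(body: string): Promise<void> {
  const order = JSON.parse(body); // throws on malformed JSON
  if (typeof order.id !== "string") {
    throw new Error("order is missing an id");
  }
}

// No polling and no delete call. Returning normally signals success,
// so the messages in the batch are deleted for us. An uncaught error
// leaves every message in the batch on the queue to be retried.
async function handler(event: SqsEvent): Promise<void> {
  for (const record of event.Records) {
    await processOrder(record.body);
  }
}
```

In a real function, handler would be exported as the Lambda entry point.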

More on SQS Event Triggers

If I do not need to explicitly delete the SQS messages, why are messages being routed to the Dead Letter Queue (DLQ)?

If the lambda throws an uncaught error, the message(s) in the batch are not deleted and will be retried after the message visibility timeout. Once a message has been received more than maxReceiveCount times, SQS routes it to the DLQ configured on the queue.

But how do I retry a subset of messages that came in a batch?

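First, ReportBatchItemFailures must be enabled on the lambda trigger itself (it is set in the FunctionResponseTypes list of the event source mapping).

If a message should not be deleted, the lambda must return an SQSBatchResponse object (that is the type name if using TypeScript) containing an array of the message IDs that should not be deleted. Messages that are not listed are treated as successful and deleted. The return type is below, with itemIdentifier being the messageId: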
export interface SQSBatchResponse {
  batchItemFailures: SQSBatchItemFailure[];
}
export interface SQSBatchItemFailure {
  itemIdentifier: string;
}
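
To make that concrete, here is a hedged sketch of a handler that reports only the failed messages. The record and event types are trimmed-down local copies, and processRecord is a hypothetical placeholder for your own logic. It assumes ReportBatchItemFailures is enabled on the trigger.

```typescript
// Local copies of the response types shown above, plus a minimal event shape.
interface SQSBatchItemFailure {
  itemIdentifier: string;
}
interface SQSBatchResponse {
  batchItemFailures: SQSBatchItemFailure[];
}
interface SqsRecord {
  messageId: string;
  body: string;
}
interface SqsEvent {
  Records: SqsRecord[];
}

// Hypothetical processing step; here an empty body counts as a failure.
async function processRecord(record: SqsRecord): Promise<void> {
  if (record.body.length === 0) {
    throw new Error(`message ${record.messageId} has an empty body`);
  }
}

// Catch errors per record so one bad message does not fail the whole batch.
async function batchHandler(event: SqsEvent): Promise<SQSBatchResponse> {
  const batchItemFailures: SQSBatchItemFailure[] = [];
  for (const record of event.Records) {
    try {
      await processRecord(record);
    } catch {
      // Report this message ID so it stays on the queue and is retried.
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  // An empty list means every message succeeded and can be deleted.
  return { batchItemFailures };
}
```

Note that the handler no longer lets errors escape. An uncaught error would still fail the entire batch, which defeats the purpose of the partial response.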

A few more nuances to be aware of (from the AWS Documentation):

Success and failure conditions
Lambda treats a batch as a complete success if your function returns any of the following:

  • An empty batchItemFailures list
  • A null batchItemFailures list
  • An empty EventResponse
  • A null EventResponse

Lambda treats a batch as a complete failure if your function returns any of the following:

  • An invalid JSON response
  • An empty string itemIdentifier
  • A null itemIdentifier
  • An itemIdentifier with a bad key name
  • An itemIdentifier value with a message ID that doesn't exist
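
The rules above can be sketched as a small function that works out which messages would be retried for a given response. This is only an illustration of the two lists, not Lambda's actual implementation. The function name messagesToRetry is made up, and the two branches marked as assumptions cover cases the lists do not mention.

```typescript
// Returns the message IDs that would be retried: an empty array for a
// complete success, every ID in the batch for a complete failure.
function messagesToRetry(rawResponse: string | null, batchMessageIds: string[]): string[] {
  const retryAll = batchMessageIds.slice();

  if (rawResponse === null) return []; // null EventResponse

  let parsed: unknown;
  try {
    parsed = JSON.parse(rawResponse);
  } catch {
    return retryAll; // invalid JSON response
  }
  if (parsed === null) return []; // null EventResponse
  if (typeof parsed !== "object") return retryAll; // assumption: not covered by the lists

  const failures = (parsed as { batchItemFailures?: unknown }).batchItemFailures;
  // Covers an empty EventResponse ({}) and a null batchItemFailures list.
  if (failures === null || failures === undefined) return [];
  if (!Array.isArray(failures)) return retryAll; // assumption: not covered by the lists
  if (failures.length === 0) return []; // empty batchItemFailures list

  const retry: string[] = [];
  for (const item of failures) {
    const id = (item as { itemIdentifier?: unknown } | null)?.itemIdentifier;
    // A missing or misnamed key, an empty string, null, or an unknown
    // message ID each turn the whole batch into a failure.
    if (typeof id !== "string" || id === "" || batchMessageIds.indexOf(id) === -1) {
      return retryAll;
    }
    retry.push(id);
  }
  return retry;
}
```

The practical takeaway is that a malformed response is costly: one misspelled key or stale message ID causes the entire batch to be retried, including messages that were processed successfully.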

For more information, see the AWS Documentation on Reporting partial batch responses, which is a subsection of the page: Using Lambda with Amazon SQS.

I originally posted this article on Medium.
