Description
Key information
- RFC PR: n/a
- Related issue(s), if known: n/a
- Area: Batch
- Meet tenets: Yes
Summary
A new generic batch processing utility, which can process records from SQS, Kinesis Data Streams, and DynamoDB streams, and handle reporting batch failures.
Motivation
With the launch of support for partial batch responses for Lambda/SQS, the event source mapping can now natively handle partial failures in a batch - removing the need for calls to the delete API. This support already exists for Kinesis and DynamoDB streams.
The Java SDK for Lambda contains 1/ the incoming message types (both batch and nested messages within batch) and 2/ the partial batch response types. The documentation for each event source (e.g. SQS) contains examples for implementing partial batch responses. Powertools aims to improve on this by:
- Providing a consistent experience across event sources - the user should 1/ become aware that other message sources can be handled in the same fashion, and 2/ not have to rework their logic to use a different message source
- Benefitting automatically from best practices. For instance, the example SQS code linked above does not deal with FIFO queues, and thus it is not immediately apparent that extra logic is needed. Powertools will transparently handle this case
- Integrating without surprises with adjacent features, such as large message handling
Proposal
1. Layout
The new utility will be implemented in a new package, powertools-batch. The existing SQS batch processor, powertools-sqs, will be maintained for bug fixes and removed in Powertools for Java v2.
2. Existing SQS Batch Interface Simplifications
Powertools for Java has an existing batch processing mechanism for SQS only, which was written before partial responses and uses explicit message deletion instead (documentation, code).
It includes extra tunables that are not present in the Python implementation and make less sense with partial batch responses. We will not support these:
- Non-retryable exceptions - there is no mechanism to indicate in a partial batch response that a particular message should not be retried and should instead be moved to a DLQ - a message either succeeds, or fails and is retried.
- Suppress exception - stops the processor from throwing an exception on failure. Because the old processor explicitly handles batch failures by deletion, it can throw once it is done. With the new processor, the failure of a message is reported explicitly to Lambda in the response, allowing Lambda to handle deletion and retries itself, so this option is unnecessary.
In addition to a utility class, the existing implementation provides an annotation to handle batch responses. It works like this:
```java
@Override
@SqsBatch(value = SampleMessageHandler.class, suppressException = true)
public String handleRequest(SQSEvent input, Context context) {
    return "{\"statusCode\": 200}";
}
```
This doesn't make sense in the world of partial batch responses. The message returned to Lambda will be the partial batch response itself, and will therefore be generated by the new batch utility. That means this style of embedding a success message in the handler's return makes no sense, as the user's own code does not control the return value.
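For reference, this is roughly what a hand-rolled partial batch response looks like today using only the types shipped in aws-lambda-java-events (a minimal sketch following the pattern in the AWS documentation; the new utility would build this response on the user's behalf):

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.SQSBatchResponse;
import com.amazonaws.services.lambda.runtime.events.SQSEvent;

import java.util.ArrayList;
import java.util.List;

public class HandRolledPartialBatchHandler implements RequestHandler<SQSEvent, SQSBatchResponse> {

    @Override
    public SQSBatchResponse handleRequest(SQSEvent event, Context context) {
        List<SQSBatchResponse.BatchItemFailure> failures = new ArrayList<>();

        for (SQSEvent.SQSMessage message : event.getRecords()) {
            try {
                process(message);
            } catch (Exception e) {
                // Report only the failed message; the rest of the batch is treated as successful
                failures.add(new SQSBatchResponse.BatchItemFailure(message.getMessageId()));
            }
        }

        return new SQSBatchResponse(failures);
    }

    private void process(SQSEvent.SQSMessage message) {
        // business logic for a single message
    }
}
```

Note that the event source mapping must also be configured with the ReportBatchItemFailures function response type for Lambda to act on this response.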
3. Features to retain
- Idempotency utility integration
- Large message support. This has historically been implemented by the @SqsLargeMessage annotation on the Lambda request handler itself. However, in Feature request: Better failures handling while using both BatchProcessing and LargeMessageHandling #596 we can see that the current implementation is not optimal when batches and large messages are combined, leading to an entire batch failing when a single message has issues. We will resolve this here by incorporating large message processing into the inner loop of the batch processing, rather than as an aspect that happens before the batch. We will see if this can be done automatically or if it will require the user to hint that they need large message processing enabled when providing their batch handler.
- FIFO queue support. The responses for FIFO queues are different - you must stop processing the batch as soon as any failure appears (see the sketch after this list). We have an open PR for this against our existing SQS implementation, and should implement the same here: fix: Handle batch failures in FIFO queues correctly #1183
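To illustrate the FIFO case, the sketch below shows the short-circuit behaviour the utility would need to apply internally: as soon as one message fails, processing stops and the failed message plus every remaining message are reported back, so that SQS preserves ordering on retry (a minimal sketch using aws-lambda-java-events types; class and method names are illustrative only):

```java
import com.amazonaws.services.lambda.runtime.events.SQSBatchResponse;
import com.amazonaws.services.lambda.runtime.events.SQSEvent;

import java.util.ArrayList;
import java.util.List;

public class FifoShortCircuitSketch {

    public SQSBatchResponse processFifoBatch(SQSEvent event) {
        List<SQSBatchResponse.BatchItemFailure> failures = new ArrayList<>();
        List<SQSEvent.SQSMessage> records = event.getRecords();

        for (int i = 0; i < records.size(); i++) {
            try {
                handleMessage(records.get(i));
            } catch (Exception e) {
                // FIFO queues: stop at the first failure and mark the failed message
                // and every message after it as failed, so ordering is preserved
                for (int j = i; j < records.size(); j++) {
                    failures.add(new SQSBatchResponse.BatchItemFailure(records.get(j).getMessageId()));
                }
                break;
            }
        }

        return new SQSBatchResponse(failures);
    }

    private void handleMessage(SQSEvent.SQSMessage message) {
        // user-provided business logic
    }
}
```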
4. User-Facing API
To decide on the approach, let's look at two alternative implementations of the user-facing API. The complete implementation has not been built - just enough of the API of the new library to complete the RFC phase.
We can provide a simple builder interface. This decouples us completely from the request handler model, which may provide extra flexibility in the future, and gives us a mechanism to extend behaviour later without breaking interfaces, by adding extra parameters to the builder. By way of example, success and failure hooks are added to SQS, a feature also provided in the Python implementation. A partial mockup of this implementation can be found here.
```java
public class SqsExampleWithIdempotency implements RequestHandler<SQSEvent, SqsBatchResponse> {

    private SqsBatchMessageHandler handler;

    public SqsExampleWithIdempotency() {
        // Example 1 - process a raw SQS message in an idempotent fashion
        handler = new BatchMessageHandlerBuilder()
                // First we can set parameters common to all message sources
                .withFailureHandler((msg, e) -> System.out.println("Whoops: " + msg.getMessageId()))
                .withSqsBatchHandler()
                // now we can set any parameters specific to SQS
                .withRawMessageHandler(this::processRawMessage)
                //.withDeserializedMessageHandler<Basket>(this::processDeserializedMessage)
                .build();
    }

    @Override
    public SqsBatchResponse handleRequest(SQSEvent sqsEvent, Context context) {
        return handler.process(sqsEvent);
    }

    @SqsLargeMessage
    private void processRawMessage(SQSEvent.SQSMessage sqsMessage, Context context) {
        // Do some stuff with the raw message
    }

    @Idempotent
    private void processDeserializedMessage(@IdempotencyKey Basket basket, Context context) {
        // Do some stuff
    }
}
```
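For comparison, and to illustrate the "consistent experience across event sources" goal, a Kinesis variant of the same builder might look like the following. This is purely illustrative: KinesisBatchMessageHandler, withKinesisBatchHandler and the other names are assumptions, not part of the mockup linked above.

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
import com.amazonaws.services.lambda.runtime.events.StreamsEventResponse;

public class KinesisExampleSketch implements RequestHandler<KinesisEvent, StreamsEventResponse> {

    // Hypothetical type, mirroring SqsBatchMessageHandler in the mockup above
    private KinesisBatchMessageHandler handler;

    public KinesisExampleSketch() {
        handler = new BatchMessageHandlerBuilder()
                // parameters common to all message sources are unchanged
                .withFailureHandler((record, e) -> System.out.println("Whoops: " + e.getMessage()))
                // only the source-specific step differs (hypothetical method name)
                .withKinesisBatchHandler()
                .withRawMessageHandler(this::processRecord)
                .build();
    }

    @Override
    public StreamsEventResponse handleRequest(KinesisEvent kinesisEvent, Context context) {
        return handler.process(kinesisEvent);
    }

    private void processRecord(KinesisEvent.KinesisEventRecord record, Context context) {
        // business logic for a single Kinesis record
    }
}
```

The important point is that only the source-specific step of the builder changes; failure handling, idempotency integration, and similar concerns stay identical across SQS, Kinesis, and DynamoDB streams.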
4.3 RequestHandler base class
A third option was considered: providing a base BatchRequestHandler<...> for the user to extend. This option was discarded because it limits the flexibility of the user's code. The code for this variant can nonetheless be found here.
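For reference, a minimal sketch of what such a base-class variant might look like; the BatchRequestHandler name, its type parameters, and the processMessage template method are assumptions for illustration, not the actual code linked above:

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.events.SQSEvent;

// Hypothetical base class usage: the user implements only the per-message
// callback; handleRequest() and the partial batch response are owned by the
// Powertools-provided parent class.
public class SqsBaseClassExample extends BatchRequestHandler<SQSEvent, SqsBatchResponse, SQSEvent.SQSMessage> {

    @Override
    protected void processMessage(SQSEvent.SQSMessage message, Context context) {
        // business logic for a single message; throwing marks this item as failed
    }
}
```

Because the entry point is owned by the base class, users who already extend another class or need to customise handleRequest() are forced to restructure their code, which is the flexibility concern that led to this option being dropped.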
5. Data Model
The new module will consume batch events from Lambda using the types in aws-lambda-java-events. From there, individual records must be pulled out and passed on to the user-provided handler.
The events library already has nested types for the messages within a batch; we simply pass these through to the user's handler. These types do not share a common base class, so each handler is coupled to the concrete type of the source that is producing messages.
This approach decreases the complexity of the Powertools implementation - no additional mapping needs to be done - and new fields appearing in the upstream types would automatically be passed through with a simple dependency version update.
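Concretely, the nested record types from aws-lambda-java-events that would be passed straight through to user handlers are shown below (a sketch; the interface and method names are illustrative only):

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.events.DynamodbEvent;
import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
import com.amazonaws.services.lambda.runtime.events.SQSEvent;

// Illustrative handler signatures only: each one is coupled to the concrete
// nested record type of its event source, since the types share no common base class.
public interface RecordHandlers {

    // SQS batches contain SQSEvent.SQSMessage records
    void handleSqsMessage(SQSEvent.SQSMessage message, Context context);

    // Kinesis Data Streams batches contain KinesisEvent.KinesisEventRecord records
    void handleKinesisRecord(KinesisEvent.KinesisEventRecord record, Context context);

    // DynamoDB Streams batches contain DynamodbEvent.DynamodbStreamRecord records
    void handleDynamoDbRecord(DynamodbEvent.DynamodbStreamRecord record, Context context);
}
```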
Questions
- Should we wait for the messages v4 API, and depend on this, while the old version of the SQS processor continues to use the old messaging library?
- Do we intend for this module to supersede powertools-sqs completely? If so, we'd need to duplicate some other functionality - e.g. the large message aspect - so that users do not need to pull both libraries in.
Drawbacks
This change will introduce redundancy between the existing SQS batch processing utility and this new utility. The old utility will be removed as part of the v2 changes.
This utility adds no additional dependencies. The message types involved are all bundled together in aws-lambda-java-events.
Rationale and alternatives
- Provide an abstract base RequestHandler that can be extended by the user (example code). This has been discarded, as previous feedback has indicated that in some cases it is not practical to extend a Powertools class in the RequestHandler - using default interfaces allows us to mix in an implementation without doing this
- Use annotations and aspects to inject behaviour - this has been discarded because 1/ it adds unnecessary complexity in this case and 2/ it is not ergonomic when the return type of the function must be governed by the message processor
- Use interface defaults to mix behaviour into the request handler - this has been discarded because 1/ it still couples us to the RequestHandler to some extent and 2/ it isn't a common pattern in Java