feat(event-handler): add appsync batch invoke
Issue Number: https://github.com/awslabs/aws-lambda-powertools-python/issues/1303
## Summary
Support the AppSync batching mechanism for Lambda resolvers.
## Changes
- Modify the AppSync `BaseRouter` to accept a list of `AppSyncResolverEventT` events alongside a singular event
- Store the list of `AppSyncResolverEventT` events in `BaseRouter.current_event`
- Add a functional test covering batch processing
- Add the `aws-cdk-aws-appsync-alpha` CDK package
- Add end-to-end tests that:
  - deploy an AppSync API with a Lambda resolver and a specific schema
  - validate different resolvers, including the batching mechanism
## User experience
If batch processing is used for a specific method, the user can access the AppSync resolver events directly in their decorated method and do the appropriate processing. The AppSync resolver calls the method for every event in the incoming list of events:
```python
from typing import Optional

from pydantic import BaseModel

from aws_lambda_powertools.event_handler import AppSyncResolver
from aws_lambda_powertools.utilities.data_classes import AppSyncResolverEvent
from aws_lambda_powertools.utilities.typing import LambdaContext

app = AppSyncResolver()

posts = {
    "1": {
        "post_id": "1",
        "title": "First book",
        "author": "Author1",
        "url": "https://amazon.com/",
        "content": "SAMPLE TEXT AUTHOR 1",
        "ups": "100",
        "downs": "10",
    },
    "2": {
        "post_id": "2",
        "title": "Second book",
        "author": "Author2",
        "url": "https://amazon.com",
        "content": "SAMPLE TEXT AUTHOR 2",
        "ups": "100",
        "downs": "10",
    },
    "3": {
        "post_id": "3",
        "title": "Third book",
        "author": "Author3",
        "url": None,
        "content": None,
        "ups": None,
        "downs": None,
    },
}

posts_related = {
    "1": [posts["2"]],
    "2": [posts["3"], posts["1"]],
    "3": [posts["2"], posts["1"]],
}


class Post(BaseModel):
    post_id: str
    author: str
    title: str
    # Optional: post "3" above has no url/content/ups/downs
    url: Optional[str]
    content: Optional[str]
    ups: Optional[str]
    downs: Optional[str]


@app.batch_resolver(type_name="Post", field_name="relatedPosts")
def related_posts(event: AppSyncResolverEvent) -> Optional[list]:
    return posts_related[event.source["post_id"]] if event.source else None


def lambda_handler(event, context: LambdaContext) -> dict:
    return app.resolve(event, context)
```
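For illustration, a direct local invocation with a hypothetical, trimmed-down batch payload could look like this (real AppSync events carry more fields, such as `arguments`, `identity`, and `request`; the payload shape here is illustrative only):

```python
# Hypothetical, trimmed-down batch payload: with batching enabled, AppSync
# invokes the function once with a list of resolver events (one per source).
batch_event = [
    {"typeName": "Post", "fieldName": "relatedPosts", "source": {"post_id": "1"}},
    {"typeName": "Post", "fieldName": "relatedPosts", "source": {"post_id": "2"}},
]

# Expected: one result per incoming event, in order, e.g.
# [[posts["2"]], [posts["3"], posts["1"]]]
result = lambda_handler(batch_event, LambdaContext())
```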
Under the hood, the Router:
- Recognizes it received a list of events, so it assumes the request comes from an AppSync batch operation.
- Registers the methods meant to handle batch events via a brand-new resolver decorator called `batch_resolver`.
- Finds the method based on `type_name` and `field_name` and calls it for every event in the list, passing the event itself as an argument. Optionally, it also passes user-specified parameters if present (taken from inside the event itself), to keep a UX similar to the base resolver.
- Once the decorated method has processed all events, the router combines the outputs of the multiple calls into a list and sends it back to AppSync.
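A minimal sketch of that dispatch flow (illustrative only; the real router internals differ, and `registry` is a hypothetical stand-in for the resolvers registered via `batch_resolver`):

```python
from typing import Any, Callable, Dict, List, Tuple

from aws_lambda_powertools.utilities.data_classes import AppSyncResolverEvent


def resolve_batch(
    raw_events: List[dict],
    registry: Dict[Tuple[str, str], Callable[..., Any]],
) -> List[Any]:
    results: List[Any] = []
    for raw_event in raw_events:
        event = AppSyncResolverEvent(raw_event)
        # Look up the decorated method by (type_name, field_name)...
        resolver = registry[(event.type_name, event.field_name)]
        # ...and call it once per event in the batch.
        results.append(resolver(event=event))
    # AppSync expects one entry per incoming event, in the same order.
    return results
```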
## Checklist
If your change doesn't seem to apply, please leave them unchecked.
- [x] Meet tenets criteria
- [x] I have performed a self-review of this change
- [x] Changes have been tested
- [x] Changes are documented
- [x] PR title follows conventional commit semantics
## Is this a breaking change?
RFC issue number:
Checklist:
- [ ] Migration process documented
- [ ] Implement warnings (if it can live side by side)
## Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.
Codecov Report
Attention: Patch coverage is 99.25373% with 1 line in your changes missing coverage. Please review.
Project coverage is 96.45%. Comparing base (e14e768) to head (a612e38). Report is 657 commits behind head on develop.
| Files | Patch % | Lines |
|---|---|---|
| aws_lambda_powertools/event_handler/appsync.py | 98.78% | 1 Missing :warning: |
Additional details and impacted files
```
@@            Coverage Diff             @@
##           develop    #1998      +/-   ##
===========================================
+ Coverage    96.38%   96.45%    +0.07%
===========================================
  Files          214      223        +9
  Lines        10030    10723      +693
  Branches      1846     1995      +149
===========================================
+ Hits          9667    10343      +676
- Misses         259      268        +9
- Partials       104      112        +8
```
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thank you so much for the change, that's much better! Made one comment about favouring composition over multiple inheritance (it'll bite us!).
One super helpful change would be to update the PR description with a sample UX, so that customers can easily spot what's coming and we can reuse it in our docs and release notes :-)
I'll do a proper review next week
@heitorlessa Did a further refactor to address the things we discussed (UX + design). Let me know if that's OK.
Hello @mploski! I'm still working on validating this PR and creating tests with as many variations as possible to make sure it's OK. So far I haven't found any problems. We're shipping a release today to fix an emergency bug in CodePipeline's EventSource, but I hope to include this PR in the next release.
Thanks as always for putting in the effort to make this project even better. :rocket:
Hi @mploski!!
I tested the code in different ways and with different resolvers and fields and everything worked exactly as expected. I spent a few days doing exploratory tests and found nothing that failed. I also ran the end2end tests and checked the features and everything was as expected.
My questions/next steps at this point are:
1 - Whether we will have a problem in the end-to-end tests when running the different versions of Python in parallel. I haven't used the aws-cdk.aws-appsync-alpha construct yet, and I don't know if there are fields that need to be unique across stacks. I know we may have two AppSync APIs with the same name, but we need to check.
2 - For me the code makes sense and we don't have a break in the current functionality, but I'd like a double-check here @heitorlessa! Just to make sure I'm not missing anything.
3 - We need to adjust the documentation to explain that we now support batch calling and how the user can use it.
But again, great job @mploski. Every PR I interact with you, I learn a lot of new things and improve my skills! Thank you for your patience and we will merge this PR soon :)
Based on my understanding of the user experience description in this PR, I am not fond of using AppSyncResolver to iteratively call my Python code for every event passed in the batch request. In the proposed setup, I no longer see the benefits of using AppSync batch resolvers. Please correct me if I am mistaken.
To illustrate, the Lambda function related_posts() in the provided example uses the "post_id" field from the posts_related dictionary for each event passed in AppSyncResolverEvent. In real-world scenarios, the in-memory lookup of the posts_related dictionary is replaced by a database call. We use GraphQL to batch these database calls, particularly when the "post_id" field is expected to repeat across multiple resolver calls. While AppSync's caching ability is helpful, the TTL is not very effective (and expensive), and the initial call to populate the AppSync cache can be very slow.
As a user of AppSync batch resolvers, I would like more control over how my Lambda resolvers group responses to batch events. This allows us to solve the GraphQL N+1 problem.
The example provided shows how we use Lambda functions to respond to AppSync Batch requests, with the posts_batch_resolver() function being a key part of the setup.
```python
@router.resolver(type_name="Post", field_name="relatedPosts")
@cdk_gql.appsync_batch(2000)
@tracer.capture_method
def posts_batch_resolver() -> List[List[Optional[Post]]]:
    # Extract IDs from each request in the batch call. This will be used for the DB batch call
    post_ids = batch_util.get_required_ids(event_list=router.current_event.event_list, fieldname="post_id")
    # Remove duplicate IDs across all events. This reduces redundant lookups for the DB call
    unique_ids = list(set(post_ids))
    # Make a single DB call to retrieve data for all IDs
    unique_dict = dao.get_relatedposts_from_id_list(id_list=unique_ids)
    # Build response: one entry per incoming event, in order
    to_return = []
    for _id in post_ids:
        to_return.append({"data": unique_dict[_id] if _id in unique_dict else None})
    return to_return
```
This is the base Lambda function that sets up the Router and resolves the AppSync request:
```python
@logger.inject_lambda_context(correlation_id_path=correlation_paths.APPSYNC_RESOLVER)
@tracer.capture_lambda_handler
def lambda_handler(event: dict, context: LambdaContext) -> dict:
    """This is the base Lambda handler for all GraphQL resolvers."""
    if isinstance(event, list):
        return app.resolve(event, context, data_model=batch_util.BatchRequest)
    return app.resolve(event, context)
```
This is the `batch_util` file:
```python
from typing import List

from aws_lambda_powertools.utilities.data_classes.appsync_resolver_event import (
    AppSyncResolverEvent,
)


class BatchRequest(AppSyncResolverEvent):
    def __init__(self, event_list):
        super().__init__(event_list[0])
        self._event_list: List[AppSyncResolverEvent] = []
        for event in event_list:
            self._event_list.append(AppSyncResolverEvent(event))

    @property
    def event_list(self) -> List[AppSyncResolverEvent]:
        return self._event_list


def get_required_ids(event_list: List[AppSyncResolverEvent], fieldname: str) -> List[str]:
    ids = []
    for event in event_list:
        ids.append(event.source[fieldname])
    return ids
```
Please let me know if this makes sense.
Btw @cdk_gql.appsync_batch(2000) is a custom decorator we use to auto-generate aws-cdk-lib/aws-appsync CDK code for creating resolvers. I would love to discuss this use case in a separate ticket.
Thanks for the feedback to improve the user experience on this PR @thenamanpatwari. I believe that a better way to resolve this would be to schedule a meeting to discuss this point and understand how to implement this feature.
Do you have an agenda for a meeting this coming week? @mploski @heitorlessa @thenamanpatwari.
Thank you
I'd love to connect and figure this out. I am available either on May 11 or May 18 around noon CST
Happy to connect and discuss it in detail. I'm on PTO 11-18 May; I'm ready to jump on a call before or after.
Let us talk after your PTO, on May 22. Does noon CST work? Can you please set up a Chime room? I am not really sure how I should share my email address with you.
Naman - could you send an email to aws-lambda-powertools-feedback at amazon dot com?
We can then reply with a link with multiple slots you can choose to make this easier.
Apologies for the back and forth on this
Email sent. No issues on the back-and-forth, I am just glad this is moving forward. I want to maintain as little code as possible haha.
Hi @thenamanpatwari! Thank you so much for meeting and for providing useful information and data points to help us deliver a better experience for customers. For now, it's our homework to review all the details you've given us and think about how to take the next steps. Probably next week we will have news about this and we will update here so we can decide the next steps in this PR.
Thanks for supporting Powertools and taking the time to tell us about your experience.
Very nice stuff! We're going to begin using this library to speed up our Lambda work.
I implemented my own appsync-router package a few years ago. Some takeaways from the batch processing portion (we use batch invocation quite heavily):
- For most batch invocations, as the datasource developer I don't (and shouldn't) care that it is a batch invocation. The resolver/router library should handle this for me.
- I should be able to control whether the methods fulfilling the invocations run concurrently or sequentially (my implementation only does this at the router level; an improvement would be doing it at the field level). Sequential processing can be pretty bad for large batches that require individual DB or HTTP resolution.
- For some batch invocations, the ability to deliver the whole batch at once (as detailed above) is really nice. That was not part of what I built, but I can definitely see a use, especially if you can make one data-access call to resolve the whole batch.
I'm thinking about using your resolver in going-forward projects. Also thinking about contributing, if you'll have me. We do a LOT of Lambda development in Python (actually, most of our work).
Just had a call with @mploski and @leandrodamascena ... @leandrodamascena will comment with an actionable plan here soon.
After reading this conversation and learning what Michal and Leandro discussed with @thenamanpatwari, there's another way to improve this with partial failure handling plus all-at-once processing (asyncio), while giving power users more flexibility to prevent an N+1.
Hello everyone. The next steps in this PR are as follows:
1 - We will enhance the user experience by handling partial failures in batch processing. This improvement will simplify the handling of failed records and reduce the need for manually implemented error handling.
2 - We will introduce the ability to process records asynchronously, similar to the BatchProcessor utility. For batch resolvers defined as async functions, we'll call them with all items at the same time (asyncio.gather), since the intent that they can safely be processed concurrently is clear. A rough sketch follows below.
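As a rough, hypothetical sketch of item 2 (the resolver and helper names here are illustrative, not the final API):

```python
import asyncio
from typing import Any, Dict, List


async def related_posts(event: Dict[str, Any]) -> list:
    # Stand-in for an async DB/HTTP lookup performed per event.
    await asyncio.sleep(0.01)
    return [{"post_id": event["source"]["post_id"]}]


async def resolve_concurrently(events: List[Dict[str, Any]]) -> list:
    # Since the resolver is async, all events can be awaited at once;
    # asyncio.gather preserves the input order in its results.
    return await asyncio.gather(*(related_posts(e) for e in events))


if __name__ == "__main__":
    events = [{"source": {"post_id": str(i)}} for i in range(3)]
    print(asyncio.run(resolve_concurrently(events)))
```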
@mploski we need to work to add these features and create the documentation.
@thenamanpatwari, your use case is highly valuable to us, and we don't want to miss the opportunity to implement this advanced feature for batch processing with AppSync. We kindly request that you open a Feature Request issue so that we can prioritize and support this feature. If possible, please consider submitting a PR with the code for us to merge into the repository. It is very important for us to have contributions/PRs from our users.
Thank you all for your patience. Indeed, this implementation has involved complex discussions and considerations.
Hey @leandrodamascena
Thank you for the update. I have been playing around with the SQS batch processing feature of Lambda Powertools and I understand the reasoning behind your use case. I think there are situations where this feature would be useful for AppSync. However, this feature does not solve the GraphQL N+1 problem.
I accept your reasoning for implementing the advanced batch processing feature as an add-on after your initial batch processing implementation. I will wait for this PR to be merged and then either raise a new ticket or consider submitting a PR myself, with the help of @joseph-wortmann.
Sounds like a plan!
@mploski @leandrodamascena whats the latest here?
What can I do to get this over to the finish line?
We have identified some really cool use-cases for this feature. Eagerly looking forward to its successful implementation.
Hey everyone! I had a meeting with @mploski to complete the changes to this PR, and we agreed to move partial error handling to a separate PR. The reason is that returning errors to AppSync requires the customer to change the mapping template to handle them.
We'll split the PRs to make code review easier and allow us to create consistent examples. This PR has been open for a long time, and adding partial error handling now would take much longer to merge.
Missing in this PR:
- [ ] Documentation
Kudos, SonarCloud Quality Gate passed! 
0 Bugs
0 Vulnerabilities
0 Security Hotspots
5 Code Smells
No Coverage information
0.4% Duplication
Kudos, SonarCloud Quality Gate passed! 
0 Bugs
0 Vulnerabilities
0 Security Hotspots
6 Code Smells
No Coverage information
0.3% Duplication
Kudos, SonarCloud Quality Gate passed! 
0 Bugs
0 Vulnerabilities
0 Security Hotspots
5 Code Smells
No Coverage information
2.8% Duplication
Hey team, any updates? Building a feature currently where this capability would be insanely helpful
Hi @thenamanpatwari, thank you for the ping! We are currently working on two major features that will be released at the beginning of January, and immediately after that we'll look at finalising this one. So our current target for releasing this feature is the end of January.
Hey everyone! I appreciate your patience in waiting for this new feature. I'm working on refactoring some parts of this code and we hope to release it in the next version scheduled for February 2nd.
In case of any update, I'll let you know.
Hey @rubenfonseca! This PR is ready for review!
Thanks
Hey, everyone! We will have an internal meeting with the AppSync Service team to discuss error handling when consuming an AppSync batch process. We want to hear from them about use cases and what challenges customers are facing.
After this meeting, I will update this PR. The expectation of having this feature in the next release remains, but I will update this PR if we have any additional news from this meeting.
The meeting with the AppSync team has been rescheduled for Friday the 2nd, so we will not have time to include this feature in this release. We plan to include it in the following release, in 1 or 2 weeks.
Hey @heitorlessa! Thanks for your suggestion about implementing fine-grained exception control per route. It makes a lot more sense and adds flexibility when handling exceptions in specific routes.
I made the changes; please review when you can.
Thank you, Le! Self-assigning, and looking into it this afternoon to get the final doc touches in.