feat: Apache Spark on Amazon Athena - wr.athena.create_spark_session & wr.athena.run_spark_calculation #2314
)
_logger.info("Calculation execution info:\n%s", response)

return _get_calculation_execution_results(
Can the return value of an Athena Spark execution be a DataFrame? Or will the output always just be written to an S3 location?
I'm mainly wondering whether there is a sensible way to make these API calls accept or return DataFrames.
I am looking into ways to support that where it is applicable, but that depends very much on the Spark code you are running.
There is a JSON metadata file in the results path along with the stdout/stderr text files, but it is almost always empty, at least in my tests so far. The documentation is a bit lacking on this. I'll play around to see what we can make of it.
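For context on the results path mentioned above, here is a minimal sketch of pulling the stdout, stderr, and result locations out of a calculation execution response. It assumes the response has the shape returned by boto3's Athena `get_calculation_execution` call, with a `Result` entry holding `StdOutS3Uri`, `StdErrorS3Uri`, and `ResultS3Uri`; the helper names `split_s3_uri` and `result_locations` are hypothetical and are not part of this PR.

```python
from typing import Any, Dict, Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def result_locations(response: Dict[str, Any]) -> Dict[str, Tuple[str, str]]:
    """Collect the S3 locations present in a calculation execution response.

    Assumption: the field names below match the 'Result' structure of the
    Athena GetCalculationExecution response. Missing or empty entries are
    skipped rather than treated as errors.
    """
    result = response.get("Result", {})
    fields = {
        "stdout": "StdOutS3Uri",
        "stderr": "StdErrorS3Uri",
        "result": "ResultS3Uri",
    }
    return {
        name: split_s3_uri(result[field])
        for name, field in fields.items()
        if result.get(field)
    }
```

The returned bucket and key pairs could then be read with any S3 client. Whether the result file contains anything useful depends on the Spark code that was run, as noted in the reply above.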
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
Looks great, a couple of minor comments
Feature or Bugfix
Scope
Detail
create_spark_session and run_spark_calculation with corresponding waiters
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
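To illustrate what a waiter for a session or calculation might look like, here is a generic polling sketch. It is not the implementation from this PR: the function name, the `get_state` callback, and the default delay and timeout are all hypothetical, and the terminal states are assumed to be the Athena calculation states `COMPLETED`, `FAILED`, and `CANCELED`.

```python
import time
from typing import Callable

# Assumed terminal states for an Athena calculation execution.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELED"}


def wait_for_terminal_state(
    get_state: Callable[[], str],
    delay: float = 1.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it returns a terminal state or the timeout expires.

    get_state is any zero-argument callable returning the current state
    string, for example a wrapper around an Athena status call.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still in state {state!r} after {timeout} seconds")
        sleep(delay)
```

Passing the state lookup in as a callable keeps the loop independent of any particular client, so the same sketch could serve both the session and the calculation case.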