Performance improvement for partitions with a large number of steps [BATCH-2716] #891
Comments
Krishna Bhamidipati commented This is my first time creating a bug fix patch for Spring Batch, so any advice and guidance would be most appreciated! |
Mahmoud Ben Hassine commented Krishna Bhamidipati That's awesome! Thank you for this analysis and benchmarks! This issue has been planned for v4.2 for which the theme is "performance ++". I checked the content of the attached zip file and there are some good ideas in there. Can you please open a PR on GitHub with those changes? Many thanks upfront. |
Krishna Bhamidipati commented Thanks for reviewing the code Mahmoud Ben Hassine! I don't see a branch for 4.2.x on GitHub yet (only up to 4.1.x). Should I wait to submit the PR until the branch is created? |
Mahmoud Ben Hassine commented 4.2.x is on the master branch. Thank you upfront! |
Mahmoud Ben Hassine commented Krishna Bhamidipati We are planning to tackle this issue in the upcoming 4.2.0.RC1. Have you had a chance to open a PR with the changes in the zip file against the master branch? Looking forward to your feedback. |
Krishna Bhamidipati commented Mahmoud Ben Hassine oh that's great! Oops, forgot to complete my PR, but let me do that now. |
Krishna Bhamidipati commented Mahmoud Ben Hassine just opened https://github.com/spring-projects/spring-batch/pull/716 :) |
Mahmoud Ben Hassine commented Resolved with 62a8f44 |
Krishna Bhamidipati opened BATCH-2716 and commented
Each time partitions are created, SimpleStepExecutionSplitter#split queries the Spring Batch tables to double-check whether the partitioned step is a restart. Originally, each of these checks scanned the entire table.
The patched approach uses a new method on JobRepository that allows SimpleStepExecutionSplitter to retrieve all the step executions at the time of partitioning. Assuming no other threads are creating conflicting partitions for the same job instance, this makes the splitting more efficient.
The performance improvement is easily an order of magnitude (26x in the initial benchmarks, attached), and the patch works well for jobs with more than 1000 steps.
Patched source is attached and also available on https://github.com/NaanProphet/spring-batch-large-step-perf-fix
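To illustrate the idea only, here is a minimal sketch that contrasts one lookup per partition with a single bulk lookup. It does not use the Spring Batch API: `PartitionLookupSketch`, `FakeStepExecutionStore`, `findStatus`, and `findAllStatuses` are hypothetical names invented for this example, and the in-memory map merely stands in for the step execution table. The actual patch may be structured quite differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionLookupSketch {

    /** Hypothetical stand-in for the step execution table; counts every query issued. */
    static class FakeStepExecutionStore {
        private final Map<String, String> statusByStepName = new HashMap<>();
        int queryCount = 0;

        void save(String stepName, String status) {
            statusByStepName.put(stepName, status);
        }

        /** One query per step name: the per-partition lookup pattern. */
        String findStatus(String stepName) {
            queryCount++;
            return statusByStepName.get(stepName);
        }

        /** One query that returns every known step execution status at once. */
        Map<String, String> findAllStatuses() {
            queryCount++;
            return new HashMap<>(statusByStepName);
        }
    }

    /** Builds partition step names such as "step:partition0" (naming is an assumption). */
    static List<String> partitionNames(String stepName, int gridSize) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < gridSize; i++) {
            names.add(stepName + ":partition" + i);
        }
        return names;
    }

    /** Returns the partitions that still need to run, querying the store once per partition. */
    static List<String> splitOneByOne(FakeStepExecutionStore store, List<String> names) {
        List<String> toRun = new ArrayList<>();
        for (String name : names) {
            if (!"COMPLETED".equals(store.findStatus(name))) {
                toRun.add(name);
            }
        }
        return toRun;
    }

    /**
     * Returns the same partitions, but fetches all statuses up front.
     * Only safe if nothing else creates partitions for this job instance meanwhile.
     */
    static List<String> splitInBulk(FakeStepExecutionStore store, List<String> names) {
        Map<String, String> statuses = store.findAllStatuses();
        List<String> toRun = new ArrayList<>();
        for (String name : names) {
            if (!"COMPLETED".equals(statuses.get(name))) {
                toRun.add(name);
            }
        }
        return toRun;
    }

    public static void main(String[] args) {
        List<String> names = partitionNames("step", 1000);

        FakeStepExecutionStore perStep = new FakeStepExecutionStore();
        perStep.save("step:partition0", "COMPLETED");
        splitOneByOne(perStep, names);
        System.out.println("per-partition queries: " + perStep.queryCount);

        FakeStepExecutionStore bulk = new FakeStepExecutionStore();
        bulk.save("step:partition0", "COMPLETED");
        splitInBulk(bulk, names);
        System.out.println("bulk queries: " + bulk.queryCount);
    }
}
```

In this toy model the number of queries grows with the number of partitions in the first variant and stays constant in the second, which is the general shape of the saving described above. The 26x figure comes from the attached benchmarks, not from this sketch.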
Affects: 3.0.3
Attachments:
Referenced from: pull request #716, and commits 62a8f44
Backported to: 4.2.0.RC1