
Performance improvement for partitions with a large number of steps [BATCH-2716] #891

Closed
spring-projects-issues opened this issue May 2, 2018 · 8 comments
Labels
has: backports, in: core, related-to: performance, type: enhancement
Milestone

Comments

@spring-projects-issues
Collaborator

Krishna Bhamidipati opened BATCH-2716 and commented

Each time partitions are created, SimpleStepExecutionSplitter#split queries the Spring Batch tables to check whether the partitioned step is a restart. Originally, each such check scanned the entire table.

The patched approach uses a new method on JobRepository that allows SimpleStepExecutionSplitter to retrieve all the step executions at the time of partitioning. By assuming that no other threads are creating conflicting partitions for the same job instance, the splitting becomes more efficient.
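To illustrate the idea, here is a minimal, self-contained sketch of the difference between looking up each partition separately and fetching all step executions once. It does not use the Spring Batch API: `StepRow`, `Repo`, `lastStatus`, `lastStatuses`, `splitOneByOne`, and `splitBulk` are hypothetical names invented for this example, and the restart rule (skip a partition whose last status is `COMPLETED`) is a deliberate simplification of what the framework actually checks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionRestartLookup {

    /** Hypothetical stand-in for one row of step-execution metadata. */
    static final class StepRow {
        final String stepName;
        final String status;

        StepRow(String stepName, String status) {
            this.stepName = stepName;
            this.status = status;
        }
    }

    /** Hypothetical in-memory repository that counts full scans of its rows. */
    static final class Repo {
        private final List<StepRow> rows = new ArrayList<>();
        int scans = 0;

        void add(String stepName, String status) {
            rows.add(new StepRow(stepName, status));
        }

        /** Per-partition lookup: every call walks the whole list. */
        String lastStatus(String stepName) {
            scans++;
            String status = null;
            for (StepRow row : rows) {
                if (row.stepName.equals(stepName)) {
                    status = row.status; // later rows overwrite earlier ones
                }
            }
            return status;
        }

        /** Bulk lookup: a single pass collects the latest status per step. */
        Map<String, String> lastStatuses() {
            scans++;
            Map<String, String> latest = new HashMap<>();
            for (StepRow row : rows) {
                latest.put(row.stepName, row.status);
            }
            return latest;
        }
    }

    /** Simplified restart rule (an assumption): rerun anything not COMPLETED. */
    static boolean shouldRun(String lastStatus) {
        return !"COMPLETED".equals(lastStatus);
    }

    /** One repository scan per partition name. */
    static List<String> splitOneByOne(Repo repo, List<String> names) {
        List<String> toRun = new ArrayList<>();
        for (String name : names) {
            if (shouldRun(repo.lastStatus(name))) {
                toRun.add(name);
            }
        }
        return toRun;
    }

    /** One repository scan in total, then in-memory map lookups. */
    static List<String> splitBulk(Repo repo, List<String> names) {
        Map<String, String> latest = repo.lastStatuses();
        List<String> toRun = new ArrayList<>();
        for (String name : names) {
            if (shouldRun(latest.get(name))) {
                toRun.add(name);
            }
        }
        return toRun;
    }

    public static void main(String[] args) {
        Repo repo = new Repo();
        repo.add("step:partition0", "COMPLETED");
        repo.add("step:partition1", "FAILED");
        List<String> names = Arrays.asList(
                "step:partition0", "step:partition1", "step:partition2");

        System.out.println("one by one: " + splitOneByOne(repo, names));
        System.out.println("bulk:       " + splitBulk(repo, names));
        System.out.println("total scans: " + repo.scans);
    }
}
```

In this sketch both methods select the same partitions, but the one-by-one variant scans the repository once per partition while the bulk variant scans it once overall. The bulk variant is only safe under the assumption stated above: nothing else adds step executions for the same job instance between the single fetch and the split.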

The performance improvement is easily an order of magnitude better (26x from initial benchmarks, attached) and works well for jobs with > 1000 steps.

Patched source is attached and also available on https://github.com/NaanProphet/spring-batch-large-step-perf-fix


Affects: 3.0.3

Attachments:

Referenced from: pull request #716, and commits 62a8f44

Backported to: 4.2.0.RC1

@spring-projects-issues
Collaborator Author

Krishna Bhamidipati commented

This is my first time creating a bug fix patch for Spring Batch, so any advice and guidance would be most appreciated!

@spring-projects-issues
Collaborator Author

Mahmoud Ben Hassine commented

Krishna Bhamidipati That's awesome! Thank you for the analysis and benchmarks! This issue has been planned for v4.2, for which the theme is "performance ++".

I checked the contents of the attached zip file and there are some good ideas in there. Can you please open a PR on GitHub with those changes? Many thanks upfront.

@spring-projects-issues
Collaborator Author

Krishna Bhamidipati commented

Thanks for reviewing the code, Mahmoud Ben Hassine! I don't see a branch for 4.2.x on GitHub yet (only up to 4.1.x). Should I wait to submit the PR until the branch is created?

@spring-projects-issues
Collaborator Author

Mahmoud Ben Hassine commented

4.2.x is on the master branch, so you can open a PR against it.

Thank you upfront!

@spring-projects-issues
Collaborator Author

Mahmoud Ben Hassine commented

Krishna Bhamidipati We are planning to tackle this issue in the upcoming 4.2.0.RC1. Have you had a chance to open a PR against the master branch with the changes in the zip file?

Looking forward to your feedback.

@spring-projects-issues
Collaborator Author

Krishna Bhamidipati commented

Mahmoud Ben Hassine oh that's great! Oops, forgot to complete my PR, but let me do that now.

@spring-projects-issues
Collaborator Author

Mahmoud Ben Hassine commented

Resolved with 62a8f44


2 participants