HADOOP-18872: [ABFS] [BugFix] Misreporting Retry Count for Sub-sequential and Parallel Operations #6019
Conversation
💔 -1 overall
This message was automatically generated.
Force-pushed dc2f952 to 140c78b
🎊 +1 overall
This message was automatically generated.
Apart from that javadoc nit (some JVMs blow up there), no changes to production code. Test-wise, just make sure files are closed, and filesystems too. Mockito tests are always too complex to review/maintain, so I won't comment there; at a glance they look OK.
private String failureReason;

/**
 * This variable stores the tracing context used for last Rest Operation
Add a `.` so all javadoc versions are happy; some JVMs blow up here.
Taken
spiedStore.setClient(spiedClient);

fs.mkdirs(new Path("/testDir"));
fs.create(new Path("/testDir/file1"));
add .close() or is mockito so involved these are no-ops?
Added fs.close()
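The close-everything point being made in this thread can be sketched without the Hadoop classes. `FileSystemStub` and `StreamStub` below are hypothetical stand-ins for `FileSystem` and the output stream returned by `create()`; the real test would call `fs.create(...).close()` and `fs.close()`:

```java
// Sketch of the review suggestion: tests should close both the stream
// returned by create() and the filesystem itself. The stubs here are
// hypothetical simplifications, not the actual Hadoop types.
public class CloseResourcesSketch {
    // Minimal stand-in for the output stream returned by create().
    static class StreamStub implements AutoCloseable {
        boolean closed;
        @Override public void close() { closed = true; }
    }

    // Minimal stand-in for a Hadoop FileSystem.
    static class FileSystemStub implements AutoCloseable {
        boolean closed;
        StreamStub create(String path) { return new StreamStub(); }
        @Override public void close() { closed = true; }
    }

    public static void main(String[] args) {
        FileSystemStub fs = new FileSystemStub();
        StreamStub out = fs.create("/testDir/file1");
        out.close(); // close the stream create() handed back
        fs.close();  // close the filesystem too, as the reviewer asked
        System.out.println(out.closed && fs.closed); // prints "true"
    }
}
```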
Thanks for the review @steveloughran
Force-pushed 140c78b to 25cc6d9
reviewing
🎊 +1 overall
This message was automatically generated.
Afraid I went and reviewed the whole patch, because GitHub didn't give me much choice, so I added commentary about method names/javadocs in AbfsClientTestUtil. Sorry. And next time, make it easier for both of us by not rebasing, please.
List<FileStatus> fileStatuses = new ArrayList<>();
spiedStore.listStatus(new Path("/"), "", fileStatuses, true, null, spiedTracingContext);

// Assert that 2 paginated ListPath calls were made.
is this really verify 2 calls, or 1? I don't care which, only that the comment is consistent
2 calls were made: one with a continuation token and one without.
Updated the comments to remove confusion.
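The two-call pattern discussed in this thread can be sketched without Mockito. `ListClientStub` and its `listPath` method are hypothetical stand-ins for the ABFS client, assuming the real code loops until the server stops returning a continuation token:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the paginated listing under review: the first listPath call is
// made without a continuation token, the server returns one, and a second
// call is made with it. ListClientStub is a hypothetical simplification,
// not the actual AbfsClient.
public class PaginatedListSketch {
    static class ListClientStub {
        int calls;
        // Returns a continuation token on the first call, null afterwards.
        String listPath(String path, String continuation, List<String> out) {
            calls++;
            out.add("entry-" + calls);
            return (continuation == null) ? "token-1" : null;
        }
    }

    public static void main(String[] args) {
        ListClientStub client = new ListClientStub();
        List<String> results = new ArrayList<>();
        String token = null;
        do {
            token = client.listPath("/", token, results);
        } while (token != null);
        // Two calls in total: one without and one with a continuation token.
        System.out.println(client.calls);   // prints "2"
        System.out.println(results.size()); // prints "2"
    }
}
```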
...ls/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientTestUtil.java (outdated review comments, resolved)
Hi @steveloughran, I will keep this in mind for all my future PRs. Kindly requesting you to review the PR this time.
🎊 +1 overall
This message was automatically generated.
LGTM
+1
merged! Anuj - can you do a branch-3.3 backport and retest? Mukund has been getting his signing setup for a 3.3.x release.
…tial and Parallel Operations (apache#6019) Contributed by Anuj Modi
Jira Ticket: https://issues.apache.org/jira/browse/HADOOP-18872
Description of PR
There was a bug identified where the retry count in the client correlation id was wrongly reported for sub-sequential and parallel operations triggered by a single file system call. This was due to reusing the same tracing context for all such calls.
We create a new tracing context as soon as an HDFS call comes in, and we keep passing that same TC for all the client calls.
For instance, when we get a createFile call, we first make metadata operations. If those metadata operations succeed after a few retries, the tracing context will carry that retry count. When the actual create call is made, the same retry count is used to construct the headers (clientCorrelationId). Although the create operation never failed, we will still see the retry count from the previous request.
The fix is to use a new tracing context object for each network call made. All the sub-sequential and parallel operations will share the same primary request id to correlate them, yet each will track its own retry count.
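The fix described above can be sketched with a minimal copy-constructor pattern. `TracingContextSketch` is a hypothetical simplification of the real `TracingContext`, assuming only that the actual class carries a `primaryRequestId` for correlation and a per-operation `retryCount`:

```java
// Sketch of the fix: each network operation gets its own tracing context,
// copied from an existing one so the primaryRequestId still correlates the
// operations, while retryCount starts fresh at zero. This class is a
// hypothetical stand-in, not the production TracingContext.
public class TracingContextSketch {
    final String primaryRequestId;
    int retryCount;

    TracingContextSketch(String primaryRequestId) {
        this.primaryRequestId = primaryRequestId;
    }

    // Copy constructor: shares the correlation id, resets the retry count.
    TracingContextSketch(TracingContextSketch other) {
        this.primaryRequestId = other.primaryRequestId;
        this.retryCount = 0;
    }

    public static void main(String[] args) {
        TracingContextSketch fsCall = new TracingContextSketch("req-1");

        // A metadata operation retries twice before succeeding.
        TracingContextSketch metadataOp = new TracingContextSketch(fsCall);
        metadataOp.retryCount = 2;

        // The actual create call gets a fresh copy: same primary request
        // id, but it no longer inherits the metadata operation's retries.
        TracingContextSketch createOp = new TracingContextSketch(metadataOp);
        System.out.println(createOp.primaryRequestId); // prints "req-1"
        System.out.println(createOp.retryCount);       // prints "0"
    }
}
```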
How was this patch tested?
For code changes:
LICENSE, LICENSE-binary, NOTICE-binary files?