@anujmodi2021 anujmodi2021 commented Sep 4, 2023

Jira Ticket: https://issues.apache.org/jira/browse/HADOOP-18872

Description of PR

A bug was identified where the retry count in the client correlation ID was wrongly reported for sub-sequential and parallel operations triggered by a single file system call. This was caused by reusing the same tracing context for all such calls.
We create a new tracing context (TC) as soon as an HDFS call comes in, and we keep passing that same TC to all the client calls.

For instance, when we get a createFile call, we first call metadata operations. If those metadata operations succeed only after a few retries, the tracing context will carry that retry count. When the actual create call is then made, the same retry count is used to construct the headers (clientCorrelationId). Although the create operation never failed, we still see the retry count from the previous request.

The fix is to use a new tracing context object for each network call. All the sub-sequential and parallel operations share the same primary request ID so that they can be correlated, yet each tracks its own retry count.
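
For illustration, here is a minimal sketch of the difference between sharing one tracing context and giving each network call its own. The `TracingContext` class below is a simplified, hypothetical stand-in written for this example (it is not the ABFS class), and `execute`, the `primary-1` ID, and the `id:retryCount` header format are assumptions made only to keep the sketch self-contained.

```java
public class TracingContextSketch {

    /** Hypothetical, simplified stand-in for a tracing context. */
    static final class TracingContext {
        private final String primaryRequestId;
        private int retryCount;

        TracingContext(String primaryRequestId) {
            this.primaryRequestId = primaryRequestId;
        }

        /** Copy: keeps the primary request ID, starts with a retry count of zero. */
        TracingContext(TracingContext other) {
            this(other.primaryRequestId);
        }

        void incrementRetryCount() {
            retryCount++;
        }

        /** Assumed header format for this sketch: "<primaryRequestId>:<retryCount>". */
        String header() {
            return primaryRequestId + ":" + retryCount;
        }
    }

    /** Simulates one network call that succeeds after the given number of retries. */
    static String execute(TracingContext tc, int retriesBeforeSuccess) {
        for (int i = 0; i < retriesBeforeSuccess; i++) {
            tc.incrementRetryCount();
        }
        return tc.header();
    }

    public static void main(String[] args) {
        // Shared context: the create call inherits the metadata call's retries.
        TracingContext shared = new TracingContext("primary-1");
        execute(shared, 2);                      // metadata call, retried twice
        System.out.println(execute(shared, 0));  // prints "primary-1:2"

        // One context per call: same primary ID, independent retry counts.
        TracingContext parent = new TracingContext("primary-1");
        System.out.println(execute(new TracingContext(parent), 2)); // prints "primary-1:2"
        System.out.println(execute(new TracingContext(parent), 0)); // prints "primary-1:0"
    }
}
```

In the second half, both calls can still be correlated through the shared primary request ID, while the create call reports zero retries because it never failed.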

How was this patch tested?

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@anujmodi2021 anujmodi2021 changed the title from "Hadoop 18872 tracing fix" to "HADOOP-18872: [ABFS] [BugFix] Misreporting Retry Count for Sub-sequential and Parallel Operations" Sep 4, 2023
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 40s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 7 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 43m 37s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 38s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 36s trunk passed
+1 💚 mvnsite 0m 44s trunk passed
+1 💚 javadoc 0m 43s trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 37s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 11s trunk passed
+1 💚 shadedclient 33m 34s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 32s the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 32s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 0m 29s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 22s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 8 new + 4 unchanged - 0 fixed = 12 total (was 4)
+1 💚 mvnsite 0m 30s the patch passed
-1 ❌ javadoc 0m 26s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04.txt hadoop-tools_hadoop-azure-jdkUbuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 generated 2 new + 15 unchanged - 0 fixed = 17 total (was 15)
-1 ❌ javadoc 0m 27s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_382-8u382-ga-1~20.04.1-b05.txt hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_382-8u382-ga-1~20.04.1-b05 with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 generated 2 new + 15 unchanged - 0 fixed = 17 total (was 15)
+1 💚 spotbugs 1m 4s the patch passed
+1 💚 shadedclient 34m 8s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 17s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 41s The patch does not generate ASF License warnings.
128m 24s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/1/artifact/out/Dockerfile
GITHUB PR #6019
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux fd343a23cc0b 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / dc2f952
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/1/testReport/
Max. process+thread count 735 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@anujmodi2021 anujmodi2021 force-pushed the HADOOP-18872-tracingFix branch from dc2f952 to 140c78b September 27, 2023 04:59
@anujmodi2021 anujmodi2021 marked this pull request as ready for review September 27, 2023 05:00
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 38s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 7 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 44m 58s trunk passed
+1 💚 compile 0m 43s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 40s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 36s trunk passed
+1 💚 mvnsite 0m 44s trunk passed
+1 💚 javadoc 0m 42s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 39s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 9s trunk passed
+1 💚 shadedclient 33m 35s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 30s the patch passed
+1 💚 compile 0m 32s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 32s the patch passed
+1 💚 compile 0m 28s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 0m 28s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 21s the patch passed
+1 💚 mvnsite 0m 32s the patch passed
+1 💚 javadoc 0m 29s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 27s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 4s the patch passed
+1 💚 shadedclient 34m 7s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 17s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
130m 11s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/2/artifact/out/Dockerfile
GITHUB PR #6019
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 30c41dd1da6b 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 140c78b
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/2/testReport/
Max. process+thread count 703 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@steveloughran steveloughran left a comment

Apart from that javadoc nit (some JVMs blow up there), no changes to production code. Test-wise, just make sure files are closed, and filesystems too. Mockito tests are always too complex to review/maintain, so I won't comment there; at a glance they look OK.

private String failureReason;

/**
* This variable stores the tracing context used for last Rest Operation
Contributor

add a . so all javadoc versions are happy. some JVMs blow up here

Contributor Author

Taken

spiedStore.setClient(spiedClient);

fs.mkdirs(new Path("/testDir"));
fs.create(new Path("/testDir/file1"));
Contributor

add .close() or is mockito so involved these are no-ops?

Contributor Author

Added fs.close()

@anujmodi2021
Contributor Author

Thanks for the review @steveloughran
Added fs.close() wherever applicable or enclosed FS creation inside a try.
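
As a rough sketch of the "enclose FS creation inside a try" pattern mentioned above: try-with-resources closes both the stream returned by `create` and the file system itself, even if the test body throws. `FakeFileSystem` and `FakeStream` are hypothetical stand-ins invented for this example so that it runs without Hadoop on the classpath; they only record the order of events.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

public class CloseResourcesSketch {

    /** Hypothetical stand-in for an output stream returned by create(). */
    static final class FakeStream implements Closeable {
        private final List<String> events;

        FakeStream(List<String> events) {
            this.events = events;
        }

        @Override
        public void close() {
            events.add("stream closed");
        }
    }

    /** Hypothetical stand-in for a file system; it records what happens to it. */
    static final class FakeFileSystem implements Closeable {
        private final List<String> events;

        FakeFileSystem(List<String> events) {
            this.events = events;
        }

        void mkdirs(String path) {
            events.add("mkdirs " + path);
        }

        FakeStream create(String path) {
            events.add("create " + path);
            return new FakeStream(events);
        }

        @Override
        public void close() {
            events.add("fs closed");
        }
    }

    /** Test body: resources are closed in reverse order of creation. */
    static List<String> runTestBody() {
        List<String> events = new ArrayList<>();
        try (FakeFileSystem fs = new FakeFileSystem(events)) {
            fs.mkdirs("/testDir");
            try (FakeStream out = fs.create("/testDir/file1")) {
                events.add("test body ran");
            }
        }
        return events;
    }

    public static void main(String[] args) {
        // prints [mkdirs /testDir, create /testDir/file1, test body ran, stream closed, fs closed]
        System.out.println(runTestBody());
    }
}
```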

@steveloughran
Contributor

reviewing
@anujmodi2021 can you avoid doing rebase/forced push when we are in those final review stages; as long as there are no merge problems it helps reviewers as we can easily review changes since their last review. thanks

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 25s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 7 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 43s trunk passed
+1 💚 compile 0m 30s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 27s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 26s trunk passed
+1 💚 mvnsite 0m 32s trunk passed
+1 💚 javadoc 0m 31s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 28s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 0m 48s trunk passed
+1 💚 shadedclient 20m 3s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 21s the patch passed
+1 💚 compile 0m 22s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 22s the patch passed
+1 💚 compile 0m 20s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 0m 20s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 15s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 1 new + 4 unchanged - 0 fixed = 5 total (was 4)
+1 💚 mvnsite 0m 22s the patch passed
+1 💚 javadoc 0m 20s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 19s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 0m 42s the patch passed
+1 💚 shadedclient 19m 54s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 53s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 28s The patch does not generate ASF License warnings.
85m 38s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/3/artifact/out/Dockerfile
GITHUB PR #6019
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 827c40174939 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 25cc6d9
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/3/testReport/
Max. process+thread count 557 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@steveloughran steveloughran left a comment

afraid I went and reviewed the whole patch, because github didn't give me much choice. so added commentary about method names/javadocs in AbfsClientTestUtil. sorry. and next time, make it easier for both of us by not rebasing, please

List<FileStatus> fileStatuses = new ArrayList<>();
spiedStore.listStatus(new Path("/"), "", fileStatuses, true, null, spiedTracingContext);

// Assert that there were 2 paginated ListPath calls were made.
Contributor

does this really verify 2 calls, or 1? I don't care which, only that the comment is consistent

Contributor Author

2 calls were made: one with a continuation token and one without.

Contributor Author

Updated the comments to remove confusion.
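
To make the "2 calls" point concrete, here is a small sketch of a paginated listing loop, assuming a server that returns a continuation token for every page except the last. `FakeClient`, `Page`, `listPath`, and the `token-1` value are hypothetical names made up for this example and are not the ABFS client API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PaginatedListSketch {

    /** One page of results plus the token for the next page (null on the last page). */
    static final class Page {
        final List<String> entries;
        final String continuationToken;

        Page(List<String> entries, String continuationToken) {
            this.entries = entries;
            this.continuationToken = continuationToken;
        }
    }

    /** Hypothetical client that serves exactly two pages and records the tokens it receives. */
    static final class FakeClient {
        final List<String> tokensSeen = new ArrayList<>();

        Page listPath(String continuationToken) {
            tokensSeen.add(continuationToken);
            if (continuationToken == null) {
                return new Page(Arrays.asList("a", "b"), "token-1");
            }
            return new Page(Arrays.asList("c"), null);
        }
    }

    /** Keeps calling listPath until the server stops returning a continuation token. */
    static List<String> listAll(FakeClient client) {
        List<String> all = new ArrayList<>();
        String token = null;
        do {
            Page page = client.listPath(token);
            all.addAll(page.entries);
            token = page.continuationToken;
        } while (token != null);
        return all;
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        System.out.println(listAll(client));     // prints [a, b, c]
        System.out.println(client.tokensSeen);   // prints [null, token-1]
    }
}
```

The first call carries no continuation token and the second carries the token returned by the first, which gives two calls for a two-page listing.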

@anujmodi2021
Contributor Author

Hi @steveloughran
Really sorry for making it difficult for you. There were some merge conflicts that I wanted to resolve.
I later learned that instead of using a git rebase, I should have used git merge.

I will keep this in mind for all my future PRs. Kindly request you to please take the effort this time and review the PRs.
Apologies again.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 27s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 7 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 31m 44s trunk passed
+1 💚 compile 0m 30s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 27s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 25s trunk passed
+1 💚 mvnsite 0m 31s trunk passed
+1 💚 javadoc 0m 31s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 28s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 0m 48s trunk passed
+1 💚 shadedclient 20m 25s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 21s the patch passed
+1 💚 compile 0m 22s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 22s the patch passed
+1 💚 compile 0m 20s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 0m 20s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 15s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 1 new + 4 unchanged - 0 fixed = 5 total (was 4)
+1 💚 mvnsite 0m 22s the patch passed
+1 💚 javadoc 0m 20s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 19s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 0m 42s the patch passed
+1 💚 shadedclient 21m 17s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 54s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 29s The patch does not generate ASF License warnings.
85m 46s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/4/artifact/out/Dockerfile
GITHUB PR #6019
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux fbb3fdc4b378 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 213cdaa
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/4/testReport/
Max. process+thread count 573 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6019/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor

"Really sorry for making it difficult for you."
no worries - I need to apologise for not giving the abfs code enough attention, either in reviews or my own work

Contributor

@steveloughran steveloughran left a comment

LGTM
+1

@steveloughran steveloughran merged commit 000a39b into apache:trunk Nov 13, 2023
@steveloughran
Contributor

merged! Anuj -can you do a branch-3.3 backport and retest...mukund has been getting his signing setup for a 3.3.x release

jiajunmao pushed a commit to jiajunmao/hadoop-MLEC that referenced this pull request Feb 6, 2024
…tial and Parallel Operations (apache#6019)


Contributed by Anuj Modi
@anujmodi2021 anujmodi2021 deleted the HADOOP-18872-tracingFix branch March 1, 2024 08:26