
@tomscut tomscut commented Aug 3, 2024

Description of PR

JIRA: HDFS-16557.

We set the EC policy to (6+3), and some nodes were in the ENTERING_MAINTENANCE state.
When we moved the data of some directories from SSD to HDD, some block moves failed because the disk was full, as shown in the figure below (blk_-9223372033441574269).
We tried the move again and got the error "Replica does not exist".
The fsck output shows that the mover used the wrong block ID (blk_-9223372033441574270) when moving the block.

Mover Logs:
image

FSCK Info:
image

Root Cause:
This is similar to HDFS-16333. When the mover is initialized, only LIVE nodes are processed. As a result, a datanode in the ENTERING_MAINTENANCE state is filtered out of the locations when DBlockStriped is initialized, but the indices are not adapted, so the locations and the indices end up with different lengths. The EC block then computes the wrong block ID when getting the internal block (see DBlockStriped#getInternalBlock).
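To make the mismatch concrete, here is a minimal, self-contained sketch. It is not the Hadoop code: `internalBlockId`, the node names `dn0`..`dn3`, and the choice of `dn1` as the filtered node are hypothetical. It assumes the internal block ID is the block group ID plus the index in the group, and that the group ID is -9223372033441574272 (inferred from the two block IDs in the description).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StripedIndexMismatch {
    // Simplified model of the lookup: find the node's position in the
    // locations list, read the index stored at that position, and add it
    // to the block group ID.
    static long internalBlockId(long groupId, List<String> locations,
                                byte[] indices, String node) {
        int idxInLocs = locations.indexOf(node);
        if (idxInLocs < 0) {
            throw new IllegalArgumentException("unknown location: " + node);
        }
        return groupId + indices[idxInLocs];
    }

    public static void main(String[] args) {
        long groupId = -9223372033441574272L; // assumed block group ID
        List<String> reported = Arrays.asList("dn0", "dn1", "dn2", "dn3");
        byte[] indices = {0, 1, 2, 3};

        // Hypothetical: dn1 is ENTERING_MAINTENANCE, so it is dropped from
        // the locations, while the indices array is left untouched.
        List<String> filtered = new ArrayList<>(reported);
        filtered.remove("dn1");

        // Aligned lists: dn3 maps to index 3.
        System.out.println(internalBlockId(groupId, reported, indices, "dn3"));
        // Misaligned lists: dn3 is now at position 2, so index 2 is used.
        System.out.println(internalBlockId(groupId, filtered, indices, "dn3"));
    }
}
```

Under these assumptions the first call yields -9223372033441574269 and the second yields -9223372033441574270, which matches the pair of block IDs seen in the mover log and the fsck output.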

Solution:
When initializing DBlockStriped, if any location is filtered out, we also remove the corresponding element from the indices so that locations and indices stay aligned.
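The idea can be sketched as follows. This is an illustration only, not the patch itself: the class `StripedIndexAdaptation`, the holder `Filtered`, the helper `keepLive`, and the node names are all hypothetical, and locations are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StripedIndexAdaptation {
    /** Locations and indices that were filtered together. */
    static final class Filtered {
        final List<String> locations;
        final byte[] indices;

        Filtered(List<String> locations, byte[] indices) {
            this.locations = locations;
            this.indices = indices;
        }
    }

    // Keep only the live locations and, for each kept location, keep the
    // index stored at the same position, so both stay the same length.
    static Filtered keepLive(List<String> reported, byte[] indices,
                             Set<String> live) {
        if (reported.size() != indices.length) {
            throw new IllegalArgumentException("locations/indices length mismatch");
        }
        List<String> locs = new ArrayList<>();
        byte[] adapted = new byte[indices.length];
        int kept = 0;
        for (int i = 0; i < reported.size(); i++) {
            if (live.contains(reported.get(i))) {
                locs.add(reported.get(i));
                adapted[kept++] = indices[i];
            }
        }
        return new Filtered(locs, Arrays.copyOf(adapted, kept));
    }

    public static void main(String[] args) {
        List<String> reported = Arrays.asList("dn0", "dn1", "dn2", "dn3");
        byte[] indices = {0, 1, 2, 3};
        // Hypothetical: dn1 is ENTERING_MAINTENANCE and therefore not live.
        Set<String> live = new HashSet<>(Arrays.asList("dn0", "dn2", "dn3"));

        Filtered f = keepLive(reported, indices, live);
        System.out.println(f.locations + " " + Arrays.toString(f.indices));
    }
}
```

With `dn1` removed, the kept locations are `dn0`, `dn2`, `dn3` and the adapted indices are 0, 2, 3, so a lookup for `dn3` reads index 3 again.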

How was this patch tested?

The unit test passes.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 17m 26s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 1s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 48m 58s | | trunk passed |
| +1 💚 | compile | 1m 24s | | trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | checkstyle | 1m 13s | | trunk passed |
| +1 💚 | mvnsite | 1m 22s | | trunk passed |
| +1 💚 | javadoc | 1m 9s | | trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | javadoc | 1m 45s | | trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | spotbugs | 3m 20s | | trunk passed |
| +1 💚 | shadedclient | 40m 56s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 1m 11s | | the patch passed |
| +1 💚 | compile | 1m 16s | | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | javac | 1m 16s | | the patch passed |
| +1 💚 | compile | 1m 9s | | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | javac | 1m 9s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 1m 1s | /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 80 unchanged - 0 fixed = 83 total (was 80) |
| +1 💚 | mvnsite | 1m 14s | | the patch passed |
| +1 💚 | javadoc | 0m 55s | | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | javadoc | 1m 32s | | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | spotbugs | 3m 18s | | the patch passed |
| +1 💚 | shadedclient | 41m 12s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 253m 15s | /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in the patch passed. |
| +1 💚 | asflicense | 0m 48s | | The patch does not generate ASF License warnings. |
| | | 423m 23s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.TestRollingUpgrade |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6979/1/artifact/out/Dockerfile |
| GITHUB PR | #6979 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 7876e0c54b9e 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 7ade5be |
| Default Java | Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6979/1/testReport/ |
| Max. process+thread count | 3046 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6979/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
