HDFS-16432. Namenode block report add yield to avoid holding write lo… #3907
base: trunk
💔 -1 overall
This message was automatically generated.
0561a74
to
09824a5
@tasanuma @Hexiaoqiao Please take a look.
09824a5
to
2367054
Thanks for opening a PR. As for the time block reports take: is it still a problem after removing the folded tree set structure (HDFS-13671)?
Thanks @jojochuang for your comment. The block report times we reported in this PR were measured after merging HDFS-13671. I think block reports take longer after merging HDFS-13671 because FoldedTreeSet performs better when handling full block reports (FBR).
I am also concerned about yielding the lock in the middle of processing a storage. I would like to understand where the time is spent during block report processing inside the namenode, to see if we can improve the performance of the code and avoid yielding the lock. Have you tried reproducing this problem on a small, otherwise idle cluster and attaching the async profiler to the namenode process to see where the time goes during block reporting? I think that analysis would be valuable before we change the locking, as there may be something in the performance that can be improved.
Hi @sodonnel, I reproduced this problem on a test cluster, and the CPU profile of block report processing is as follows. Most of the time is spent in #processReportedBlock and #moveBlockToHead, and #moveBlockToHead is necessary now that the folded tree set structure has been removed.
@liubingxing Could you please attach the flame chart SVG file to the Jira so I can look at it in more detail?
@sodonnel I attached the flame chart SVG file to the Jira.
@liubingxing Sorry to interrupt. Have you tried this patch in your production clusters?
JIRA: HDFS-16432
In our cluster, namenode block report processing holds the write lock for a long time when a storage has more than 100000 blocks. We therefore want to add a yield mechanism to the block report processing so that the write lock is not held for too long.
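
To illustrate the idea only, here is a minimal standalone sketch of a yielding loop. It is not the code in this patch: the class `YieldingReportProcessor`, the `blocksPerLockHold` parameter, and the `perBlockWork` callback are all hypothetical names, and a plain `ReentrantReadWriteLock` stands in for the namesystem lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongConsumer;

/** Hypothetical sketch; none of these names come from the HDFS code base. */
public class YieldingReportProcessor {
  // A fair lock, so a waiting thread gets a turn once the writer lets go.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
  // Assumed tunable: how many blocks to process before yielding the lock.
  private final int blocksPerLockHold;

  public YieldingReportProcessor(int blocksPerLockHold) {
    if (blocksPerLockHold <= 0) {
      throw new IllegalArgumentException("blocksPerLockHold must be positive");
    }
    this.blocksPerLockHold = blocksPerLockHold;
  }

  /**
   * Processes every reported block under the write lock, releasing and
   * reacquiring the lock after each batch of blocksPerLockHold blocks.
   *
   * @return the number of times the write lock was yielded
   */
  public int processReport(long[] reportedBlockIds, LongConsumer perBlockWork) {
    int processed = 0;
    int yields = 0;
    lock.writeLock().lock();
    try {
      for (long blockId : reportedBlockIds) {
        perBlockWork.accept(blockId);
        processed++;
        boolean batchDone = processed % blocksPerLockHold == 0;
        boolean moreToDo = processed < reportedBlockIds.length;
        if (batchDone && moreToDo) {
          // Yield: let queued readers and writers run, then take the lock back.
          lock.writeLock().unlock();
          lock.writeLock().lock();
          yields++;
        }
      }
    } finally {
      lock.writeLock().unlock();
    }
    return yields;
  }
}
```

With a batch size of 1000, a report of 2500 blocks yields the lock twice, after blocks 1000 and 2000. Other threads can change state while the lock is released, so any real implementation has to tolerate changes between batches; that is the concern raised above about yielding in the middle of processing a storage.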