HADOOP-18094. Disable S3A auditing by default. #3916
This patch does not fix the problem, but it allows for auditing to be disabled such that no ThreadLocal fields are created. By disabling auditing by default, memory leaks will not surface. Auditing to S3 Logs can be enabled if required; short-lived applications and/or applications which use the same limited set of S3A instances should be safe to use with auditing. Change-Id: I9090d65c896a12feb6826e581935078ffb737bab
Turning off logging by default is the fastest way to address this issue while I work on something better: a thread ID map with weak references to spans, plus a test to verify that the memory problems are avoided.
Running the full test suite with the default settings (i.e. auditing is disabled):
Two ARN failures from the SDK update (noted on https://issues.apache.org/jira/browse/HADOOP-18085); other test failures are due to audit events not being collected. I will address those by enabling auditing for those tests.
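For readers following along, here is a rough sketch of what "a thread ID map with weak references" could look like. It is not the code from the follow-up patch: the class name `WeakThreadMap` and all of its methods are hypothetical and use only the JDK. The idea is that the map holds each per-thread value through a `WeakReference`, so a long-lived thread does not by itself keep the value reachable.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical sketch: per-thread values keyed by thread ID and held
 * through weak references, as an alternative to a ThreadLocal field.
 */
final class WeakThreadMap<V> {

  /** Thread ID to weakly referenced value. */
  private final ConcurrentMap<Long, WeakReference<V>> refs =
      new ConcurrentHashMap<>();

  /** Bind a value to the calling thread. */
  public void setForCurrentThread(V value) {
    refs.put(Thread.currentThread().getId(), new WeakReference<>(value));
  }

  /** Return the calling thread's value, or null if unset or collected. */
  public V getForCurrentThread() {
    WeakReference<V> ref = refs.get(Thread.currentThread().getId());
    return ref == null ? null : ref.get();
  }

  /** Drop entries whose values have been garbage collected. */
  public int prune() {
    int removed = 0;
    for (Map.Entry<Long, WeakReference<V>> e : refs.entrySet()) {
      // remove(key, value) only removes if the entry was not replaced meanwhile
      if (e.getValue().get() == null && refs.remove(e.getKey(), e.getValue())) {
        removed++;
      }
    }
    return removed;
  }

  /** Number of entries, including any not yet pruned. */
  public int size() {
    return refs.size();
  }
}
```

With this shape, something else (for example the owning filesystem instance) has to hold the strong reference to each value; once that owner is gone, the entry can be collected and `prune()` clears the stale key.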
Change-Id: I93e2211d4070f4222004a58b46f841360f74e554
btw, that was against s3 london with |
next run, same settings. only the arn tests failed
Would you remove the Change-Id from the commit message? (The removal can be done when squash-merging the commits.)
+1, seems a good interim fix/ability to disable the auditing. But, shouldn't the title for this PR be to add a config to disable auditing rather than Memory leakage since that would be fixed in the next PR?
Building the tests, I'm seeing a config issue that I'll resolve in some time. (It seems a new aws-java-sdk was added; for some reason it's not resolving for me.)
It provides two forms of logging

1. Logging of operations in the client via Log4J.
1. Logging of operations in the client via the active SLF4J imolementation.
nit: typo: "implementation"
### Disabling Auditing with the No-op Auditor
### Disabling Auditing.
Just a mention that servicename=NoopAuditor also disables auditing.
it doesn't though. it still leaks memory, as the service manager is still instantiated, which is why i've cut it
ah yes, you're right. Just realized, ActiveAuditManager with NoopAuditor would still leak, but NoopAuditManager is the one without the ThreadLocal variable now. Thanks for clearing it 👍
adding to the architecture as an FYI point
…ture doc Add an explanation there of why the memory is leaked. Change-Id: I65f853fb0b588a0b27e11b6b1291954cf52fb362
Looks good changes wise. Just one clarification.
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/auditing_architecture.md
🎊 +1 overall
This message was automatically generated.
LGTM +1
thx. I'm going to create a new JIRA for this, "add option to disable s3a auditing", and commit this under that, with the original one left open.
See HADOOP-18091. S3A auditing leaks memory through ThreadLocal references
Adds a new option fs.s3a.audit.enabled to control whether or not auditing
is enabled. This is false by default.
When false, the S3A auditing manager is NoopAuditManagerS3A,
which was formerly only used for unit tests and
during filesystem initialization.
When true, ActiveAuditManagerS3A is used for managing auditing,
allowing auditing events to be reported.
Updates documentation and tests.
This patch does not fix the underlying leak. When auditing is enabled,
long-lived threads will retain references to the audit managers
of S3A filesystem instances which have already been closed.
Contributed by Steve Loughran.
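
The selection logic described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `AuditManager`, `NoopManager`, `ActiveManager` and `AuditManagerFactory` are hypothetical stand-ins for the real S3A classes, and a plain `Map` stands in for the Hadoop configuration. Only the option name `fs.s3a.audit.enabled` and its `false` default come from the text above.

```java
import java.util.Map;

/** Hypothetical stand-in for the S3A audit manager interface. */
interface AuditManager {
  boolean isActive();
}

/** Stand-in for NoopAuditManagerS3A: holds no ThreadLocal field. */
final class NoopManager implements AuditManager {
  @Override
  public boolean isActive() {
    return false;
  }
}

/** Stand-in for ActiveAuditManagerS3A: tracks a span per thread. */
final class ActiveManager implements AuditManager {
  // The per-thread state is what long-lived threads can retain
  // after the owning filesystem instance has been closed.
  private final ThreadLocal<String> activeSpan =
      ThreadLocal.withInitial(() -> "unbonded");

  @Override
  public boolean isActive() {
    return activeSpan.get() != null;
  }
}

/** Chooses the manager from a configuration map (assumed shape). */
final class AuditManagerFactory {
  static final String AUDIT_ENABLED = "fs.s3a.audit.enabled";
  static final boolean AUDIT_ENABLED_DEFAULT = false;

  private AuditManagerFactory() {
  }

  static AuditManager create(Map<String, String> conf) {
    String raw = conf.getOrDefault(
        AUDIT_ENABLED, String.valueOf(AUDIT_ENABLED_DEFAULT));
    // Anything other than "true" (case-insensitive) leaves auditing off.
    return Boolean.parseBoolean(raw) ? new ActiveManager() : new NoopManager();
  }
}
```

Calling `AuditManagerFactory.create(Map.of())` returns the no-op manager, so no `ThreadLocal` is ever created unless the option is explicitly set to `true`.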