Skip to content

The aws_lambda_powertools Logger breaks Python's standard library logging functionality. #194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
techdragon opened this issue Oct 16, 2020 · 6 comments
Labels
bug Something isn't working

Comments

@techdragon
Copy link

techdragon commented Oct 16, 2020

Enabling normal logging from dependent python libraries that use the standard library logging mechanism does not work.

Expected Behavior

Setting the log level of a package you are using in your code, should produce the appropriate log outputs.
For example:

import logging
logging.getLogger('boto3').setLevel(logging.DEBUG)

should result in the boto3 library producing logs.

Current Behavior

This does nothing when using the aws_lambda_powertools due to the Logger removing the standard library root logging handler.

Possible Solution

  1. There are existing libraries for building complex, customisable and powerful logging workflows, with both JSON formatting and context injection on top of the standard library logging. Two prominent examples are:

  2. The combination of heitorlessa@6b8be5d
    and the solution to Get the logger to respect module hierarchy as a normal logger #97 currently only allows for the aws_lambda_powertools Logger to handle logging from code "in" the lambda or written directly for use in a lambda. It silences all standard library logging.

  3. @heitorlessa stated in Get the logger to respect module hierarchy as a normal logger #97 that you wanted to maintain backwards compatibility

If you have ideas on how we could comply with hierarchy in logging while maintaining backwards compatibility in the UX,

then went on to suggest that if this cannot be done

we can solve this problem differently using something like Borg to share configuration between instances, while additional instances can have additional or override existing configuration

I've done a fair amount of logging code in the past ( a public example here ) and in light of points 1,2 and 3 I personally would cut my losses, follow semantic versioning, bump the major version number and replace the entire logging functionality with something new. Python logging has established patterns and while to quote @heitorlessa "the UX we wanted to provide" may have compelling business or developer experience reasoning behind it... your chosen approach appears to be at odds with the way Python's standard library logging works.

Other than "not breaking backwards compatibility"... Are you able to more thoroughly explain the constraints you used to decide on the current logging approach?

Since I'm looking at writing my own replacement for the aws_lambda_powertools Logger since it doesn't support standard library logging and I'd be willing to submit it as a PR, I have two key questions to help shape what I write.

  1. Do you have any restrictions on what dependencies you can accept in this project?
  2. What was the UX you wanted to provide by choosing the current logger design?

Steps to Reproduce (for bugs)

This is only an example using boto3, all dependent libraries using standard library logging will have the same behaviour.

  1. Add code like this before using the boto3 library.
import logging
logging.getLogger('boto3').setLevel(logging.DEBUG)
  1. Try to do something with the boto3 library.

Environment

  • Powertools version used: v1.6.1
  • Packaging format (Layers, PyPi): Both
  • AWS Lambda function runtime: Python 3.8
  • Debugging logs Not applicable here.

How to enable debug mode**

# paste logs here
@techdragon techdragon added bug Something isn't working triage Pending triage from maintainers labels Oct 16, 2020
@heitorlessa
Copy link
Contributor

heitorlessa commented Oct 16, 2020

Hi @techdragon - I truly appreciate your extensive research on this matter. I'll write a more comprehensive reply with references sometime next week, as I'm on PTO this week - I'm replying regardless because I understand this is crucial to you.

logging.getLogger('boto3').setLevel(logging.DEBUG) code expects the Root Logger to be configured, since Python Logging recommends all libraries to setup their loggers with a Null Handler.

For this reason, boto and ourselves provide a mechanism to configure a Logger Handler to not mess with the Root Logger: boto3.set_stream_logger(), or a more verbose mode boto3.set_stream_logger(name="botocore"), since the vast majority of boto logs flow through botocore logger.

As regards to Root Logger - Yes, we do remove it during Logger initialization, as you pointed out, to prevent application duplicate logs due to the log propagation system.

As regards to UX - That's primarily due to UX of allowing child loggers propagate their changes to their parents (logging propagation system). If Root Logger is enabled, app child logger logs will appear twice, as it'll flow through its parent logger, then to the Root Logger.

That said, if there's a mechanism we could use to provide child and parent logger to propagate their changes to one another while not touching Lambda Root Logger I'm all for it, as I might have missed something and am always up to learn something new.

Update: Perhaps a Log Filter in the root logger could solve this

@heitorlessa
Copy link
Contributor

If you have the bandwidth, PR above is ready for review @techdragon with the fix. It will be available early next week, as we prepare to launch the new parser utility too.

Had a demanding week so I couldn't expand on my previous comments - Regardless, this PR adds a logging filter to instruct the root logger to reject any log records propagated from child loggers. This fix is backwards compatible.

Thanks

@heitorlessa heitorlessa added the pending-release Fix or implementation already in dev waiting to be released label Oct 24, 2020
@heitorlessa
Copy link
Contributor

Hi @techdragon - This is now fixed in 1.7.0 release (9 days ago). Allow me to expand in your research for future context.

Use of additional libraries. Not at this stage. Structlog was the most prominent when we were building it given how widely used it is. When drafting our tenets, we ended up removing most of the libraries and leaving only the bare minimum necessary. That said, we'll be investigating the use of implicit namespaces (PEP 420) to allow each utility to bring lightweight libraries - That's when I want to reconsider structlog specifically the pipeline feature.

Silencing all standard library logging. When I implemented the Child logger feature I had misunderstood the end-to-end behaviour of log record propagation - That is, child loggers don't have a handler, so their log records would propagate twice, first to their parent then subsequently to the root logger. In the ideal world, root logger shouldn't be configured but an application logger, including Libraries setting their NullHandler - Except Lambda does configure the Root Logger, and hence why that excerpt you shared works in the Lambda env, just not in EC2 or any other that doesn't purposefully configure the root logger.

The Child Logger UX and root of Root Logger being removed. By default, child loggers (parent.child) have propagation property set to True. This is great because it allows multiple loggers within an application to share the formatting and handler behaviour - hence the UX I wanted to preserve. Within Lambda, customers tend to either stick with a single file (e.g. handler.py) or having logic across multiple files. Challenge they had with multiple files was a) not being able to inherit the formatter and handler from the parent as the relationship wasn't there, and b) propagate keys that were conditionally added in logic outside the entry file.

Both challenges were addressed with the child property. This allowed us to create and fix a relationship with the parent Logger through the dotted naming convention, {service}.{filename} used by std lib, where the parent logger would always be named service. This however caused the challenge you noted since child log records were being propagated to the root logger resulting in duplicated entries emitted by the child logger - I therefore removed the Root logger as I wrongly assumed it wasn't a good practice regardless until your issue was raised... which is now fixed with a Logger filter implemented 9 days ago.

This Logger filter is added to the Root logger, which rejects any log records propagated from Powertools Loggers named after service - This ensures any child Logger however many log records are rejected while keeping the same bi-directional parent<->child logger changes UX I wanted to keep w/o removing the root logger.

Back to the snippet you shared now:

import logging
logging.getLogger('boto3').setLevel(logging.DEBUG)

This will now work in tandem with Powertools since the Root Logger configured by Lambda at bootstrap will continue to have a Stream Handler. If you or anyone reading this issue wants to actually enable debug logs in Boto, then the correct way of doing so is boto3.set_stream_logger(), and boto3.set_stream_logger('botocore') since that will add a handler to boto3 and botocore loggers, and botocore is where many of the logs come from.

Since this is now fixed, I'm now closing this issue - If you however find your questions 1 and 2 aren't answered I'd be happy to expand on them.

Thank you again for raising this issue

@Nr18
Copy link

Nr18 commented Nov 2, 2020

@heitorlessa that was a good read to gain some background on the logger, does it make sense to add the boto3.set_stream_logger() and boto3.set_stream_logger('botocore') you mentioned to the documentation?

@heitorlessa
Copy link
Contributor

@heitorlessa that was a good read to gain some background on the logger, does it make sense to add the boto3.set_stream_logger() and boto3.set_stream_logger('botocore') you mentioned to the documentation?

Hey @Nr18 - Why not? At first I was against, but it's not mentioned in the boto docs either unless you open an issue with them (or search through).

Wanna create a FAQ section in the logger.mdx?

I'd be happy to review and merge it, or I can do that on Friday if you could a dedicated issue for that ;)

@Nr18
Copy link

Nr18 commented Nov 2, 2020

@heitorlessa sure, so I did an initial attempt, see #202 I would love to add some more context to the "Why don't I see my library logs?" section but I don't know the details and choices made without diving in the codebase so hoping on some input from you on that part ;-)

@heitorlessa heitorlessa removed the pending-release Fix or implementation already in dev waiting to be released label Nov 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Development

No branches or pull requests

3 participants