
Get the logger to respect module hierarchy as a normal logger #97


Closed
alexanderluiscampino opened this issue Jul 31, 2020 · 9 comments
Labels
feature-request

Comments

@alexanderluiscampino

alexanderluiscampino commented Jul 31, 2020

This is a feature request to give this logger the same module-hierarchy inheritance found in the standard logging module.

When we configure a normal logger in main.py, subsequent calls to logging.getLogger(__name__) in submodules inherit the handlers and levels set in main.py. This works because the logging module keeps a single registry of loggers, so logging.getLogger(name) always returns the same instance for a given name.

This lets any log message in a submodule have the same characteristics as the parent logger. In this specific case, since this formatter turns everything into nice JSON with the option of adding structured keys, it would be great for those structured keys to persist and be inherited by child logger instances.
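The stdlib behavior being described can be reproduced in a few lines; the names `app` and `app.submodule` below are placeholders standing in for main.py and a submodule:

```python
import logging

# Parent logger, configured once in main.py
parent = logging.getLogger("app")
parent.setLevel(logging.DEBUG)
parent.addHandler(logging.StreamHandler())

# Child logger, as obtained in a submodule via logging.getLogger(__name__)
child = logging.getLogger("app.submodule")

# The child has no handlers or level of its own...
assert child.handlers == [] and child.level == logging.NOTSET
# ...but records propagate upwards, so the parent's handlers and level apply
assert child.parent is parent
assert child.getEffectiveLevel() == logging.DEBUG
child.debug("emitted through the parent's StreamHandler")
```

The dotted-name registry is what makes this work: `getLogger("app.submodule")` is wired up as a child of `getLogger("app")` automatically.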

If I instead create another logger = Logger(), that creates a whole new instance with no relationship to the logger in main.py, so all the structured keys set up in main are lost and have to be recreated. Besides being repetitive, that may not even be possible.

For example, here are two logs, one from the Lambda handler and one from a submodule. The runID key was lost, even though it was added via logger.structure_logs(append=True, runID=run_id):

{"timestamp": "2020-07-31 06:29:32,699", "level": "INFO", "location": "lambda_handler:46", "service": "myService", "sampling_rate": 0.0, "cold_start": true, "function_name": "name", "function_memory_size": "512", "runID": "1b250a6732", "message": "my message on main.py"}
{"timestamp": "2020-07-31 06:29:33,161", "level": "INFO", "location": "start_execution:49", "service": "core.handlers.handler", "sampling_rate": 0.0, "message": "my message on a sub module"}

Ideally, some basic keys would be inherited so that querying the logs afterwards is easy, e.g. fetching all logs where runID = 'abc'.

Re-running structure_logs in every module is not an option, since modules may be agnostic to the concept of a runID.

@alexanderluiscampino added the feature-request and triage labels Jul 31, 2020
@ghost removed the triage label Jul 31, 2020
@heitorlessa
Contributor

heitorlessa commented Aug 1, 2020

Hey @alexanderluiscampino - Thanks for opening a feature request!

Background

As you pointed out, Logger doesn't follow the standard hierarchy because we don't use getLogger under the hood - it's not a singleton per se, as we return different instances.

We subclass logging.Logger and initialize a Logger using the service name. This has the pro of letting us return a simplified Logger with additional methods, but the con, which I discovered recently, that a Logger is only truly registered[1] when obtained via getLogger.

I've tried a few different approaches, like setLoggerClass, an additional property (self.logger) plus a method to return a registered Logger, etc. I've had issues with all of them, either in the UX we wanted to provide or in having to override how logging captures stack frames to correctly identify where a log statement was called.

Ideas

If you have ideas on how we could comply with the logging hierarchy while maintaining backwards compatibility in the UX, I'd love for you to craft an RFC - this is something I want to improve, but I'm focusing on the utility areas atm.

That said, if we can't find a solution that meets this criteria ^, we can solve this problem differently, using something like the Borg pattern to share configuration between instances, while still letting additional instances add to or override existing configuration -- we do this in Tracer today.
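For reference, the Borg idea can be sketched in a few lines; the class and attribute names below are illustrative, not the actual Tracer implementation:

```python
class SharedConfig:
    """Borg pattern: every instance shares the same attribute dict."""
    _shared_state: dict = {}

    def __init__(self, **config):
        # Rebind this instance's __dict__ to the class-wide shared state,
        # then layer any new configuration on top of it.
        self.__dict__ = self._shared_state
        self.__dict__.update(config)

first = SharedConfig(service="payment", runID="1b250a6732")
second = SharedConfig(level="DEBUG")  # adds to, rather than replaces, config

assert second.runID == "1b250a6732"  # inherited from the first instance
assert first.level == "DEBUG"        # later config is visible everywhere
```

Unlike a singleton, each call still returns a distinct object, which preserves the current "multiple instances" UX while keeping configuration shared.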

:)


[1] Logger not being registered in its hierarchy due to not calling getLogger:

from aws_lambda_powertools import Logger
import logging
from pprint import pprint

logger_one = Logger(service="payment")  # <Logger payment (INFO)>
logger_two = Logger(service="payment.two")  # <Logger payment.two (INFO)>

# Print all registered Loggers within logging root manager
# You won't find `payment` nor `payment.two` there
pprint(logging.root.manager.loggerDict)

# As `getLogger` wasn't called it doesn't correctly define/fix children either
pprint(logger_two.parent)

@heitorlessa
Contributor

I just had an idea on how to tackle this while keeping the same UX and respecting hierarchy in a nicer way - It'll involve:

  • Overriding findCaller from the CPython Logging implementation
  • Include a self.logger = getLogger... as well as info, error, debug, etc. methods
  • Capturing upper frames to identify where the log statement was actually executed
  • Create a logger name based on the source frame (à la getLogger(__name__))
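The frame-inspection part of the steps above can be sketched as follows. This is an illustration of the idea, not the Powertools implementation; the class and helper names are made up:

```python
import inspect
import logging

def caller_module_name(depth: int = 2) -> str:
    """Walk up the stack to find the module that issued the call,
    mimicking what logging.getLogger(__name__) gives you for free."""
    frame = inspect.stack()[depth].frame
    return frame.f_globals.get("__name__", "root")

class HierarchyAwareLogger:
    """Wraps a registered stdlib logger named after the calling module."""

    def __init__(self):
        # getLogger registers the logger in logging.root.manager.loggerDict,
        # so parent/child relationships are maintained by the stdlib.
        self.logger = logging.getLogger(caller_module_name())

    def info(self, msg, *args, **kwargs):
        self.logger.info(msg, *args, **kwargs)
```

Because the wrapped logger comes from getLogger, it shows up in loggerDict and participates in the normal hierarchy; the remaining work is making sure the "location" key still points at the real caller rather than the wrapper.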

I'll tinker with that on Thursday or Friday.

Still happy to hear additional ideas here

@alexanderluiscampino
Author

Hey @heitorlessa, I'm on vacation this week, but willing to help however I can next week. I have created a similar version of the logger for my projects that is pretty much what you described above. It's just a class that does not inherit from logging.Logger, allowing the normal behavior to proceed.

I do self.logger = logging.getLogger() and apply the JSON formatter you created. Something of the sort:

class LambdaLogger(LambdaDecorator):
    """
    General AWS Lambda Loggers Utility Tool
    Handles setting a JSON lambda formatter for logging to cloudwatch,
    utilizing lambda root logger.
    JSON format on logs its useful due to its easy indexation capabilities
    via the query language in Cloudwatch Insights.

    This log is meant to be used instead of the normal logging module,
    since it only changes its format and adds methods to update the
    JSON keys outputted.

    Usage example:
        >> lambda_logger = LambdaLogger(service='myService', level='warning', app='myApp')
        >> logger = lambda_logger.get_logger()
        >> @lambda_logger.log_everything
        >> def lambda_handler(event, context):
        >>    lambda_logger.update_attributes(runID='3cnu8930a48')
        >>    logger.info("My First Log")

    """

    def __init__(self, level=None, **kwargs):
        """
        Create instance of LambdaLogger

        Args:
            level ([str, int, logging.Level]): Logging level to apply.
                Falls back to the LOG_LEVEL environment variable, then INFO.
        """
        self.logger = logging.getLogger()
        # Resolve the level before applying it; a hardcoded default would
        # otherwise shadow the LOG_LEVEL environment variable.
        level = level or os.getenv("LOG_LEVEL", "INFO")
        if isinstance(level, str):
            level = level.upper()
        self.logger.setLevel(level)
        self.formatter = LambdaFormatter(**kwargs)
        self.apply_formatter()

    def get_logger(self):
        return self.logger

    @property
    def log_everything(self):
        """
        Decorators for lambda handler to log event trigger, context and return value
        """
        return LogEverything

    def update_attributes(self, **kwargs):
        """
        Updates list of log JSON key attributes
        """
        self.formatter.add_log_attributes(**kwargs)
        self.apply_formatter()

    def apply_formatter(self):
        """
        Applies formatter to all handlers available
        """
        for handler in self.logger.handlers:
            handler.setFormatter(self.formatter)

I am sure there are things I'm missing with this approach in comparison to yours.

@heitorlessa
Contributor

After some research I think I found the solution while keeping backward compatibility, performance, and UX the same.

The challenge I had with implementing this safely was that the stack frame of the caller (whoever issued logger.info/debug) was pointing at the Logger class, not at the caller itself.

Back then there was a lot of code to change to make this work, which I felt was unnecessary, as I didn't yet have any customer feedback to justify it.

Luckily, the 'stacklevel' logging parameter was introduced in CPython 3.8 to solve exactly this problem and keep our "location" key intact -- I just need to backport it to support 3.6 and 3.7:

https://github.com/python/cpython/pull/7424/files
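A minimal illustration of what stacklevel buys you (requires Python 3.8+; the wrapper and function names below are made up for the example):

```python
import logging

logging.basicConfig(format="%(funcName)s:%(lineno)d %(message)s")
log = logging.getLogger("payment")
log.setLevel(logging.INFO)

def info_wrapper(msg):
    # Without stacklevel, %(funcName)s would report "info_wrapper".
    # stacklevel=2 skips this frame, so the record points at
    # info_wrapper's caller instead.
    log.info(msg, stacklevel=2)

def business_code():
    info_wrapper("payment processed")  # location resolves to business_code

business_code()
```

This is why it maps so cleanly onto the problem here: a Logger wrapper can forward to the stdlib while the "location" key keeps pointing at the user's code.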

I'll give this a try on the weekend

Thanks for the patience so far :)

@heitorlessa heitorlessa mentioned this issue Aug 9, 2020
35 tasks
@heitorlessa
Contributor

hey @alexanderluiscampino - When you get a chance, could you please have a look at the PR linked in this issue?

I've managed to find a more elegant solution, with less code, without compromising UX, backwards compatibility, or the dreaded location key.

I'll be writing the docs next if that works for you.

Thanks!

@alexanderluiscampino
Author

@heitorlessa , I'm in the process of reviewing it!

@heitorlessa
Contributor

heitorlessa commented Aug 11, 2020 via email

@heitorlessa heitorlessa mentioned this issue Aug 14, 2020
6 tasks
@heitorlessa
Contributor

@alexanderluiscampino This is now released under 1.1.0 version in PyPi 🥳 🎉

Docs: https://awslabs.github.io/aws-lambda-powertools-python/core/logger/#reusing-logger-across-your-code

Thank you once again for the suggestions and reviewing ;)

@manmartgarc

> @alexanderluiscampino This is now released under 1.1.0 version in PyPi 🥳 🎉
>
> Docs: https://awslabs.github.io/aws-lambda-powertools-python/core/logger/#reusing-logger-across-your-code

Link is not working; here is the fixed link.

Labels
feature-request
Projects
Status: Shipped
Development

No branches or pull requests

3 participants