This issue was moved to a discussion. You can continue the conversation there.

Can't see traceback #1340


Closed
LukenAgile42 opened this issue Aug 22, 2019 · 23 comments

@LukenAgile42

LukenAgile42 commented Aug 22, 2019

class FooConsumer(SyncConsumer):
    def connect(self):
        assert False

    ...

produces

WebSocket HANDSHAKING /foo/ [127.0.0.1:00000]
WebSocket DISCONNECT /foo/ [127.0.0.1:00000]

With no traceback

python==3.6.8
django==1.11.16
channels==2.2.0
daphne==2.1.0
channels-redis==2.1.1
asgi-redis==1.4.3
asgiref==2.2.0

macOS Mojave 10.14.1

When I try to reproduce this with a minimal setup, it gives traceback

Exception inside application: 
  File "/path/to/django_sandbox/venv/lib/python3.6/site-packages/channels/sessions.py", line 183, in __call__
 ... 
  File "/path/to/django_sandbox/foo/consumers.py", line 7, in connect
    assert False

as expected.

Adding a print

logger.error("Exception inside application: %s", exception_output)
print(exception_output)

inside daphne.server.Server.application_checker lets me see the output, so it seems Daphne swallows the errors somehow.
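As an aside, here is one generic way a logger.error() call can produce nothing while a bare print() gets through (a standalone sketch of stdlib logging mechanics only, not a claim about Daphne's internals; the "demo.swallowed" logger name is made up):

```python
import io
import logging

# If a logger has only a NullHandler attached and propagation is disabled,
# records are accepted and then discarded -- logger.error() emits nothing,
# while print() always reaches stdout.
logger = logging.getLogger("demo.swallowed")
logger.addHandler(logging.NullHandler())  # accepts records, emits nothing
logger.propagate = False                  # never reaches the root handlers

buf = io.StringIO()
logging.getLogger().addHandler(logging.StreamHandler(buf))

logger.error("Exception inside application: %s", "assert False")
print("captured by root handler:", repr(buf.getvalue()))  # → captured by root handler: ''
```

A print() call, by contrast, bypasses the logging tree entirely, which matches the observation above.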

@Ivo-Donchev

Ivo-Donchev commented Oct 9, 2019

I'm having the same issue: if anything inside the socket fails there's no stacktrace in the logs.

@LukenAgile42 We've implemented a really simple class decorator (just for easier local debugging) that catches any exception inside the decorated class's methods and prints the stacktrace:

def debuggable_function_decorator(func):
    from functools import wraps
    import traceback

    @wraps(func)
    def wrapper(*arg, **kwargs):
        try:
            return func(*arg, **kwargs)
        except Exception as exc:
            print(traceback.format_exc())
            raise

    return wrapper


def debuggable_class_decorator(cls):
    for attr, value in vars(cls).items():
        if callable(value):
            setattr(cls, attr, debuggable_function_decorator(value))

    return cls


class debugmeta(type):
    def __new__(cls, clsname, bases, clsdict):
        return debuggable_class_decorator(
            super().__new__(cls, clsname, bases, clsdict)
        )


class SomeConsumer(JsonWebsocketConsumer, metaclass=debugmeta):
    pass

However, it would be great if someone could take a look! 🙂

@ambsw-technology

ambsw-technology commented Oct 22, 2019

We just ran into a comparable issue. On a test system, it was due to a missing HOSTS entry for the Redis server. We're seeing the same behavior on a prototype production system, but what we're seeing is so non-specific that the actual error could be pretty much anything in the consumer.

EDIT: To be clear, we're getting the slightly more interesting:

24.13.66.23:443 - - [22/Oct/2019:11:31:12] "WSCONNECTING /ws" - -
24.13.66.23:443 - - [22/Oct/2019:11:31:12] "WSCONNECT /ws" - -
24.13.66.23:443 - - [22/Oct/2019:11:31:12] "WSDISCONNECT /ws" - -
# ... rinse, repeat

EDIT2: In our prototype production instance (AWS), the problem was that TransitEncryption defaults to "true" in the ElastiCache Redis template we were using. We weren't using rediss:// and were accessing the server via a CNAME that didn't match the certificate.

Obviously, the details are less important than the fact that none of these errors bubble up into logging (Django or otherwise).

@carltongibson

OK, so there's clearly logging lacking here.

Question: if you dig in, or use @Ivo-Donchev's suggestion, where are the errors occurring? Can we add logging in the right places there to surface the issues?

(Anyone want to put together a minimal project that replicates this that we can play with?)

@ambsw-technology

OK. If I wrap my receive logic in a try/except block, I get the exception. If I re-raise it and trace it, it climbs the following stack:

[screenshot of the call stack in the debugger]

If I watch the error, it reaches concurrent/futures/thread.py:59 where it calls self.future.set_exception(exc). In PyCharm, the error eventually bubbles to my Console:

ERROR 2019-10-26 22:27:03,135 server 16752 22660 Exception inside application: Multiple exceptions: [Errno 10061] Connect call failed ('::1', 6379, 0, 0), [Errno 10061] Connect call failed ('127.0.0.1', 6379)
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\channels\sessions.py", line 183, in __call__
    return await self.inner(receive, self.send)
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\channels\middleware.py", line 41, in coroutine_call
    await inner_instance(receive, send)
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\channels\consumer.py", line 59, in __call__
    [receive, self.channel_receive], self.dispatch
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\channels\utils.py", line 58, in await_many_dispatch
    await task
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\channels_redis\core.py", line 429, in receive
    real_channel
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\channels_redis\core.py", line 484, in receive_single
    index, channel_key, timeout=self.brpop_timeout
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\channels_redis\core.py", line 324, in _brpop_with_clean
    async with self.connection(index) as connection:
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\channels_redis\core.py", line 820, in __aenter__
    self.conn = await self.pool.pop()
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\channels_redis\core.py", line 70, in pop
    conns.append(await aioredis.create_redis(**self.host, loop=loop))
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\aioredis\commands\__init__.py", line 175, in create_redis
    loop=loop)
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\aioredis\connection.py", line 108, in create_connection
    timeout, loop=loop)
  File "C:\Python37_64\Lib\asyncio\tasks.py", line 388, in wait_for
    return await fut
  File "C:\Users\<User>\Envs\workflow_admin_dev\lib\site-packages\aioredis\stream.py", line 19, in open_connection
    lambda: protocol, host, port, **kwds)
  File "C:\Python37_64\Lib\asyncio\base_events.py", line 957, in create_connection
    ', '.join(str(exc) for exc in exceptions)))
  Multiple exceptions: [Errno 10061] Connect call failed ('::1', 6379, 0, 0), [Errno 10061] Connect call failed ('127.0.0.1', 6379)

I'm not clear why a production runner doesn't behave the same way.

@Ivo-Donchev

Ivo-Donchev commented Nov 1, 2019

Hey guys 🙂 ,

I think I found the issue 🔥 🎉

So, what I did:

  1. I created a super basic Django project (as @carltongibson suggested) and started trying to reproduce the bug - https://github.com/HackSoftware/django-channels-playground
  2. I couldn't reproduce it in this clean state ❌
  3. So I reproduced it in the project where we first hit it and started slicing things out of the code
  4. It turned out that django-debug-toolbar was causing the issue in our case :( So it's not a channels issue

I searched in the other issues and found this - #1193 (there's a suggested quickfix there)

I started debugging the django-debug-toolbar code to find the root of the problem and turned out that this class is causing it - https://github.com/jazzband/django-debug-toolbar/blob/87eb27ffb2d0aea60b8b139360ece4d9fc91b47e/debug_toolbar/utils.py#L222

It's used by code hooked into the python logging module, and it collects only the logs from the current thread (which the Channels code is not running on) 😞

#1340 (comment) - this was solving the problem because the print forces the message to standard output rather than going through the python logging module

Hope this will help someone ^
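To illustrate the mechanism described above (a toy sketch only, not django-debug-toolbar's actual code; the ThreadLocalCollector class is made up for illustration), a handler that keeps only records emitted on its creating thread silently drops everything logged from worker threads:

```python
import logging
import threading

class ThreadLocalCollector(logging.Handler):
    """Toy handler mimicking a per-thread log collector: it keeps a record
    only if it was emitted on the thread that created the handler."""

    def __init__(self):
        super().__init__()
        self.owner = threading.get_ident()
        self.records = []

    def emit(self, record):
        if threading.get_ident() == self.owner:
            self.records.append(record)
        # records from any other thread are silently dropped

logger = logging.getLogger("demo.threaded")
logger.propagate = False
collector = ThreadLocalCollector()
logger.addHandler(collector)

logger.error("from main thread")  # kept

worker = threading.Thread(target=lambda: logger.error("from worker thread"))
worker.start()
worker.join()  # the worker's record was dropped

print([r.getMessage() for r in collector.records])  # → ['from main thread']
```

Since consumer code runs outside the thread that installed such a handler, its exceptions never appear in the collected logs.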

@ambsw-technology

We're not using the debug toolbar. However, it certainly could be an issue with other middleware.

@carltongibson

Hi @Ivo-Donchev — Thanks for the follow-up! Super.
(I'll leave this open for now, because I want to step through it myself when I have the moment.)

@Pithikos

Pithikos commented Feb 12, 2020

This issue still persists. Has anyone found a workaround? I'm not using the debug toolbar, and this leaves me working blind.

@LukenAgile42

@Ivo-Donchev
I tried your debug decorator, but it didn't produce any output.

I also removed django-debug-toolbar, but the issue is still there.

@Pithikos

Something else that worked for me is to set the root logger explicitly in settings:

LOGGING = {
        ..
        "loggers": {
            "": { # Root logger
                "handlers": ["console"],
                "level": "INFO",
            },
           ..
      ..
}
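Filled out into a complete, loadable config (the console handler definition below is an assumption; adapt the names and levels to your project), the fragment might look like:

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,  # keep loggers created before this config
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        "": {  # root logger: catches anything that propagates up the tree
            "handlers": ["console"],
            "level": "INFO",
        },
    },
}

logging.config.dictConfig(LOGGING)
logging.getLogger("channels.consumer").error("now visible via the root logger")
```

With a handler on the root logger, records from any library logger that still propagates (Channels' included) reach the console.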

@Ivo-Donchev

@Ivo-Donchev
I tried your debug decorator, but it didn't produce any output.

I also removed django-debug-toolbar, but the issue is still there.

@LukenAgile42 Hmm. Can I see your code? (if it's open source, of course 😄)

@LukenAgile42

LukenAgile42 commented Feb 12, 2020

It's closed source, but my usage was just

class FooConsumer(AsyncJsonWebsocketConsumer, metaclass=debugmeta):
    async def connect(self):
        assert False

I tried it while inheriting from JsonWebsocketConsumer as well.

@LukenAgile42

I stripped away all middleware and apps, except for some core django ones, but the logging was still broken. Must be an infrastructure issue somehow.

@carltongibson

carltongibson commented Feb 12, 2020

If this is really a logging issue, try installing logging_tree https://pypi.org/project/logging_tree/ and then inspecting the actual loggers in the connect function. Put a breakpoint in. Log something; is it displayed? (Sometimes a dictConfig will override all the existing loggers, so you don't get the config you thought you had.)

I tried your debug decorator, but it didn't produce any output.

The decorator is print()-ing, so why isn't stdout being displayed? (The thought being that maybe the exception is being swallowed somewhere, but even so the print() should do its job. Where is stdout going?)

Without a minimal reproduction it's really hard to see what can be happening.
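logging_tree draws the whole tree for you; if you can't install it, a rough stdlib-only approximation (the dump_loggers helper below is made up for illustration) is to walk logging's registry of known loggers:

```python
import logging

def dump_loggers():
    """Report each known logger's effective level, handlers, and propagate
    flag -- a crude stand-in for logging_tree.printout()."""
    lines = []
    for name in [""] + sorted(logging.root.manager.loggerDict):
        logger = logging.getLogger(name)
        lines.append("%s level=%s handlers=%s propagate=%s" % (
            name or "<root>",
            logging.getLevelName(logger.getEffectiveLevel()),
            [type(h).__name__ for h in logger.handlers],
            logger.propagate,
        ))
    return "\n".join(lines)

# e.g. after attaching a console handler to the "channels" logger:
logging.getLogger("channels").addHandler(logging.StreamHandler())
print(dump_loggers())
```

Running this inside connect() shows exactly which loggers exist and whether a dictConfig has stripped their handlers.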

@LukenAgile42

@carltongibson

No stdout. Just

WebSocket HANDSHAKING /foo/ [127.0.0.1:00000]
WebSocket DISCONNECT /foo/ [127.0.0.1:00000]

as before.

@Ivo-Donchev

@LukenAgile42 Can you set up a minimal Django app so we can see it reproduced in isolation? I'd love to play with it and help fix it.

@ivanff

ivanff commented Apr 14, 2020

+1, having this problem too

@ambsw-technology

ambsw-technology commented Apr 22, 2020

I keep hoping to actually build an example of this for testing, but haven't found time yet. I wonder if it's enough to analyze the stack trace I posted above.

Shouldn't we expect to log (and possibly re-raise) the error at some point in that process? It seems to me that a logical place to do so is at the "boundary" between packages. It may not be fair to blame a third-party package per se for an error at a boundary, but errors are certainly far more likely there, e.g. when incompatible changes are made to the base package.

In this case, channels.utils.await_many_dispatch seems to wrap activities from other modules (channels_redis above but presumably others). It seems likely that you'd want to know what went wrong in that list of tasks, but:

  • There's no except clause in the outer try, so errors in tasks appear to make us jump straight to finally.
  • In the finally, we actually await the task (here), which causes the exception to be emitted, but there's no bare except to log (or even handle) the issue.
  • As a result, I believe the cancel isn't sent to every task. This may not technically be a bug, but it's certainly not ideal.

If one of the tasks errors, isn't this method a logical place to catch and log it?

  • My conclusion is that the correct place to fix this is the try block inside the finally. I believe it should include a bare except that logs and (perhaps counter-intuitively) passes.
  • This ensures that the cancel attempt is made for all of the tasks. We should probably accumulate any exceptions in a list and raise that list (as a new exception) once we're done cancelling.
  • I originally wanted to handle exceptions earlier (e.g. in the outer try), but I believe that would create a race condition. Assume a task fails and puts us in the except block; if a second task fails in the finally, we're back at square one... might as well handle all of them here.

To ensure all of the cancels are attempted ASAP, I suspect the block should look more like this:

        for task in tasks:
            task.cancel()
        failures = []
        for task in tasks:
            try:
                await task
            except asyncio.CancelledError:
                pass
            except Exception as e:
                # Not sure what logger to get/use here; maybe the caller of
                # `await_many_dispatch` actually does the logging of the
                # exception raised below.
                logging.error(str(e))
                failures.append(e)
        if failures:
            raise MultipleExceptions(failures)
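A self-contained sketch of this cancel-then-collect pattern (MultipleExceptions is a hypothetical exception type, as in the snippet above; the two coroutines are stand-ins for the dispatched tasks):

```python
import asyncio

class MultipleExceptions(Exception):
    """Hypothetical container exception, as proposed above."""
    def __init__(self, failures):
        super().__init__("%d task(s) failed" % len(failures))
        self.failures = failures

async def fails():
    raise RuntimeError("boom")

async def runs_forever():
    await asyncio.Event().wait()

async def main():
    tasks = [asyncio.create_task(c()) for c in (fails, runs_forever)]
    await asyncio.sleep(0)  # yield so the first task gets to run and fail
    for task in tasks:      # request cancellation of everything first
        task.cancel()
    failures = []
    for task in tasks:      # then await each one so exceptions surface
        try:
            await task
        except asyncio.CancelledError:
            pass
        except Exception as exc:  # collect instead of losing the error
            failures.append(exc)
    if failures:
        raise MultipleExceptions(failures)

try:
    asyncio.run(main())
except MultipleExceptions as exc:
    print("collected:", [type(e).__name__ for e in exc.failures])  # → collected: ['RuntimeError']
```

Cancelling every task before awaiting any of them means one failure can't prevent the cancel request from reaching the rest.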

@ambsw-technology

P.S. this is the caller. There isn't a logger at this level either, but at least a message here would be put into the context of the AsyncConsumer. In theory, await_many_dispatch could be used elsewhere, though GitHub's usage tracking only shows the two calls in this file on lines 58 and 62.

@raj2121990

Is there any progress on this issue? I am also facing this and am unable to see any sort of logs.

@Wrufesh

Wrufesh commented Jul 9, 2020

Same here

backend:

WebSocket HANDSHAKING /webrtc/ [127.0.0.1:38882]
WebSocket DISCONNECT /webrtc/ [127.0.0.1:38882]
WebSocket HANDSHAKING /webrtc/ [127.0.0.1:39086]
WebSocket DISCONNECT /webrtc/ [127.0.0.1:39086]
WebSocket HANDSHAKING /webrtc/ [127.0.0.1:39114]
WebSocket DISCONNECT /webrtc/ [127.0.0.1:39114]
WebSocket HANDSHAKING /webrtc/ [127.0.0.1:39208]
WebSocket DISCONNECT /webrtc/ [127.0.0.1:39208]

frontend:

WebSocket connection to 'ws://localhost:8000/webrtc/' failed: Error during WebSocket handshake: Unexpected response code: 500

@eugenzor

I also had this issue, but I finally figured out that I had simply forgotten to call

await self.accept()

dimadk24 pushed a commit to dimadk24/english-fight that referenced this issue Feb 15, 2021
Doesn't play nice with django channels
Silences exceptions in consumers
See:
django/channels#1340
django/channels#1193
django-commons/django-debug-toolbar#1300
@eriktelepovsky

Something else that worked for me is to set the root logger explicitly in settings:

LOGGING = {
        ..
        "loggers": {
            "": { # Root logger
                "handlers": ["console"],
                "level": "INFO",
            },
           ..
      ..
}

Thank you @Pithikos. This actually helped me to show the traceback in the console. Is there a way to find out which logger handled the exception?

django locked and limited conversation to collaborators on Jun 22, 2021
