Skip to content

Heavily nested structures are very slow to proceed #3796

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
yedpodtrzitko opened this issue Aug 3, 2017 · 6 comments
Closed

Heavily nested structures are very slow to proceed #3796

yedpodtrzitko opened this issue Aug 3, 2017 · 6 comments

Comments

@yedpodtrzitko
Copy link
Contributor

yedpodtrzitko commented Aug 3, 2017

Hi,
the dictionary below takes more than 30s to be processed. If the codebase contains more of such structures, it takes painfully long time to check it.

For now we use #3789 as a workaround to skip such slow parts.

Cheers,
yed_

testing env:
MB Pro 2013
Python 3.6
Mypy 0.521

testing file:

input = {
    'body': {
        'token': {
            'type': 'string',
            'required': False
        },
        'payloads': {
            'type': 'list',
            'required': True,
            'schema': {
                'type': 'dict',
                'schema': {
                    'message': {
                        'type': 'string',
                        'required': True
                    },
                    'notification': {
                        'type': 'dict',
                        'required': True,
                        'schema': {
                            'body': {
                                'type': 'dict',
                                'schema': {
                                    'type': {
                                        'type': 'string',
                                        'required': True
                                    },
                                    'data': {
                                        'type': 'dict',
                                        'required': True,
                                        'schema': {
                                            'bid': {
                                                'type': 'string',
                                                'required': False
                                            },
                                            'url': {
                                                'type': 'string',
                                                'required': False
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
@gvanrossum
Copy link
Member

Wow. Confirmed -- on my Mac it takes 21s to process that file.

@refi64
Copy link
Contributor

refi64 commented Aug 3, 2017

I wonder if this is somehow similar to:

https://spin.atomicobject.com/2016/04/26/swift-long-compile-time/

@refi64
Copy link
Contributor

refi64 commented Aug 3, 2017

(I mean algorithm-wise.)

@ilevkivskyi
Copy link
Member

It looks like we have two problems here. This code typechecks quickly:

def f(x: int) -> int:
    return x

y = f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(1))))))))))))))))))

while this takes around 41 seconds on my desktop:

@overload
def f(x: str) -> str: ...
@overload
def f(x: int) -> int: ...
def f(x):
    return x

y = f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(1))))))))))))))))))

Independently, this takes even more, around 1 minute 12 seconds:

T = TypeVar('T')
def f(x: T) -> T:
    return x

y = f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(1))))))))))))))))))

Taking into account that dict translates into an overloaded generic function, it looks like both problems combine in the original example.

@JukkaL
Copy link
Collaborator

JukkaL commented Aug 7, 2017

Yeah, there clearly are various scenarios in which type checking is exponential in time. This is basically a known problem, but I doubt anybody has done a careful analysis of potential sources of exponential behavior. A potential simple fix is to use caching for the inferred type of an expression. I think that the tuple (identity of an expression, type context) could be used as a cache key, as long as as the cache is cleared at certain points -- for example, after leaving a scope or after each statement. I'm not 100% sure though.

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Dec 7, 2020

The example in the original report is fixed by #9477

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants