Spacy import breaks pandas output in jupyter #3208

Closed
noise-field opened this issue Jan 29, 2019 · 3 comments · Fixed by #3213
Labels
install (Installation issues), third-party (Third-party packages and services)

Comments

@noise-field

How to reproduce the behaviour

# In[1]:
import pandas as pd
import numpy as np

# In[2]:
df = pd.DataFrame(np.ones((2, 2)))

# In[3]:
df  # shows the dataframe

# In[4]:
import spacy

# In[5]:
df  # leads to an exception

Your Environment

  • spaCy version: 2.0.16
  • Platform: Linux-4.15.0-43-generic-x86_64-with-debian-buster-sid
  • Python version: 3.7.0
  • Models: en, en_core_web_lg

Notes

This only affects pandas 0.24.0 (0.23.4 works fine). Functions like .head() are also affected.
Exception details:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/anaconda3/envs/fat_ml/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    343             method = get_real_method(obj, self.print_method)
    344             if method is not None:
--> 345                 return method()
    346             return None
    347         else:

~/anaconda3/envs/fat_ml/lib/python3.7/site-packages/pandas/core/frame.py in _repr_html_(self)
    647         # display HTML, so this check can be removed when support for
    648         # IPython 2.x is no longer needed.
--> 649         if console.in_qtconsole():
    650             # 'HTML output is disabled in QtConsole'
    651             return None

~/anaconda3/envs/fat_ml/lib/python3.7/site-packages/pandas/io/formats/console.py in in_qtconsole()
    121             ip.config.get('KernelApp', {}).get('parent_appname', "") or
    122             ip.config.get('IPKernelApp', {}).get('parent_appname', ""))
--> 123         if 'qtconsole' in front_end.lower():
    124             return True
    125     except NameError:

AttributeError: 'LazyConfigValue' object has no attribute 'lower'
ines added the install (Installation issues) and third-party (Third-party packages and services) labels Jan 30, 2019
@ines
Member

ines commented Jan 30, 2019

This is very mysterious. I had a look and it seems like this has also been reported in jupyter/notebook#4369 and fastai/fastai#1531. What's especially confusing is that the error occurs here and is a result of IPython's get_ipython() not returning what it's supposed to:

https://github.com/pandas-dev/pandas/blob/6c84005d36309b23fd3648db94f094a1d330963d/pandas/io/formats/console.py#L111-L127

spaCy itself doesn't interact with pandas at all, and it doesn't interact with IPython in the global scope either. So one possible explanation I can think of is that spaCy imports a dependency that does something like this? I'll try to reproduce the problem and see if I can track this down.

Edit 1: Okay, I can reproduce this. I also checked the diff for the latest pandas version, and it shows that this function was introduced in that release.

Edit 2: We do check whether we're in a Jupyter notebook in the global scope, which would be executed when you import spaCy.

Edit 3: Not accessing the config properties within get_ipython().config in the global scope does solve the problem. I don't fully understand what the underlying cause is, though. Maybe the answer is "just don't do this"? Maybe I should report this on the IPython tracker?

Here's a minimal reproducible example without spaCy:

# In[1]:
import pandas as pd
df = pd.DataFrame()
df

# In[2]:
cfg = get_ipython().config
cfg['IPKernelApp']['parent_appname']
df

We can issue a quick fix that changes the way we check for a notebook in displaCy, and also uses the new and actually correct way we already have on develop.
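
A notebook check that never reads get_ipython().config could look roughly like the sketch below. It is an illustration of the idea rather than the exact code in the fix: the function name is_in_jupyter and the comparison against the class name "ZMQInteractiveShell" (the shell class used by Jupyter kernels) are assumptions made for this example.

```python
def is_in_jupyter():
    """Return True when running inside a Jupyter kernel.

    Only the class name of the shell object is inspected, so
    get_ipython().config is never touched.
    """
    try:
        # get_ipython is injected into the namespace by IPython; in plain
        # Python the name does not exist and a NameError is raised.
        shell_name = get_ipython().__class__.__name__  # noqa: F821
    except NameError:
        return False
    # Assumption: Jupyter kernels use a shell class with this name.
    return shell_name == "ZMQInteractiveShell"


in_notebook = is_in_jupyter()
```

Because the check only looks at the type of the shell, it leaves no state behind that a later pandas call could trip over.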

Edit 4: Here's a minimal reproducible example without spaCy or pandas:

# In[1]:
def test():
    ip = get_ipython()
    test = (ip.config.get('KernelApp', {}).get('parent_appname', "") or ip.config.get('IPKernelApp', {}).get('parent_appname', ""))
    test.lower()

# In[2]:
test()

# In[3]:
cfg = get_ipython().config
cfg['IPKernelApp']['parent_appname']

# In[4]:
test()
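
One hedged reading of why the second call to test() fails: if the config object stores a placeholder whenever a missing key is read with square brackets, then a later .get() with a string default finds that placeholder instead of falling back to "". The sketch below models this with the standard library only. AutoCreatingConfig and LazyPlaceholder are hypothetical stand-ins written for this example, not IPython or traitlets classes, and the assumption that the real config behaves this way should be checked against pandas-dev/pandas#25036.

```python
class LazyPlaceholder:
    """Hypothetical stand-in for a lazy config value; it has no .lower()."""


class AutoCreatingConfig(dict):
    """Hypothetical dict that stores a value whenever a missing key is read via []."""

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # Capitalised keys model config sections, other keys model lazy values.
            value = AutoCreatingConfig() if key[:1].isupper() else LazyPlaceholder()
            dict.__setitem__(self, key, value)
            return value


def front_end_name(config):
    # Same lookup pattern as the reproduction: .get() with a string default.
    return (config.get("KernelApp", {}).get("parent_appname", "")
            or config.get("IPKernelApp", {}).get("parent_appname", ""))


config = AutoCreatingConfig()
before = front_end_name(config)          # nothing stored yet, so "" is returned
config["IPKernelApp"]["parent_appname"]  # item access stores a placeholder as a side effect
after = front_end_name(config)           # .get() now finds the placeholder, not ""

print(repr(before), type(after).__name__, hasattr(after, "lower"))
```

In this model, dict.get() never goes through __getitem__, so reading with .get() alone is safe; it is the bracket access in between that changes what the later lookup returns.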

ines added a commit that referenced this issue Jan 30, 2019
Prevent interactions with other libraries (pandas) that also access get_ipython().config and its parameters
@ines
Member

ines commented Jan 31, 2019

This will be resolved in the next update of pandas and we'll also be issuing a workaround for spaCy. If you're interested, see here for some background on the problem: pandas-dev/pandas#25036

ines added a commit that referenced this issue Jan 31, 2019
Resolves #3208.

Prevent interactions with other libraries (pandas) that also access `get_ipython().config` and its parameters. See #3208 for details. I don't fully understand why this happens, but in spaCy, we can at least make sure we avoid calling into this method.

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
@lock

lock bot commented Mar 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Mar 2, 2019