I'm having some problems with the .diff() method, and at first thought I was just being an idiot, but now I'm fairly confident I've isolated the bug.
Note, when I run this manually line-by-line it works fine, but I depend on this being inside a function (because I remove some columns before doing the differences and then reinstate them in a highly repetitive fashion).
For a long time I was on pandas 0.18.x and was using the following command fine:
data = data.groupby('country').diff().shift(-1)
But after upgrading to pandas 0.20.1, the behavior of diff seems to have changed, and it now takes a periods argument, which is very useful to me! Now, the problem is that I get an error every time I use it. The traceback looks like this:
Traceback (most recent call last):
  File "/Users/myname/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-5e1d634b8803>", line 1, in <module>
    dat = feature_expand(data_everything, lags=2, lag_y=True, delta=True)
  File "<ipython-input-3-184a59b406db>", line 126, in feature_expand
    data_delta = data_delta.diff()
  File "<string>", line 21, in diff
  File "/Users/myname/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 612, in wrapper
    *args, **kwargs)
  File "/Users/myname/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 3481, in _aggregate_item_by_item
    raise errors
TypeError: diff() got an unexpected keyword argument 'axis'
Following the traceback, I find a wrapper function in groupby.py, _GroupBy._make_wrapper().wrapper, which says it does some "trickery for aggregation functions that need an axis", and seems to add the axis keyword argument by itself. This has probably been useful behaviour previously, but now it breaks .diff(), as it doesn't take an axis argument anymore.
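To show the failure mode as I understand it, here is a toy sketch in plain Python. It is not pandas' actual code: `make_wrapper` and the stand-in `diff` below are hypothetical names made up for illustration. The only point is that a wrapper which injects an `axis` keyword breaks any function that does not accept one.

```python
def diff(values, periods=1):
    """Hypothetical stand-in for a diff that accepts `periods` but no `axis`."""
    return [None] * periods + [b - a for a, b in zip(values, values[periods:])]


def make_wrapper(func):
    """Hypothetical wrapper that forces an `axis` keyword onto every call."""
    def wrapper(*args, **kwargs):
        # The "trickery": add axis=0 regardless of whether func accepts it.
        kwargs_with_axis = dict(kwargs, axis=0)
        return func(*args, **kwargs_with_axis)
    return wrapper


wrapped_diff = make_wrapper(diff)

try:
    wrapped_diff([1, 3, 6], periods=1)
    error_message = None
except TypeError as exc:
    # Same kind of TypeError as at the bottom of the traceback above.
    error_message = str(exc)

print(error_message)
```

Calling `diff([1, 3, 6])` directly works fine; only the wrapped call fails, which matches what I see when the call goes through the groupby wrapper.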
I hope someone has time to help me and the community with this.
Yes! Here is an extract of the script I'm writing. I'm working with publicly available data (Varieties of Democracy).
In the attached .zip you'll find a python script with a single function. I load the data and then I try to process it. If you need any more information, please let me know :)
Could you change the dtype of the _merge and merge2 columns from int8 to int32? It works for me. I suspect this has to do with dtypes. Could you confirm?
I tried changing the dtype of all int8 columns to int32, and now it works! It's a little strange, but a workable solution for me. Is there anything I can do to help fix the bug from here?
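For anyone else hitting this, the workaround can be sketched roughly as below. The data is a made-up toy frame, not the real dataset: the `gdp` column and its values are assumptions, and `_merge` only mimics the int8 indicator column mentioned above. The idea is simply to upcast every int8 column to int32 before calling the grouped diff.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; `_merge` stands in for the int8 column discussed above.
df = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "gdp": [1.0, 2.0, 4.0, 10.0, 13.0],
    "_merge": np.array([3, 3, 1, 3, 3], dtype="int8"),
})

# Find all int8 columns and upcast them to int32 before differencing.
int8_cols = df.select_dtypes(include=["int8"]).columns
df = df.astype({col: "int32" for col in int8_cols})

# Country-level first differences; the first row of each group is NaN.
delta = df.groupby("country").diff()
print(delta)
```

I have not checked why int8 in particular triggers the error, so treat this as a workaround rather than a fix.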
Code Sample, a copy-pastable example if possible
Problem description
Hi everyone! My first bug report :)
Cheers :)
Expected Output
A dataframe of country-level first differences.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.20.1
pytest: 2.9.2
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.24.1
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.7.3
bs4: 4.5.3
html5lib: 0.9999999
sqlalchemy: 1.1.9
pymysql: 0.7.9.None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None