BUG: aggregate function on transposed resampled data frame raises ValueError #46904

Closed
3 tasks done
provinzio opened this issue Apr 30, 2022 · 3 comments · Fixed by #47078
Labels: Bug; Regression (functionality that used to work in a prior pandas version); Resample (resample method)
Milestone: 1.4.3

@provinzio

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
DIM = 3
now = pd.Timestamp.now()
# Create data frame.
data = [[i + j for i in range(DIM)] for j in range(DIM)]
columns = [now + pd.DateOffset(i) for i in range(DIM)]
df = pd.DataFrame(data=data, columns=columns)
# Calculate mean/min/max/sum per month.
df.resample("M", axis=1).aggregate(["mean", "min", "max", "sum"])

Issue Description

I am trying to calculate the mean/min/max/sum per month of a DataFrame whose column labels are timestamps. Running the example above raises the following error.

Traceback (most recent call last):
  File "d:\current\29 XV DATEV Auswertungen\venv\lib\site-packages\pandas\core\generic.py", line 550, in _get_axis_number
    return cls._AXIS_TO_AXIS_NUMBER[axis]
KeyError: 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "d:\current\29 XV DATEV Auswertungen\venv\lib\site-packages\pandas\core\resample.py", line 347, in aggregate
    result = ResamplerWindowApply(self, func, args=args, kwargs=kwargs).agg()
  File "d:\current\29 XV DATEV Auswertungen\venv\lib\site-packages\pandas\core\apply.py", line 171, in agg
    return self.agg_list_like()
  File "d:\current\29 XV DATEV Auswertungen\venv\lib\site-packages\pandas\core\apply.py", line 367, in agg_list_like
    colg = obj._gotitem(col, ndim=1, subset=selected_obj.iloc[:, index])
  File "d:\current\29 XV DATEV Auswertungen\venv\lib\site-packages\pandas\core\resample.py", line 412, in _gotitem
    grouped = get_groupby(subset, by=None, grouper=grouper, axis=self.axis)
  File "d:\current\29 XV DATEV Auswertungen\venv\lib\site-packages\pandas\core\groupby\groupby.py", line 3846, in get_groupby
    return klass(
  File "d:\current\29 XV DATEV Auswertungen\venv\lib\site-packages\pandas\core\groupby\groupby.py", line 894, in __init__
    self.axis = obj._get_axis_number(axis)
  File "d:\current\29 XV DATEV Auswertungen\venv\lib\site-packages\pandas\core\generic.py", line 552, in _get_axis_number
    raise ValueError(f"No axis named {axis} for object type {cls.__name__}")
ValueError: No axis named 1 for object type Series
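The last frames of the traceback show axis=1 being forwarded to a groupby built on a single-column subset, which is a Series and therefore only has axis 0. The snippet below is a minimal sketch that triggers the same kind of message through the public API; it does not go through the resample code path, and the variable names are illustrative only.

```python
import pandas as pd

# A Series is one-dimensional, so only axis 0 (or "index") is valid.
series = pd.Series([0, 1, 2])

message = ""
try:
    # Asking a Series for axis=1 fails when the axis number is resolved.
    series.sum(axis=1)
except ValueError as err:
    message = str(err)

print(message)
```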

Expected Behavior

I'd like to create a DataFrame containing the mean/min/max/sum per month. According to the documentation, I'd expect the code to run without errors.

Installed Versions

INSTALLED VERSIONS

commit : 4bfe3d0
python : 3.9.1.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : de_DE.UTF-8
LOCALE : de_DE.cp1252

pandas : 1.4.2
numpy : 1.22.0
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.4
setuptools : 49.2.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.2
lxml.etree : 4.8.0
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
markupsafe : 2.0.1
matplotlib : 3.5.1
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
zstandard : None

@provinzio provinzio added the Bug and Needs Triage (issue that has not been reviewed by a pandas team member) labels Apr 30, 2022
@provinzio
Author

provinzio commented Apr 30, 2022

Transposing the DataFrame so that the timestamps are on axis=0 works, but in my case it puts the aggregated multi-indexed columns on the wrong axis.
df.T.resample("M").aggregate(["mean", "min", "max", "sum"])

Edit:
The following line does what I want without errors:
df.T.resample("M").aggregate(["mean", "min", "max", "sum"]).stack().T
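For reference, the workaround can be written as a self-contained sketch. Two things here are assumptions and not part of the report: the column dates are fixed instead of using pd.Timestamp.now(), so the result is reproducible, and the "M" alias is replaced by a pd.offsets.MonthEnd() object so the snippet does not depend on which frequency aliases a given pandas version accepts. Some pandas versions may emit a FutureWarning from stack().

```python
import pandas as pd

DIM = 3

# Fixed dates spanning a month boundary (assumption made for reproducibility).
columns = pd.date_range("2022-04-29", periods=DIM, freq="D")
data = [[i + j for i in range(DIM)] for j in range(DIM)]
df = pd.DataFrame(data=data, columns=columns)

# Resample along the index of the transposed frame instead of axis=1.
monthly = df.T.resample(pd.offsets.MonthEnd()).aggregate(
    ["mean", "min", "max", "sum"]
)

# Move the function names next to the month labels, then transpose back so
# the original rows are rows again and the columns are (month, function).
result = monthly.stack().T

print(result)
```

Individual values can then be looked up by label, for example result.loc[0, (pd.Timestamp("2022-04-30"), "mean")].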

@provinzio provinzio changed the title from "BUG: aggregate function on resample raises ValueError" to "BUG: aggregate function on transposed resampled data frame raises ValueError" Apr 30, 2022
@github-actions github-actions bot assigned ghost May 2, 2022
@ghost ghost removed their assignment May 3, 2022
simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this issue May 4, 2022
@simonjayhawkins simonjayhawkins added the Resample resample method label May 4, 2022
@simonjayhawkins
Member

Thanks @provinzio for the report.

In 1.2.5, the code sample raises a more helpful NotImplementedError: axis other than 0 is not supported. Will label as regression for now pending further investigation.

first bad commit: [8c62fbb] CLN: Remove _axis from apply (#40261)

cc @rhshadrach

@simonjayhawkins simonjayhawkins added the Regression (functionality that used to work in a prior pandas version) label and removed the Needs Triage (issue that has not been reviewed by a pandas team member) label May 4, 2022
@simonjayhawkins simonjayhawkins added this to the 1.4.3 milestone May 4, 2022
@rhshadrach
Member

Thanks @provinzio; agreed with @simonjayhawkins, this should raise a better error message.
