BUG: pd.tseries.frequencies.to_offset() does not really provide a DateOffset object #34381

Closed
2 of 3 tasks
meteoDaniel opened this issue May 26, 2020 · 4 comments · Fixed by #51775
Labels
Deprecate Functionality to remove in pandas Frequency DateOffsets

@meteoDaniel
meteoDaniel commented May 26, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

import pandas as pd
from datetime import datetime

data = pd.DataFrame([[430., 630.],
                     [540., 660.],
                     [610., 720.]],
                    columns=[1, 2],
                    index=pd.date_range(datetime(2019, 1, 1, 12),
                                        datetime(2019, 1, 1, 12, 30),
                                        freq='15min'))

# to_offset returns a Minute-based offset, which supports division
pd.tseries.frequencies.to_offset(pd.infer_freq(data.index))
# >> <15 * Minutes>
pd.tseries.frequencies.to_offset(pd.infer_freq(data.index)) / 2
# >> <450 * Seconds>

# a generic DateOffset with the same length does not support division
pd.tseries.offsets.DateOffset(minutes=15)
# >> <DateOffset: minutes=15>
pd.tseries.offsets.DateOffset(minutes=15) / 2
# >> TypeError: unsupported operand type(s) for /: 'DateOffset' and 'int'

Problem description

The function pd.tseries.frequencies.to_offset does not really return a DateOffset object. If I understand the code correctly, there are several kinds of offsets, e.g. Minutes and Seconds, that are not comparable with the DateOffset object because they have different properties.
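The difference described above can be reproduced in a few lines. This is only a sketch: `halve` is a hypothetical helper, not a pandas function, and it assumes the input is a fixed-length offset that `pd.Timedelta` accepts.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

tick = to_offset("15min")             # fixed-length offset (a Minute instance)
generic = pd.DateOffset(minutes=15)   # generic, relativedelta-style offset

# Both pass the isinstance check, even though their operators differ.
both_are_dateoffsets = isinstance(tick, pd.DateOffset) and isinstance(generic, pd.DateOffset)


def halve(offset):
    """Hypothetical helper: halve a fixed-length offset via pd.Timedelta.

    Assumes `offset` is a Tick-style offset (Hour, Minute, Second, ...);
    a generic DateOffset cannot be converted to a Timedelta this way.
    """
    return to_offset(pd.Timedelta(offset) / 2)


half = halve(tick)  # 7.5 minutes, expressed in whole seconds

# The generic DateOffset defines no division operator.
try:
    generic / 2
    generic_divides = True
except TypeError:
    generic_divides = False
```

Going through `pd.Timedelta` keeps the arithmetic on a type whose division is well defined, and `to_offset` then picks the coarsest unit that represents the result exactly.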

Expected Output

One unified pd.DateOffset object with the same properties for all inheriting objects like Minutes and Seconds

Output of pd.show_versions()

[INSTALLED VERSIONS

commit : None
python : 3.6.9.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-99-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : de_DE.UTF-8
LOCALE : de_DE.UTF-8

pandas : 1.0.3
numpy : 1.16.0
pytz : 2018.9
dateutil : 2.8.1
pip : 20.0.2
setuptools : 44.0.0
Cython : 0.28.3
pytest : 4.3.0
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.2.1
html5lib : 0.999999999
pymysql : 0.9.3
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.10
IPython : 7.11.1
pandas_datareader: None
bs4 : 4.6.0
bottleneck : 1.3.0
fastparquet : None
gcsfs : None
lxml.etree : 4.2.1
matplotlib : 2.2.2
numexpr : 2.6.4
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.13.0
pytables : None
pytest : 4.3.0
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.16
tables : 3.4.2
tabulate : 0.8.5
xarray : 0.10.9
xlrd : 1.1.0
xlwt : None
xlsxwriter : None
numba : 0.42.0
]

@meteoDaniel meteoDaniel added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels May 26, 2020
@jbrockmendel
Member

One unified pd.DateOffset object with the same properties for all inheriting objects like Minutes and Seconds

Why would you expect this? The DateOffset class is way less efficient than the Minute or Second class.

@meteoDaniel
Author

meteoDaniel commented Jun 2, 2020

@jbrockmendel if it is less efficient, why was it not deprecated? I was very confused by the different behaviour of the DateOffset objects. I think there should not be several DateOffset objects available.

To answer your question: If I am calling the function to_offset I would expect to receive a DateOffset object but I receive Seconds, Minutes or other 'Time' objects.
This is a part of the docstring for to_offset:

def to_offset(freq) -> Optional[DateOffset]:
    """
    Return DateOffset object from string or tuple representation
    or datetime.timedelta object.
    Parameters
    ----------
    freq : str, tuple, datetime.timedelta, DateOffset or None
    Returns
    -------
    DateOffset
        None if freq is None.
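The return annotation in the excerpt can be checked against the runtime types. A small sketch, assuming a pandas version in which the "15min" alias is accepted:

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

result = to_offset("15min")

# The concrete class is Minute, not DateOffset itself ...
concrete_name = type(result).__name__

# ... but it still satisfies the documented return type.
is_dateoffset = isinstance(result, pd.DateOffset)
minute_is_subclass = issubclass(pd.offsets.Minute, pd.DateOffset)

# The Optional part of the annotation: None in, None out.
none_result = to_offset(None)
```

So the docstring is accurate in the type-hierarchy sense, while the operators available on the returned object depend on its concrete class.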

@jbrockmendel
Member

if it is less efficient, why was it not deprecated?

I'd be fine with this if you'd like to make a PR.

@meteoDaniel
Author

meteoDaniel commented Jun 2, 2020

Next week I can have a closer look at the pandas source code to prepare a PR. I have learned that it takes a huge effort to get this right for such a big project. I hope some others will join the conversation.
My idea is to return e.g. a Minute object when DateOffset(hours=1, minutes=20) is parsed, because minutes are the smallest unit given, and to deprecate the original DateOffset object. In the end, working with minutes, seconds or other such objects makes no difference in mathematical operations, and it still defines your DateOffset. And of course it should work together with Timedelta operations.
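The proposal can be sketched outside of pandas itself. `to_fixed_offset` below is a hypothetical helper, not an existing or planned pandas API, and it only covers fixed-length components (hours, minutes, seconds and so on); calendar components such as months have no fixed length and are rejected by `pd.Timedelta`.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def to_fixed_offset(**components):
    """Hypothetical helper: collapse fixed-length components into one offset.

    The components are summed as a pd.Timedelta, and to_offset then chooses
    the coarsest unit that represents the total exactly.
    """
    return to_offset(pd.Timedelta(**components))


offset = to_fixed_offset(hours=1, minutes=20)      # a Minute offset with n=80
generic = pd.DateOffset(hours=1, minutes=20)       # the generic equivalent

# Both shift a timestamp by the same amount.
start = pd.Timestamp("2019-01-01 12:00")
shifted_fixed = start + offset
shifted_generic = start + generic
```

In this sketch the normalised offset and the generic DateOffset agree when added to a timestamp, which is the equivalence the comment above relies on.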

@meteoDaniel meteoDaniel reopened this Jun 2, 2020
@jbrockmendel jbrockmendel added API Design Frequency DateOffsets and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jun 5, 2020
@mroeschke mroeschke added the Refactor Internal refactoring of code label Jun 28, 2020
@mroeschke mroeschke added Deprecate Functionality to remove in pandas and removed Refactor Internal refactoring of code API Design labels Aug 7, 2021
@jbrockmendel jbrockmendel mentioned this issue Mar 3, 2023
5 tasks