BUG: fix issue of boolean comparison on empty DataFrames (GH5808) #5810

Merged (1 commit) on Jan 4, 2014

Conversation

jreback
Contributor

@jreback jreback commented Dec 31, 2013

closes #5808

@jreback
Contributor Author

jreback commented Dec 31, 2013

@jtratner I'd like you to look at this

I actually think my 'fix' is wrong, as it patches an arithmetic method, but
shouldn't a bool/flex method be called here?

maybe this is the case that you have here: https://github.com/pydata/pandas/blob/master/pandas/core/ops.py#L223

e.g. 'rand_' is called (I think because it is trying the reverse ops first?)
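
On the reverse-ops question: in plain Python, the reflected method (`__rand__`) is tried only after the left operand's `__and__` returns `NotImplemented`, unless the right operand's type is a subclass of the left's that overrides the reflected method. A minimal sketch of that dispatch follows. The classes are hypothetical stand-ins, not pandas code, so this only illustrates the language rule and does not show what `ops.py` actually does.

```python
class Left:
    def __and__(self, other):
        # Decline the operation so Python falls back to the right operand.
        return NotImplemented


class Right:
    def __rand__(self, other):
        # Called for `Left() & Right()` once Left.__and__ has declined.
        return "Right.__rand__"


class NoReflected:
    """Has no __rand__, so `Left() & NoReflected()` ends in a TypeError."""


result = Left() & Right()

try:
    Left() & NoReflected()
    error = None
except TypeError as exc:
    error = exc
```

Here `result` comes from `Right.__rand__`, and the second expression raises a plain `TypeError` because neither side handles the operation.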

@jreback
Contributor Author

jreback commented Jan 4, 2014

@jtratner if you have a chance...

@jtratner
Contributor

jtratner commented Jan 4, 2014

@jreback sorry, missed this. I'll take a look now.

@jtratner
Contributor

jtratner commented Jan 4, 2014

It's sort of self-documented (though not really well) which method does what in ops:

frame_special_funcs = dict(arith_method=_arith_method_FRAME,
                           radd_func=_radd_compat,
                           comp_method=_comp_method_FRAME,
                           bool_method=_arith_method_FRAME)

and frame flex funcs don't include the bool methods...

frame never had a separate bool method. Ignoring TypeErrors there feels wrong - wouldn't it silence other actually useful TypeErrors?

Why couldn't we just check whether either or both are empty and then skip the transformation op if so?

The reason that I popped the r* bool methods is because they didn't work (there's an issue about that), and I felt it was better to have a NotImplemented rather than a weird error.
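
The suggestion above (check for empties and skip the op, rather than swallowing `TypeError`) can be sketched roughly as follows. This is a NumPy-only illustration, not the pandas implementation: `and_skip_empty` is a hypothetical helper, and it assumes both operands have already been aligned to the same shape.

```python
import numpy as np


def and_skip_empty(left, right):
    """Hypothetical helper: bitwise-and two same-shaped arrays, skipping empties."""
    left, right = np.asarray(left), np.asarray(right)
    if left.size == 0 or right.size == 0:
        # Nothing to compute, so return an empty boolean result and never
        # hit the ufunc (which would reject e.g. empty float arrays).
        return np.empty(left.shape, dtype=bool)
    # Non-empty input: let any genuine TypeError propagate to the caller.
    return left & right


empty = np.empty((0, 2), dtype=float)
empty_result = and_skip_empty(empty, empty)
normal_result = and_skip_empty(np.array([True, False]), np.array([True, True]))
```

Because there is no `try`/`except`, a float operand in a non-empty array still raises, so useful errors are not silenced.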

@jreback
Contributor Author

jreback commented Jan 4, 2014

@jtratner

ok so for Series we have a _bool_method_SERIES.

for Frame I guess it didn't exist before, so it is handled by the arithmetic methods.

Biggest problem is I cannot find ANY tests here with boolean on frames.

I guess I can just handle the empties.

@jreback
Contributor Author

jreback commented Jan 4, 2014

Problem is we have all kinds of weirdness if these are treated as arithmetic operators

these should all raise, but an int and bool are ok. I think the dtypes have to be restricted for this to properly work.
IIRC we had some similar issues with Series

In [16]: DataFrame(True,index=[1],columns=['A']) & DataFrame(5,index=[1],columns=['A'])
Out[16]: 
   A
1  1

[1 rows x 1 columns]

In [17]: DataFrame(True,index=[1],columns=['A']) & DataFrame(5.0,index=[1],columns=['A'])
TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

In [18]: DataFrame(1,index=[1],columns=['A']) & DataFrame(5.0,index=[1],columns=['A'])
TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

In [19]: Series(1)&Series(2)
Out[19]: 
0    True
dtype: bool

In [20]: Series(1.)&Series(2)
TypeError: unsupported operand type(s) for &: 'float' and 'bool'

In [21]: Series(1)&Series(True)
Out[21]: 
0    True
dtype: bool

@jtratner
Contributor

jtratner commented Jan 4, 2014

I agree, it's totally weird, but that's what we had previously. So do you want to just coerce it to bool first or are you going to restrict it to bool dtype only?
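
For concreteness, the two options in that question might look like the sketch below. It uses plain NumPy arrays rather than frames, and `and_coerce` / `and_strict` are hypothetical names introduced here. Neither is claimed to be what pandas ended up doing.

```python
import numpy as np


def and_coerce(left, right):
    """Option 1 (hypothetical): cast both operands to bool, then combine."""
    return np.asarray(left).astype(bool) & np.asarray(right).astype(bool)


def and_strict(left, right):
    """Option 2 (hypothetical): accept bool dtype only, raise otherwise."""
    left, right = np.asarray(left), np.asarray(right)
    if left.dtype != np.bool_ or right.dtype != np.bool_:
        raise TypeError(
            f"boolean op requires bool dtype, got {left.dtype} and {right.dtype}"
        )
    return left & right


coerced = and_coerce([True], [5.0])  # the float is treated as truthy

try:
    and_strict([True], [5])
    strict_error = None
except TypeError as exc:
    strict_error = exc
```

Coercion makes the int and float cases behave the same way, while the strict version makes them fail the same way. Either removes the int-versus-float inconsistency shown in the session above.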

@jreback
Contributor Author

jreback commented Jan 4, 2014

I think I need to add bool_method_FRAME otherwise too many ifs

@jtratner
Contributor

jtratner commented Jan 4, 2014

that makes sense, that should simplify things a bit - though still need to mask it.

@jreback
Contributor Author

jreback commented Jan 4, 2014

not perfect but preserves functionality (and handles the empty case)

@jtratner
Contributor

jtratner commented Jan 4, 2014

current format looks better, np.prod is relatively clear.
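
For readers without the diff open: the thread does not show how `np.prod` is used in the commit, but a common idiom is to take the product of a shape to test for emptiness. A small sketch under that assumption (`is_empty_shape` is a hypothetical name, not a pandas function):

```python
import numpy as np


def is_empty_shape(shape):
    """Return True when any dimension is zero, i.e. there are no elements."""
    # The product of the dimensions is the element count; zero means empty.
    return bool(np.prod(shape) == 0)


checks = {
    "no rows": is_empty_shape((0, 3)),
    "no columns": is_empty_shape((3, 0)),
    "non-empty": is_empty_shape((2, 3)),
}
```

This covers both "no rows" and "no columns" with a single expression, which is presumably why it reads more clearly than separate length checks.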

jreback added a commit that referenced this pull request Jan 4, 2014
BUG: fix issue of boolean comparison on empty DataFrames (GH5808)
@jreback jreback merged commit ee92c75 into pandas-dev:master Jan 4, 2014

Successfully merging this pull request may close these issues.

BUG: boolean op with empty frames should not raise
2 participants