DEPR: DataFrame.combine propagates nan values #10734
Comments
you can simply do this:
or
maybe explain what issue you are seeing. Essentially you are doing this:
so in this example it IS working as advertised. The |
@jreback you are fully correct that those ( But, still, it is not really clear what the docs of
What does 'do not propagate NaN values' mean? What if the result of the function at a certain place in the frame is NaN, what should it do? Not propagate? (but this is what it does) Default to the value in In any case, docstring can use an update, as it does not seem correct. |
yeh doc-string prob not clear. Maybe we should just deprecate this. |
+1 to deprecate. This is pretty confusing. It also has no explanation what |
I want to deprecate this. How do I do so? I've never contributed to pandas before! |
look at how |
Repeating from #13970: Not really sure we should actually deprecate |
I agree with @sinhrks here. To be honest we have seen almost no reports of Let's deprecate. I suppose if we have a big uprising in the community my opinion could change. |
Example usage: combining two dataframes to take the maximum of both frames at each location:
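A minimal sketch of this kind of usage, with made-up frames that have no missing values. Passing `np.maximum` as the combining function is an assumption about what was meant here:

```python
import numpy as np
import pandas as pd

# Hypothetical frames for illustration, with identical labels and no NaNs.
df1 = pd.DataFrame({"A": [1.0, 7.0], "B": [5.0, 2.0]})
df2 = pd.DataFrame({"A": [3.0, 4.0], "B": [0.0, 6.0]})

# combine() calls the function once per column with the two aligned Series.
elementwise_max = df1.combine(df2, np.maximum)
print(elementwise_max)
```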
Is there another easy way to achieve this? I am the first to say that we actually have too many methods (that's why we deprecated eg the really useless and unpythonic I fully agree that it is probably not used much, and I am also not certain it's worth keeping. But just want to point out that the bad docstring is not a reason to deprecate it. |
@jorisvandenbossche not unreasonable, though your example has never been requested :< In this case
again a dedicated method for a feature like this may not be entirely useful. Maybe can find some other uses / integrations which make sense. ok so let's discuss some more, @sinhrks can take it out from the PR (and let other bug fixes in). In particular from your example, having to 'manually' specify the |
After
What I also care about is its implementation. I think the following points cannot be explained to users in a clear way:
can we discuss it for 0.20, @jorisvandenbossche ? |
thoughts on what to do about this? |
FWIW, this confused me for a while today. As far as I can tell, the docs are misleading at best (maybe even just plain wrong?). Just like @jorisvandenbossche above, I was attempting to compute the element-wise max values between two dataframes (with some missing values). I was surprised to see this:
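A sketch of the kind of result being described, using made-up frames rather than the data from this comment, and assuming `np.maximum` as the combining function:

```python
import numpy as np
import pandas as pd

# Made-up frames; each has one missing value where the other has data.
df1 = pd.DataFrame({"A": [1.0, np.nan], "B": [5.0, 2.0]})
df2 = pd.DataFrame({"A": [3.0, 4.0], "B": [np.nan, 6.0]})

# np.maximum returns NaN whenever either operand is NaN, and combine()
# hands it the columns without filling anything first (no fill_value given).
result = df1.combine(df2, np.maximum)
print(result)

# The positions missing in either frame stay missing in the result.
print(result.isna())
```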
Am I missing something? |
@jreback @sinhrks @jorisvandenbossche Is this still something one could/should work on? I bumped into it because it's still part of the Reading the arguments here I'd be strongly in favour of deprecating it, but since the conversation is some years old, I'm not sure if all of it is still relevant |
I don't think a decision was made about what to do. I don't have a strong opinion, but I'm removing it from the 1.0 milestone. |
Since there hasn't been much consensus around deprecation and we do have decent test coverage for this function (and I do see the utility of this function), I think it might just be worth fixing this bug and documenting better. |
The documentation for DataFrame.combine claims that the method "do[es] not propagate NaN values, so if for a (column, time) one frame is missing a value, it will default to the other frame’s value". However, this does not seem to correspond to the actual behaviour of DataFrame.combine.
Sample code:
In many cases (as in this one), it may be remedied by using the fill_value parameter of DataFrame.combine. However, it might be a problem when there is no acceptable neutral element for the given combining function.
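The `fill_value` remedy can be sketched as follows for an element-wise maximum. The frames are made up, and `-np.inf` is chosen here only because it is a neutral element for `max`:

```python
import numpy as np
import pandas as pd

# Made-up frames; each has one missing value where the other has data.
df1 = pd.DataFrame({"A": [1.0, np.nan], "B": [5.0, 2.0]})
df2 = pd.DataFrame({"A": [3.0, 4.0], "B": [np.nan, 6.0]})

# Missing values in both frames are replaced by fill_value before the
# function is called, so -inf never wins against a real number.
filled = df1.combine(df2, np.maximum, fill_value=-np.inf)
print(filled)
```

Note that a position missing in both frames would come out as `-inf` rather than NaN under this approach, which is one reason a neutral element is not always an acceptable answer.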