Inconsistent kwargs argument 'color' passed to upstream matplotlib plot functions #31691
Comments
Thanks for posting, @xuancong84. This could probably be clarified or better defined, but it happens because color is defined per column when you plot a DataFrame, e.g. pd.DataFrame({"v1": [3,6,2], "v2": [2,1,5], "v3": [4,5,6]}).plot(kind="bar", color=["r", "b", "y"]). To make your example work, use a Series instead: pd.Series([3, 6, 2]).plot(kind="bar", color=["r", "b", "y"]). I think that will give you the desired figure. |
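To illustrate the per-column behaviour described above, here is a minimal pure-Python model. It is only a sketch of the dispatch as explained in this thread, not pandas' actual implementation: the function names (`color_per_column_call`, `colors_per_bar`, `model_bar_plot`) are hypothetical, and it assumes that a DataFrame bar plot issues one bar call per column, picking one entry of `color` for each call, while each bar call cycles whatever it receives over its own bars. Other pandas versions may validate `color` differently.

```python
from itertools import cycle, islice


def color_per_column_call(n_columns, color):
    """Hypothetical model: pick one entry of `color` for each column's bar call."""
    if isinstance(color, str):
        color = [color]
    return [color[i % len(color)] for i in range(n_columns)]


def colors_per_bar(n_bars, color):
    """Hypothetical model: one bar call spreads `color` over its bars."""
    if isinstance(color, str):
        # A single color applies to every bar in the call.
        return [color] * n_bars
    # A list of colors is cycled over the bars.
    return list(islice(cycle(color), n_bars))


def model_bar_plot(data, color):
    """`data` maps column name -> list of values; returns the color of each bar."""
    per_call = color_per_column_call(len(data), color)
    return {
        name: colors_per_bar(len(values), c)
        for (name, values), c in zip(data.items(), per_call)
    }


three_columns = {"v1": [3, 6, 2], "v2": [2, 1, 5], "v3": [4, 5, 6]}
one_column = {"v1": [3, 6, 2]}

# One color per column: every bar of v1 is "r", of v2 "b", of v3 "y".
print(model_bar_plot(three_columns, ["r", "b", "y"]))
# Single column with a flat list: only the first color is used.
print(model_bar_plot(one_column, ["r", "b", "y"]))
# Single column with a nested list: the inner list reaches the bar call.
print(model_bar_plot(one_column, [["r", "b", "y"]]))
```

Under these assumptions, a flat list colors columns rather than bars, so a single-column DataFrame only ever sees the first entry, while a nested list passes the whole inner list to the one bar call. That matches the behaviour reported in this issue.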
Thanks @charlesdong1991 , does that mean if we use pd.DataFrame.plot.bar to plot, it actually calls pyplot.bar several times, once for each column? Thus, all kwargs have to be packed into a list? |
@xuancong84 indeed,
I am not sure what you mean by |
@charlesdong1991 , thanks for your info. I refer Since not all of these arguments will be unpacked before passing to matplotlib, to avoid this confusion, I would like to suggest that: for all those arguments that will be unpacked, add a prefix or suffix to distinguish, e.g., instead of |
thanks for your suggestion @xuancong84 i think renaming the argument/having new similar argument is an API change and might cause confusion to users, especially if this only works for I think the easiest way is to have a better docstring for @TomAugspurger might have better opinions on it? |
Thanks @charlesdong1991 ! Yup, revising the documentation for |
There are similar issues with
import pandas as pd
import matplotlib
print(pd.__version__) # ---> 1.0.3
print(matplotlib.__version__) # ---> 3.1.3
# ok, line plot
pd.Series([5, 10, 20]).plot(color='r')
# ok, red dots
pd.Series([5, 10, 20]).plot(style='.', color='r')
# fails
pd.Series([5, 10, 20]).plot(style='o', color='r')
# the following only applies the first color (red)
# --> all three points are red
pd.Series([5, 10, 20]).plot(style='.', color=['r', 'g', 'b'])
With
import matplotlib.pyplot as plt
# works
plt.plot([5, 10, 20], marker='o', color='r')
# fails
plt.plot([5, 10, 20], marker='o', color=['r', 'g', 'b']) |
Hi @xuancong84
Are you interested in submitting a PR? |
PR = Pull Request, see contributing to pandas |
Hi @MarcoGorelli, I would love to contribute to this issue. Please let me know if I may :) |
of course! |
@MarcoGorelli As this will be my first contribution to the pandas repo, could you please give me some pointers on this issue? I would really appreciate that. |
@ankushduacodes for a start, read through the contributing guide linked above. After that, I think what needs to be done to close this issue is to reword the docstring for
This can probably be clarified, maybe by noting that if you only have a single column, then only the first colour in the list will be used |
In matplotlib.pyplot.bar, there is a keyword argument called 'color' which can set one color for all bars or a separate color for each bar, e.g.,

However, in pandas DataFrame.plot.bar, when a list is passed as 'color', the color of all bars is controlled only by the first element of the list, i.e.,

Ironically, if we pass a list of lists as 'color', we can control the color of each bar, i.e.,

So my question is: why is the behavior of the 'color' argument different from that in matplotlib? Is this inconsistency intended?