Have a look at this example:

import pandas as pd
import numpy as np
from StringIO import StringIO

print "Pandas version %s\n\n" % pd.__version__

data1 = """idx,metric
0,2.1
1,2.5
2,3"""

data2 = """idx,metric
0,2.7
1,2.2
2,2.8"""

df1 = pd.read_csv(StringIO(data1))
df2 = pd.read_csv(StringIO(data2))
concatenated = pd.concat([df1, df2], ignore_index=True)
merged = concatenated.groupby("idx").agg([np.mean, np.std])
print merged
print merged.sort('metric')
and its output:
$ python test.py
Pandas version 0.11.0


     metric
       mean       std
idx
0      2.40  0.424264
1      2.35  0.212132
2      2.90  0.141421
Traceback (most recent call last):
  File "test.py", line 22, in <module>
    print merged.sort('metric')
  File "/***/Python-2.7.3/lib/python2.7/site-packages/pandas/core/frame.py", line 3098, in sort
    inplace=inplace)
  File "/***/Python-2.7.3/lib/python2.7/site-packages/pandas/core/frame.py", line 3153, in sort_index
    % str(by))
ValueError: Cannot sort by duplicate column metric
The problem here is not that there is a duplicate column metric, as the error message claims. The problem is that the columns still have two sub-levels (mean and std) under metric, so 'metric' alone does not identify a single column. The solution in this case is to use
merged.sort([('metric', 'mean')])
for sorting by the mean of the metric. It took me quite a while to figure this out. First of all, the error message should be clearer in this case. Also, maybe I just overlooked it, but I could not find the solution in the docs, only in a thread on StackOverflow. It looks like the error message above is the result of an over-generalized condition around https://github.com/pydata/pandas/blob/v0.12.0rc1/pandas/core/frame.py#L3269
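For anyone reading this on a newer setup: DataFrame.sort was removed from pandas in later releases in favour of sort_values, so the workaround above needs a small translation. Below is a minimal sketch of the same idea, assuming Python 3 and a recent pandas (neither is what the report above was run on). The variable name sorted_by_mean is mine, and the aggregations are passed as strings instead of np.mean / np.std.

```python
import io

import pandas as pd

# Same sample data as in the report above.
data1 = """idx,metric
0,2.1
1,2.5
2,3"""

data2 = """idx,metric
0,2.7
1,2.2
2,2.8"""

df1 = pd.read_csv(io.StringIO(data1))
df2 = pd.read_csv(io.StringIO(data2))
concatenated = pd.concat([df1, df2], ignore_index=True)

# Aggregating with a list of functions produces two-level columns:
# ('metric', 'mean') and ('metric', 'std').
merged = concatenated.groupby("idx").agg(["mean", "std"])
print(merged.columns.tolist())

# Assumption: a recent pandas, where sort_values replaces the old sort.
# The sort key is the full column tuple, not just the top-level name.
sorted_by_mean = merged.sort_values(by=[("metric", "mean")])
print(sorted_by_mean)
```

The point is the same as in the original workaround: with two-level columns, the key has to name the complete (top level, sub-level) pair.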
related #739