I am trying to adapt the answer to "pandas: pivoting on rank" to real data. Running the provided solution:
pd.pivot(df['id'], df.groupby('id'), df['loc'])
    .rename_axis(None)
    .rename_axis(None, axis=1)
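on my data raises the error:
Index._join_level on non-unique index is not implemented
Looking at the stack trace, I see that the problem originates in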
/data/qps/dm-conda/lib/python3.4/site-packages/pandas/core/reshape.py
in pivot(self, index, columns, values)
325 index = self.index
326 else:
--> 327 index = self[index]
328 indexed = Series(self[values].values,
329 index=MultiIndex.from_arrays([index, self[columns]]))
I can reproduce the error directly with just:
df[df['serial']]
(where df['serial'] is the equivalent of df['id'] in the example).
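For comparison, here is a minimal sketch of the same pivot written with column labels instead of Series objects. The traceback line `index = self[index]` suggests that pivot indexes the frame with whatever is passed as `index`, so passing `df['serial']` there would amount to `df[df['serial']]`. The toy data and the helper column name `rank` are assumptions for illustration, and `cumcount()` is assumed to be the ranking step; none of this is taken from the real data set.

```python
import pandas as pd

# Hypothetical toy frame; 'serial' and 'loc' mirror the column names in the question.
df = pd.DataFrame(
    {"serial": ["a", "a", "b", "b", "b"], "loc": ["x", "y", "x", "z", "w"]},
    index=[209442, 279040, 203222, 27126, 250496],
)

# Position of each row within its 'serial' group: 0, 1, 2, ...
# ('rank' is a helper column name chosen for this sketch.)
df["rank"] = df.groupby("serial").cumcount()

# Pass column *labels*, so pivot looks each column up by name.
wide = df.pivot(index="serial", columns="rank", values="loc")

# Drop the axis names, as in the chained calls above.
wide = wide.rename_axis(None).rename_axis(None, axis=1)

print(wide)
```

If this form works on the real data, the difference between a column label and a Series as the `index` argument is a likely place to look.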
What do I need to look at to understand and troubleshoot this error?
Here is the index of df:
In [36]: df.index
Out[36]:
Int64Index([209442, 279040, 203222, 27126, 250496, 458622, 9012, 595219,
289539, 82650,
...
864328, 719640, 416547, 410042, 862238, 723387, 410729, 117068,
413247, 412747],
dtype='int64', length=9366)
In [37]: df.index.is_unique
Out[37]: True
and df['serial'] has a unique index as well:
In [39]: df['serial'].index
Out[39]:
Int64Index([209442, 279040, 203222, 27126, 250496, 458622, 9012, 595219,
289539, 82650,
...
864328, 719640, 416547, 410042, 862238, 723387, 410729, 117068,
413247, 412747],
dtype='int64', length=9366)
In [40]: df['serial'].index.is_unique
Out[40]: True
The indexes appear to be unique, so what is causing the error?
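One further check that may be worth running: the `is_unique` calls above inspect the row labels, while the traceback shows pivot building a `MultiIndex` from the column's values. The sketch below, on hypothetical data, shows that the two can disagree. It is a diagnostic idea under that assumption, not a confirmed explanation of the error.

```python
import pandas as pd

# Hypothetical data: unique row labels, but repeated values in 'serial'.
df = pd.DataFrame(
    {"serial": [7, 7, 9], "loc": ["x", "y", "z"]},
    index=[209442, 279040, 203222],
)

# Uniqueness of the row labels (what df['serial'].index.is_unique reports).
labels_unique = df["serial"].index.is_unique

# Uniqueness of the values stored in the column.
values_unique = df["serial"].is_unique

# An Index built from the column's values, which is what ends up in the
# MultiIndex shown in the traceback.
values_as_index_unique = pd.Index(df["serial"]).is_unique

print(labels_unique, values_unique, values_as_index_unique)
```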