Question

My goal is to chart "Score" by 'Label'; I don't care about "date" and "Cusip". I want to use 'pivot' to reshape the data so that each Label is in its own column and I can boxplot it.

              date   Cusip    Label Score
663182  2015-07-31  00846UAG    AAA 138.15
663183  2015-07-31  00846UAH    AAA 171.93
663184  2015-07-31  00846UAJ    AAA 175.67
663185  2015-07-31  023767AA    BB  187.92
663186  2015-07-31  023770AA    BB  176.25

t.pivot(index=['date','Cusip'],columns='Label',values='Score')

The error says:

NotImplementedError: > 1 ndim Categorical are not supported at this time

More details:

C:\Anaconda3\lib\site-packages\pandas\core\categorical.py in __init__(self, values, categories, ordered, name, fastpath, levels)
    285             try:
--> 286                 codes, categories = factorize(values, sort=True)
    287             except TypeError:

C:\Anaconda3\lib\site-packages\pandas\core\algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
    184     uniques = vec_klass()
--> 185     labels = table.get_labels(vals, uniques, 0, na_sentinel, True)
    186 

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_labels (pandas\hashtable.c:13921)()

ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

Answer 1

You should really use pivot_table, since you have duplicate entries in your date column.

pd.pivot_table(df, values='Score', index=['date', 'Cusip'], columns=['Label']).boxplot()
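For anyone who wants to try this without the full dataset, here is a minimal sketch that rebuilds the five rows shown in the question (the DataFrame construction is my own; the real `df` is obviously larger) and applies the same pivot_table call. Keep in mind that pivot_table aggregates rows sharing the same index/column key, using the mean by default.

```python
import pandas as pd

# Sample rows copied from the question; the full dataset is assumed to look the same.
df = pd.DataFrame({
    "date": ["2015-07-31"] * 5,
    "Cusip": ["00846UAG", "00846UAH", "00846UAJ", "023767AA", "023770AA"],
    "Label": ["AAA", "AAA", "AAA", "BB", "BB"],
    "Score": [138.15, 171.93, 175.67, 187.92, 176.25],
})

# One row per (date, Cusip) pair and one column per Label.
# Cells with no matching row are NaN; duplicate keys are averaged (default aggfunc).
wide = pd.pivot_table(df, values="Score", index=["date", "Cusip"], columns="Label")
print(wide)

# wide.boxplot() would then draw one box per Label column (needs matplotlib installed).
```

The NaN cells are not a problem for the plot, because boxplot ignores missing values in each column.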


Answer 2

As an alternative to .pivot_table() (which may do unnecessary aggregation), you can

df.set_index(['date', 'Cusip','Label'])['Score'].unstack()
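A small sketch of how this behaves, again using the five sample rows from the question (the DataFrame construction and the `dup` frame are mine, for illustration only). The trade-off compared with pivot_table is that unstack does no aggregation at all, so it raises a ValueError when the same (date, Cusip, Label) combination appears more than once.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "date": ["2015-07-31"] * 5,
    "Cusip": ["00846UAG", "00846UAH", "00846UAJ", "023767AA", "023770AA"],
    "Label": ["AAA", "AAA", "AAA", "BB", "BB"],
    "Score": [138.15, 171.93, 175.67, 187.92, 176.25],
})

# unstack() moves the innermost index level (Label) into the columns.
wide = df.set_index(["date", "Cusip", "Label"])["Score"].unstack()
print(wide)

# Hypothetical duplicate: repeat the first row so one key occurs twice.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)
try:
    dup.set_index(["date", "Cusip", "Label"])["Score"].unstack()
    raised = False
except ValueError as exc:
    # unstack refuses to reshape when the index has duplicate entries.
    raised = True
    print("unstack failed:", exc)
```

So if your data can contain repeated keys, either deduplicate first or fall back to pivot_table with an explicit aggfunc.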

pivot table error: 1 ndim Categorical are not supported at this time