Question

Here is my example:

import pandas as pd

df = pd.DataFrame({
    'Student': ['A',  'B', 'B'],
    'Assessor': ['C',  'D', 'D'],
    'Score': [72, 19, 92]})
df = df.pivot_table(
    index='Student',
    columns='Assessor',
    values='Score',
    aggfunc=lambda x: x)
print(df)

The output is the following:

Assessor    C       D
Student              
A          72     NaN
B         NaN  [1, 2]

I am not sure why I get '[1,2]' as output. I would expect something like:

Assessor    C       D
Student              
A          72     NaN
B         NaN     19
B         NaN     92

Here is a related question. If I replace my dataframe with

df = pd.DataFrame({
    'Student': ['A',  'B', 'B'],
    'Assessor': ['C',  'D', 'D'],
    'Score': ['foo', 'bar', 'foo']})

the same pivot raises an error instead of returning a table.

Any ideas?

Answer 1

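pivot_table finds the unique values of the index and columns, and aggregates whenever more than one row of the original DataFrame falls into the same cell.

Index and column labels are generally expected to be unique, so if you want the data in the form you describe, you have to do something a little ugly like this, although you probably don't want to.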
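To make the aggregation step concrete, here is a minimal sketch using the data from the question. The names `mean_table` and `list_table` are made up for illustration, and the second table is built with `groupby` plus `unstack` as an assumed stand-in for what `pivot_table` does internally, not as a statement about its actual implementation.

```python
import pandas as pd

df = pd.DataFrame({
    'Student': ['A', 'B', 'B'],
    'Assessor': ['C', 'D', 'D'],
    'Score': [72, 19, 92]})

# With the default aggfunc (mean), the two rows for (B, D)
# are collapsed into a single cell.
mean_table = df.pivot_table(
    index='Student', columns='Assessor', values='Score')

# Group on the same keys, collect every score of a cell into a list,
# then move 'Assessor' into the columns.
list_table = (
    df.groupby(['Student', 'Assessor'])['Score']
      .agg(list)
      .unstack('Assessor'))

print(mean_table)
print(list_table)
```

Either way there is exactly one row per student and one column per assessor; only the content of the (B, D) cell depends on the aggregation chosen.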

In [21]: pivoted = pd.DataFrame(columns=df['Assessor'], index=df['Student'])  # df is the original, unpivoted frame

In [22]: for row in df.itertuples(index=False):
    ...:     # read fields by name so the loop does not depend on column order
    ...:     pivoted.loc[row.Student, row.Assessor] = row.Score
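A different route to the layout the question asks for, offered only as a sketch and not as the approach of this answer, is to make each (Student, Assessor) pair unique with a running counter before reshaping. The helper column `n` and the name `wide` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Student': ['A', 'B', 'B'],
    'Assessor': ['C', 'D', 'D'],
    'Score': [72, 19, 92]})

# Number the repeated (Student, Assessor) pairs 0, 1, ... so that
# every row gets a unique position in the reshaped table.
numbered = df.assign(n=df.groupby(['Student', 'Assessor']).cumcount())

# No aggregation is needed now: each (Student, n, Assessor) key is unique.
wide = (
    numbered.set_index(['Student', 'n', 'Assessor'])['Score']
            .unstack('Assessor'))

# Dropping the helper level leaves the duplicated 'B' labels in the index.
print(wide.droplevel('n'))
```

The counter keeps both scores for student B on separate rows, at the cost of an index with repeated labels once the helper level is dropped.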

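As for your second question, that is because groupby generally fails when there are no numeric columns to aggregate, although it does look like a bug that it crashes outright. I added a note to the issue here.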
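For string data, one possible workaround is to pass an aggregation that is defined for strings. The sketch below assumes that joining the values of a cell with a comma is acceptable; the separator and the name `joined` are illustrative choices, not something the question specifies.

```python
import pandas as pd

df = pd.DataFrame({
    'Student': ['A', 'B', 'B'],
    'Assessor': ['C', 'D', 'D'],
    'Score': ['foo', 'bar', 'foo']})

# Concatenate all strings that land in the same cell,
# rather than relying on a numeric aggregation.
joined = df.pivot_table(
    index='Student',
    columns='Assessor',
    values='Score',
    aggfunc=lambda s: ', '.join(s))

print(joined)
```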

Pandas - understanding the output of a pivot table