Question

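I am trying to pivot the following kind of sample data in a Pandas DataFrame in Python. I came across a few other Stack Overflow answers that discuss how to pivot, such as: pivot_table No numeric types to aggregate

I can pivot the data when I use pivot_table(). When I use set_index() and unstack() instead, I get the following error:

AttributeError: 'NoneType' object has no attribute 'unstack'

Sample data: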

id  responseTime    label   answers
ABC 2018-06-24  Category_1  [3]
ABC 2018-06-24  Category_2  [10]
ABC 2018-06-24  Category_3  [10]
DEF 2018-06-25  Category_1  [7]
DEF 2018-06-25  Category_8  [10]
GHI 2018-06-28  Category_3  [7]

Desired output:

id  responseTime    category_1  category_2 category_3 category_8
ABC  2018-06-24           [3]     [10]         [10]       NULL
DEF  2018-06-25           [7]     NULL         NULL       [10]
GHI  2018-06-28           NULL    NULL         [7]        NULL

This works:

 df=pdDF.pivot_table(index=['items_id','responseTime'], columns='label', values='answers', aggfunc='first')

This does not work:

pdDF.set_index(['items_id','responseTime','label'], append=True, inplace=True).unstack('label')

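I also used pdDF[pdDF.isnull().any(axis=1)] to make sure there is no null data in the answers column. I also tried append=False, but the same error occurred.

From other threads, it appears that set_index() and unstack() are more efficient than pivot_table(). I would also rather not use pivot_table(), because it requires an aggregation function and my answers column does not contain numeric data. I did not want the default (mean()), so I ended up using first(). Any insight into why one method works and the other does not?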

Answer 1

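AttributeError: 'NoneType' object has no attribute 'unstack'

When you call set_index with inplace=True, it modifies the DataFrame itself and returns nothing (None). That is why the chained call fails: unstack is being called on None rather than on a DataFrame.

inplace : boolean, default False

Modify the DataFrame in place (do not create a new object)

Use: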
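
A minimal sketch of this behaviour, assuming a small made-up frame (the column names mirror the question's code; the rows are illustrative only):

```python
import pandas as pd

# Illustrative frame; the values are placeholders, not the asker's real data.
pdDF = pd.DataFrame({
    "items_id": ["ABC", "ABC", "DEF"],
    "responseTime": ["2018-06-24", "2018-06-24", "2018-06-25"],
    "label": ["Category_1", "Category_2", "Category_1"],
    "answers": [[3], [10], [7]],
})

keys = ["items_id", "responseTime", "label"]

# With inplace=True the frame is modified and the call itself returns None,
# so there is nothing to chain .unstack() onto.
in_place = pdDF.copy()
result = in_place.set_index(keys, inplace=True)
print(result)                 # None
print(in_place.index.names)   # the index was still set on in_place

# Without inplace, set_index returns a new DataFrame that can be chained.
chained = pdDF.set_index(keys).unstack("label")
print(chained)
```

If the in-place form is preferred, the unstack call can go on a separate line (in_place.unstack("label")) after the set_index call.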

df1 = pdDF.set_index(['items_id','responseTime','label']).unstack('label')    
print(df1)

# Output:

id  responseTime    category_1  category_2 category_3 category_8
ABC  2018-06-24           [3]     [10]         [10]       NULL
DEF  2018-06-25           [7]     NULL         NULL       [10]
GHI  2018-06-28           NULL    NULL         [7]        NULL
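
The call above keeps "answers" as an outer column level. The following sketch, which rebuilds the question's sample data under the assumption that the first column is named items_id as in the question's code, selects the answers column before unstacking so that the column labels come out flat, closer to the desired output. It also assumes each (items_id, responseTime, label) combination is unique; unstack raises a ValueError on duplicate index entries, whereas pivot_table with aggfunc='first' would keep the first one.

```python
import pandas as pd

# Sample data from the question; the first column is assumed to be items_id.
pdDF = pd.DataFrame({
    "items_id": ["ABC", "ABC", "ABC", "DEF", "DEF", "GHI"],
    "responseTime": ["2018-06-24", "2018-06-24", "2018-06-24",
                     "2018-06-25", "2018-06-25", "2018-06-28"],
    "label": ["Category_1", "Category_2", "Category_3",
              "Category_1", "Category_8", "Category_3"],
    "answers": [[3], [10], [10], [7], [10], [7]],
})

# Pick the answers column first, so unstack produces one flat column per
# label instead of ("answers", label) pairs.
wide = (
    pdDF.set_index(["items_id", "responseTime", "label"])["answers"]
        .unstack("label")
        .reset_index()
)
wide.columns.name = None  # drop the leftover "label" name on the column axis
print(wide)
```

Missing combinations show up as NaN rather than NULL in the printed frame.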

Python pandas DataFrame pivot works only with pivot_table(), not with set_index() and unstack()
