Convert a Pandas GroupBy to a DataFrame

Time: 2018-05-08 14:01:28

Tags: pandas

I have the following DataFrame, table_1:

     Sample  method       value
3   sample1  method_0     1
3   sample1  method_1     2
3   sample1  method_2     3
3   sample1  method_3     4
3   sample2  method_0     5
3   sample2  method_1     6
3   sample2  method_2     7
3   sample2  method_3     8


grouped = table_1.groupby('method')

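I want to group by 'method' and then, for each group, divide that group's 'value' column by another Series that has the same number of entries as each group. I achieved this with: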

table_2 = grouped.apply(lambda x: x['value'].div(series_of_two_elements))

But now I want to merge table_2 back into each group in table_1. When I try:

table_1['normalized'] = table_2

I get:

TypeError: 'DataFrameGroupBy' object does not support item assignment

How can I convert table_1 back to a DataFrame so that I can assign these new normalized values to each group? Can I use a lambda expression with df.transform?
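The error message names DataFrameGroupBy, which suggests the assignment target was a GroupBy object rather than the DataFrame itself. The following is a minimal sketch of that situation, assuming a table shaped like table_1 but with a default integer index; the `message` variable is introduced here only for illustration.

```python
import pandas as pd

# Assumed reconstruction of table_1 (default RangeIndex instead of the
# repeated index label 3 shown in the question).
table_1 = pd.DataFrame({
    "Sample": ["sample1"] * 4 + ["sample2"] * 4,
    "method": ["method_0", "method_1", "method_2", "method_3"] * 2,
    "value": [1, 2, 3, 4, 5, 6, 7, 8],
})
grouped = table_1.groupby("method")

message = None
try:
    # A GroupBy object supports column selection but not column assignment.
    grouped["normalized"] = 0
except TypeError as exc:
    message = str(exc)
    print(message)

# The same assignment on the underlying DataFrame is allowed.
table_1["normalized"] = 0
```

The groupby call does not change table_1, so the original DataFrame remains available for assignment as long as its name has not been rebound to the GroupBy object.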

1 answer:

Answer 0: (score: 2)

I think you need GroupBy.transform, converting the Series to a NumPy array with .values to avoid index alignment:

series_of_two_elements = pd.Series([1,2])

grouped = table_1.groupby('method')
table_2 = grouped['value'].transform(lambda x: x.div(series_of_two_elements.values))
table_1['normalized'] = table_2

print (table_1)
    Sample    method  value  normalized
3  sample1  method_0      1         1.0
3  sample1  method_1      2         2.0
3  sample1  method_2      3         3.0
3  sample1  method_3      4         4.0
3  sample2  method_0      5         2.5
3  sample2  method_1      6         3.0
3  sample2  method_2      7         3.5
3  sample2  method_3      8         4.0
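For reference, here is a self-contained sketch of the transform approach. It assumes a default integer index rather than the repeated label 3 from the post, and the names `divisor`, `one_group` and `aligned` are illustrative only. The last lines show what index alignment does when the Series is used directly instead of its NumPy values.

```python
import pandas as pd

# Assumed reconstruction of table_1 with a default RangeIndex.
table_1 = pd.DataFrame({
    "Sample": ["sample1"] * 4 + ["sample2"] * 4,
    "method": ["method_0", "method_1", "method_2", "method_3"] * 2,
    "value": [1, 2, 3, 4, 5, 6, 7, 8],
})
series_of_two_elements = pd.Series([1, 2])

# A plain NumPy array carries no index, so the division is positional.
divisor = series_of_two_elements.to_numpy()

# transform returns a Series indexed like table_1, so it can be assigned directly.
table_1["normalized"] = (
    table_1.groupby("method")["value"].transform(lambda x: x / divisor)
)
print(table_1)

# Dividing by the Series itself aligns on index labels instead of positions.
# The method_0 group has labels 0 and 4, the divisor has labels 0 and 1,
# so only label 0 finds a partner and the other entries become NaN.
one_group = table_1.loc[table_1["method"] == "method_0", "value"]
aligned = one_group.div(series_of_two_elements)
print(aligned)
```

In current pandas, `.to_numpy()` is the recommended spelling of `.values` for this purpose; both yield an array without index labels.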

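Another possible solution is to create a MultiIndex whose second level comes from cumcount, and then use div on that second level (the Series named series_of_two_elements must have the same index values as the second level):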

series_of_two_elements = pd.Series([1,2])

table_1 = table_1.set_index(['method', table_1.groupby('method').cumcount()])
table_1['normalized'] = table_1['value'].div(series_of_two_elements, level=1)
print (table_1)
             Sample  value  normalized
method                                
method_0 0  sample1      1         1.0
method_1 0  sample1      2         2.0
method_2 0  sample1      3         3.0
method_3 0  sample1      4         4.0
method_0 1  sample2      5         2.5
method_1 1  sample2      6         3.0
method_2 1  sample2      7         3.5
method_3 1  sample2      8         4.0
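The MultiIndex variant can be sketched end to end as follows, again assuming a default integer index on the input. The names `df`, `position`, `indexed` and `flat` are illustrative, and the final reset_index step is an addition that shows one way to get back to ordinary columns.

```python
import pandas as pd

# Assumed reconstruction of table_1 with a default RangeIndex.
df = pd.DataFrame({
    "Sample": ["sample1"] * 4 + ["sample2"] * 4,
    "method": ["method_0", "method_1", "method_2", "method_3"] * 2,
    "value": [1, 2, 3, 4, 5, 6, 7, 8],
})
series_of_two_elements = pd.Series([1, 2])  # index labels 0 and 1

# Position of each row inside its method group: 0 for the first row, 1 for the second.
position = df.groupby("method").cumcount()

# Use (method, position) as a two-level index.
indexed = df.set_index(["method", position])

# level=1 matches the divisor's labels against the second index level.
indexed["normalized"] = indexed["value"].div(series_of_two_elements, level=1)

# Drop the helper level and turn 'method' back into a regular column.
flat = indexed.reset_index(level=1, drop=True).reset_index()
print(flat)
```

Unlike the transform version, this one relies on label matching, so it only works when the divisor's index labels coincide with the cumcount positions.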