Question

I would like to add a rescaled column to this data frame:

I,Value
A,1
A,4
A,2
A,5
B,1
B,2
B,1

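so that the new column (let's call it scale) is computed by applying a function to the Value column within each group of I. The function is just a normalisation over the range of each group: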

lambda x: (x-min(x))/(max(x)-min(x))

So far, I have tried:

d = df.groupby('I').apply(lambda x: (x-min(x))/(max(x)-min(x)))

which raises the following TypeError:

TypeError: Could not operate array(['A'], dtype=object) with block values index 1 is out of bounds for axis 1 with size 1

Answer 1

If you select the 'Value' column in your code, then it will work:

In [69]:
df.groupby('I')['Value'].apply(lambda x: (x-min(x))/(max(x)-min(x)))

Out[69]:
0    0.00
1    0.75
2    0.25
3    1.00
4    0.00
5    1.00
6    0.00
dtype: float64
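
As for why the column selection matters: without it, each group handed to the lambda is a whole DataFrame that still contains the string column I. One likely contributor to the error, shown here as a minimal sketch on the question's data rather than a full trace of the original failure, is that the builtin min() iterates over a DataFrame's column labels instead of its values:

```python
import pandas as pd

# Data frame rebuilt from the question.
df = pd.DataFrame({"I": list("AAAABBB"), "Value": [1, 4, 2, 5, 1, 2, 1]})

# Iterating over a DataFrame yields its column labels, so the builtin
# min() compares the labels "I" and "Value", not the data.
label_min = min(df)

# On a single column (a Series), min() iterates over the values.
value_min = min(df["Value"])

print(repr(label_min), value_min)
```

Subtracting a column label from numeric data is not meaningful, which is why restricting the operation to df.groupby('I')['Value'] avoids the problem.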

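The version using the pandas methods (x.min() and x.max() instead of the builtins) is as follows; it produces the same result: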

In [67]:
df['Normalised'] = df.groupby('I')['Value'].apply(lambda x: (x-x.min())/(x.max()-x.min()))
df

Out[67]:
   I  Value  Normalised
0  A      1        0.00
1  A      4        0.75
2  A      2        0.25
3  A      5        1.00
4  B      1        0.00
5  B      2        1.00
6  B      1        0.00
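
Depending on the pandas version, the result of groupby(...).apply may carry the group key as an extra index level, in which case assigning it straight back to df may not align. A hedged alternative, not part of the original answer, is groupby(...).transform, which returns a result indexed like the input. The helper name min_max_scale below is made up for illustration:

```python
import pandas as pd

# Data frame rebuilt from the question.
df = pd.DataFrame({"I": list("AAAABBB"), "Value": [1, 4, 2, 5, 1, 2, 1]})

def min_max_scale(s):
    # Hypothetical helper: rescale one group's values to the range [0, 1].
    # Note: a group whose values are all equal would divide by zero here.
    return (s - s.min()) / (s.max() - s.min())

# transform applies the function per group and keeps the original index,
# so the result can be assigned directly as a new column.
df["Normalised"] = df.groupby("I")["Value"].transform(min_max_scale)

print(df)
```

This should yield the same Normalised column as shown above.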
