Based on class

Time: 2019-05-22 08:08:59

Tags: python python-3.x pandas-groupby

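I want to multiply two columns, but only for rows that belong to a specific class.

I tried multiplying the columns based on a condition, as shown below: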

import pandas as pd
import numpy as np

d = {'Values':[1,1,1],'Class':[0,1,0],'Weights':[0.8,0.9,0.7]}
dataset = pd.DataFrame(data = d)
print(dataset)

(dataset[dataset['Class']==1])['Values'] = (dataset[dataset['Class']==1])['Values']*dataset['Weights']

print(dataset)

But this does not change the dataset.

Then I tried this:

d = {'Values':[1,1,1],'Class':[0,1,0],'Weights':[0.8,0.9,0.7]}
dataset = pd.DataFrame(data = d)
print(dataset)

dataset['Weights'] = dataset['Weights']*dataset['Class']
replace_weights = {0:1}
dataset['Weights'] = dataset['Weights'].replace(replace_weights)

dataset['Values'] = dataset['Values']*dataset['Weights']

print(dataset)

This gives me the expected result, but I wonder whether there is a simpler or more elegant way.

My input DataFrame is:

   Values  Class  Weights
0       1      0      0.8
1       1      1      0.9
2       1      0      0.7

and the output DataFrame is:

   Values  Class  Weights
0     1.0      0      1.0
1     0.9      1      0.9
2     1.0      0      1.0

1 answer:

Answer 0 (score: 1)

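In pandas, changing the values of a slice of a DataFrame requires the loc indexer: an assignment through chained indexing such as dataset[dataset['Class']==1]['Values'] = ... is applied to a temporary copy and never reaches the original DataFrame. Apart from that, your code is correct.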
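To illustrate the difference, here is a minimal sketch that runs both forms on the same data. The names df, mask, after_chained and after_loc are made up for this example, and Values is created as float (an assumption, not in the original) so that writing 0.9 into it needs no dtype change.

```python
import warnings

import pandas as pd

# Same data as in the question, except that Values is float (assumption).
df = pd.DataFrame({'Values': [1.0, 1.0, 1.0],
                   'Class': [0, 1, 0],
                   'Weights': [0.8, 0.9, 0.7]})
mask = df['Class'] == 1

# Chained indexing: df[mask] returns a new object, so the assignment
# lands on that temporary object and df itself is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # pandas warns about this pattern; silenced for the demo
    df[mask]['Values'] = df[mask]['Values'] * df['Weights']
after_chained = df['Values'].tolist()

# Single .loc call: rows and column are selected in one indexing
# operation on df, so the assignment modifies df directly.
df.loc[mask, 'Values'] = df.loc[mask, 'Values'] * df.loc[mask, 'Weights']
after_loc = df['Values'].tolist()

print(after_chained)
print(after_loc)
```

The first list should still hold the original values, while the second should show the class-1 row multiplied by its weight.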

Going back to your code:

import pandas as pd
import numpy as np

d = {'Values':[1,1,1],'Class':[0,1,0],'Weights':[0.8,0.9,0.7]}
dataset = pd.DataFrame(data = d)

print(dataset)

   Class  Values  Weights
0      0       1      0.8
1      1       1      0.9
2      0       1      0.7

dataset.loc[dataset['Class']==1, 'Values'] = dataset[dataset['Class']==1]['Values']*dataset['Weights']

print(dataset)

   Class  Values  Weights
0      0     1.0      0.8
1      1     0.9      0.9
2      0     1.0      0.7
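Note that the loc version above leaves the Weights column unchanged, whereas the expected output in the question shows a weight of 1.0 for the class-0 rows. One possible alternative, sketched here with Series.where, builds that "effective weight" first and then multiplies once. This is only a sketch of one option rather than the answerer's method, and the name effective is made up for the example.

```python
import pandas as pd

d = {'Values': [1, 1, 1], 'Class': [0, 1, 0], 'Weights': [0.8, 0.9, 0.7]}
dataset = pd.DataFrame(data=d)

# Keep the weight where Class == 1 and use a neutral factor of 1.0 elsewhere.
effective = dataset['Weights'].where(dataset['Class'] == 1, 1.0)

# Assign whole columns, so no slice of the DataFrame is written to.
dataset['Values'] = dataset['Values'] * effective
dataset['Weights'] = effective  # optional: only needed to match the question's expected output

print(dataset)
```

Because whole columns are replaced rather than a slice, this form sidesteps chained assignment entirely. Drop the Weights assignment if the original weights should be preserved.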