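I want to multiply two columns, but only for rows that belong to a specific class.
I tried multiplying the columns conditionally, like this: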
import pandas as pd
import numpy as np
d = {'Values':[1,1,1],'Class':[0,1,0],'Weights':[0.8,0.9,0.7]}
dataset = pd.DataFrame(data = d)
print(dataset)
(dataset[dataset['Class']==1])['Values'] = (dataset[dataset['Class']==1])['Values']*dataset['Weights']
print(dataset)
But this does not change the dataset.
Then I tried this:
d = {'Values':[1,1,1],'Class':[0,1,0],'Weights':[0.8,0.9,0.7]}
dataset = pd.DataFrame(data = d)
print(dataset)
dataset['Weights'] = dataset['Weights']*dataset['Class']
replace_weights = {0:1}
dataset['Weights'] = dataset['Weights'].replace(replace_weights)
dataset['Values'] = dataset['Values']*dataset['Weights']
print(dataset)
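This gives me the expected result, but I am wondering whether there is a simpler or more elegant way to do it.
My input DataFrame is:
   Values  Class  Weights
0       1      0      0.8
1       1      1      0.9
2       1      0      0.7
and the output DataFrame is:
   Values  Class  Weights
0     1.0      0      1.0
1     0.9      1      0.9
2     1.0      0      1.0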
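Answer 0 (score: 1)
In pandas, you have to use the loc indexer when you want to change the values of a slice of a DataFrame. An expression such as dataset[mask]['Values'] = ... is chained indexing: the first [] can return a temporary copy, so the assignment never reaches the original DataFrame. Apart from that, your code is correct.
Going back to your code: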
import pandas as pd
import numpy as np
d = {'Values':[1,1,1],'Class':[0,1,0],'Weights':[0.8,0.9,0.7]}
dataset = pd.DataFrame(data = d)
print(dataset)
   Class  Values  Weights
0      0       1      0.8
1      1       1      0.9
2      0       1      0.7
dataset.loc[dataset['Class']==1, 'Values'] = dataset[dataset['Class']==1]['Values']*dataset['Weights']
print(dataset)
   Class  Values  Weights
0      0     1.0      0.8
1      1     0.9      0.9
2      0     1.0      0.7
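If you would like something more compact, one possible alternative is np.where, which builds the whole column in a single step. This is only a sketch under the same sample data as above; the variable name mask is my own and not something from your code. Because it assigns a complete new column, it also avoids writing floats into an integer column through loc, which I believe newer pandas versions warn about.

```python
import numpy as np
import pandas as pd

d = {'Values': [1, 1, 1], 'Class': [0, 1, 0], 'Weights': [0.8, 0.9, 0.7]}
dataset = pd.DataFrame(data=d)

# Boolean mask selecting the rows to scale (hypothetical name, chosen for illustration).
mask = dataset['Class'] == 1

# Where the mask is True take Values * Weights, otherwise keep Values unchanged.
dataset['Values'] = np.where(mask, dataset['Values'] * dataset['Weights'], dataset['Values'])
print(dataset)
```

Note that this leaves the Weights column untouched. In your expected output the weights for Class 0 become 1.0, but that looks like a side effect of your workaround rather than a requirement.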