Question

I am new to Python. I have the following dataframe column:

I also have the following list of values:

cols = [100,156,160,162,200,256,262,2200,2600,2900,3600,4600]

I want to replace each value in the Predict column with 0 if it is not in cols.

So the result would look like:

predict
    100
    200
      0
   2200
      0
   3600

I tried:

compare_df[~compare_df.isin(cols)] = 0

But I get this error:

TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

Can anyone help me? Thanks.

Answer 1

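You need to work with a Series rather than a one-column DataFrame: build the mask from the Predict column, then use loc with the column name to replace the values in Predict: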

compare_df.loc[~compare_df['Predict'].isin(cols), 'Predict'] = 0

If you drop loc and the column name, every column in the rows selected by the mask is set to 0:

compare_df[~compare_df['Predict'].isin(cols)] = 0
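To make the difference between the two forms concrete, here is a minimal sketch. The sample values and the extra Other column are made up for illustration; the question does not show the original frame.

```python
import pandas as pd

cols = [100, 156, 160, 162, 200, 256, 262, 2200, 2600, 2900, 3600, 4600]

# Hypothetical data, not taken from the question.
compare_df = pd.DataFrame({
    'Predict': [100, 200, 300, 2200, 5000, 3600],
    'Other': [1, 2, 3, 4, 5, 6],
})

# True where the Predict value is not in cols
mask = ~compare_df['Predict'].isin(cols)

# With loc and the column name: only Predict is changed.
with_loc = compare_df.copy()
with_loc.loc[mask, 'Predict'] = 0

# Without loc and the column name: every column in the masked rows is changed.
without_loc = compare_df.copy()
without_loc[mask] = 0

print(with_loc)
print(without_loc)
```

In this sketch the Other column stays untouched in with_loc, while without_loc also has zeros in Other for the rows where the mask is True.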

For the alternative with numpy.where, also select the Predict column:

import numpy as np
compare_df['Predict'] = np.where(compare_df['Predict'].isin(cols),compare_df['Predict'], 0)

With a single-column DataFrame, this shorter form also works:

compare_df['Predict'] = np.where(compare_df.isin(cols),compare_df, 0)
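As a rough illustration of the numpy.where variant, the sketch below runs it on made-up sample values (the frame contents are an assumption, not the asker's data). It shows that np.where returns a plain NumPy array, which is then assigned back to the column.

```python
import numpy as np
import pandas as pd

cols = [100, 156, 160, 162, 200, 256, 262, 2200, 2600, 2900, 3600, 4600]

# Hypothetical data, not taken from the question.
compare_df = pd.DataFrame({'Predict': [100, 200, 300, 2200, 5000, 3600]})

# Boolean Series: True where the value is in cols
in_cols = compare_df['Predict'].isin(cols)

# Keep the value where in_cols is True, otherwise use 0.
result = np.where(in_cols, compare_df['Predict'], 0)
print(type(result))

# Assign the array back to the column (positions line up with the rows).
compare_df['Predict'] = result
print(compare_df)
```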

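Edit:

The comparison requires the same type in both the column and the list, e.g. both numeric or both object (typically strings).

So if both should be strings: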

cols = [str(x) for x in cols]
compare_df.loc[~compare_df['Predict'].isin(cols), 'Predict'] = 0

Or if both should be numeric:

compare_df['Predict'] = compare_df['Predict'].astype(float)
compare_df.loc[~compare_df['Predict'].isin(cols), 'Predict'] = 0

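If .astype(float) fails (for example because some values are not numeric), use pd.to_numeric with errors='coerce', which turns unparseable values into NaN: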

compare_df['Predict'] = pd.to_numeric(compare_df['Predict'], errors='coerce')
compare_df.loc[~compare_df['Predict'].astype(float).isin(cols), 'Predict'] = 0
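The type mismatch described above can be sketched as follows. The sample column (strings, including one non-numeric entry) and the shortened cols list are assumptions chosen to show the effect.

```python
import pandas as pd

# Shortened, hypothetical list of allowed values (integers)
cols = [100, 200, 2200, 3600]

# Hypothetical column stored as strings, with one non-numeric entry
compare_df = pd.DataFrame({'Predict': ['100', '200', '300', 'abc', '3600']})

# The strings do not compare equal to the integers in cols,
# so without a conversion every row would be replaced by 0.
any_match_before = compare_df['Predict'].isin(cols).any()
print(any_match_before)

# Convert to numbers; entries that cannot be parsed become NaN.
compare_df['Predict'] = pd.to_numeric(compare_df['Predict'], errors='coerce')

# NaN is not in cols, so those rows are set to 0 as well.
compare_df.loc[~compare_df['Predict'].isin(cols), 'Predict'] = 0
print(compare_df)
```

Because of the NaN produced for 'abc', the converted column is float, so the kept values appear as 100.0, 200.0 and 3600.0.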

Answer 2

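This is a job for Series.where (DataFrame.where works the same way on the whole frame, as in the first snippet). It fits better than np.where here, because you only need to assign 0 where the value is not in cols; all other values are kept as they are.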

new_df=df.where(df.isin(cols),0)
print(new_df)

If there is more than one column:

new_df=df.copy()
new_df['Predict']=df['Predict'].where(df['Predict'].isin(cols),0)
print(new_df)
   Predict
0      100
1      200
2        0
3     2200
4        0
5        0
6     3600
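One practical difference from np.where is that Series.where returns a Series that keeps the original index. A small sketch, using a made-up frame with string labels as the index (both the values and the labels are assumptions):

```python
import pandas as pd

cols = [100, 156, 160, 162, 200, 256, 262, 2200, 2600, 2900, 3600, 4600]

# Hypothetical data with a non-default index
df = pd.DataFrame({'Predict': [100, 200, 300, 2200]}, index=['a', 'b', 'c', 'd'])

# Keep values that are in cols, replace the rest with 0.
kept = df['Predict'].where(df['Predict'].isin(cols), 0)

print(type(kept))
print(kept)
```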

If they have different types:

new_df=df.copy()
# cast Predict to the type of the values in cols:
# use str if cols holds strings, or the commented int cast if cols holds integers
new_df['Predict']=new_df['Predict'].astype(str)
#new_df['Predict']=new_df['Predict'].astype(int)
# compare and replace on the converted column, not on the original df
new_df['Predict']=new_df['Predict'].where(new_df['Predict'].isin(cols),0)
print(new_df)
