How to change the values of a pandas DataFrame column for rows filtered by the values of other columns

Time: 2019-04-24 16:04:09

Tags: python r pandas dataframe filter

How can I write the following R code in Python?

df = data.frame( a=c("apple", "banana", "orange", "apple"),
                 b=c(NA, 3, NA, 5),
                 c=c(2, 1, 0, NA),
                 d=c(1, NA, NA, 3) )

df[ df$a =="apple" & !is.na(df$b), "c"] = df[ df$a =="apple" & !is.na(df$b), "d"]

I tried the following and got the error "TypeError: 'Series' objects are mutable, thus they cannot be hashed":

# Python code that receives an error
# df is Pandas DataFrame
df.loc[ (df.a=="apple") & ~df.b.isnull(), 'c'] = df.loc[ (df.a=="apple") & ~df.b.isnull(), 'd']

df['c'] = df.apply( lambda row: row['d'] if row['a']=="apple" & ~np.isnan(row['b']) else row['c'])

The expected result is that df['c'] contains [2, 1, 0, 3].

1 answer:

Answer 0: (score: 1)

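In pandas, select the rows with a boolean mask in .loc and assign df.d. The assignment aligns on the index, so only the masked rows are updated: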

df.loc[ (df.a =="apple") & (df.b.notnull()), "c"]=df.d
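
For reference, here is a minimal, self-contained sketch of that approach on the data from the question. The names mask and row_wise are only illustrative and do not come from the original post, and np.nan is assumed to stand in for R's NA. The row-wise apply variant is included for comparison: it appears to need axis=1 and plain `and`/`not` rather than `&`/`~`, although the masked .loc assignment is usually the simpler choice.

```python
import numpy as np
import pandas as pd

# Rebuild the data frame from the question; NaN plays the role of R's NA.
df = pd.DataFrame({
    "a": ["apple", "banana", "orange", "apple"],
    "b": [np.nan, 3, np.nan, 5],
    "c": [2, 1, 0, np.nan],
    "d": [1, np.nan, np.nan, 3],
})

# Row-wise alternative (illustrative): axis=1 passes each row to the lambda,
# and scalar comparisons use `and` / `not` instead of `&` / `~`.
row_wise = df.apply(
    lambda row: row["d"]
    if row["a"] == "apple" and not np.isnan(row["b"])
    else row["c"],
    axis=1,
)

# Boolean mask: rows where a is "apple" and b is not missing.
mask = (df["a"] == "apple") & df["b"].notnull()

# Copy d into c on the masked rows only; pandas aligns df["d"] by index.
df.loc[mask, "c"] = df["d"]

print(df["c"].tolist())
print(row_wise.tolist())
```

Because column c originally contains a missing value, it is stored as floats, so the result prints as floating-point numbers rather than integers.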