Question

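I want to manipulate pandas rows based on a condition. For example, if the field in type is a numpy NaN, I want to overwrite the corresponding field in method with the string 'BBBB':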

newcolumn = []
for index, row in results_DF.iterrows():
    newcolumn.append('BBBB' if row['type'] is np.nan else row['method'])
results_DF['method'] = pd.Series(newcolumn)

This implementation looks ugly. How can I write it better, in a more functional style?

Answer 1

Use DataFrame.loc with a boolean mask created by Series.isna:

results_DF.loc[results_DF['type'].isna(), 'method'] = 'BBBB'

# older pandas versions
#results_DF.loc[results_DF['type'].isnull(), 'method'] = 'BBBB'
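
As a side note on why an isna mask is more robust than an `is np.nan` identity test like the one in the question: a NaN value is not necessarily the same object as np.nan, so an identity check can miss it. A minimal sketch (the variable names are illustrative only, not from the question):

```python
import numpy as np
import pandas as pd

# Two NaN values that are not the np.nan object itself
py_nan = float('nan')
np_nan = np.float64('nan')

# Identity comparison fails for both of them
identity_checks = [py_nan is np.nan, np_nan is np.nan]

# Equality does not help either: NaN never compares equal to itself
equality_check = np.nan == np.nan

# pd.isna recognises all of them as missing
isna_checks = [bool(pd.isna(v)) for v in (py_nan, np_nan, np.nan)]

print(identity_checks)
print(equality_check)
print(isna_checks)
```

Series.isna applies the same missing-value test to a whole column at once, which is what makes the mask-based assignment possible.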

Another solution with numpy.where:

results_DF['method'] = np.where(results_DF['type'].isna(), 'BBBB', results_DF['method'])

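Or @Jon Clements's solution with Series.where, thanks:

# call where on the 'method' column, not on the whole DataFrame
results_DF['method'] = results_DF['method'].where(results_DF['type'].notnull(), 'BBBB')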

Sample:

results_DF = pd.DataFrame({'method': ['a','s','d'],
                           'type':[np.nan, np.nan, 4]})

print (results_DF)
  method  type
0      a   NaN
1      s   NaN
2      d   4.0

results_DF.loc[results_DF['type'].isna(), 'method'] = 'BBBB'
print (results_DF)
  method  type
0   BBBB   NaN
1   BBBB   NaN
2      d   4.0
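
For comparison, here is a minimal sketch that applies the three variants (DataFrame.loc, numpy.where, Series.where) to fresh copies of the same sample data and checks that they agree. The helper make_frame and the df_* names are assumptions introduced only for this illustration:

```python
import numpy as np
import pandas as pd

def make_frame():
    # Same sample data as in the example above
    return pd.DataFrame({'method': ['a', 's', 'd'],
                         'type': [np.nan, np.nan, 4]})

# 1) Boolean mask with DataFrame.loc (modifies the frame in place)
df_loc = make_frame()
df_loc.loc[df_loc['type'].isna(), 'method'] = 'BBBB'

# 2) numpy.where: pick 'BBBB' where the mask is True, else the old value
df_np = make_frame()
df_np['method'] = np.where(df_np['type'].isna(), 'BBBB', df_np['method'])

# 3) Series.where: keep values where the condition is True, replace the rest
df_where = make_frame()
df_where['method'] = df_where['method'].where(df_where['type'].notna(), 'BBBB')

print(df_loc['method'].tolist())  # ['BBBB', 'BBBB', 'd']
print(df_loc['method'].tolist()
      == df_np['method'].tolist()
      == df_where['method'].tolist())  # True
```

Note the opposite polarity: numpy.where takes the condition under which the replacement is used, while Series.where takes the condition under which the original value is kept.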

Answer 2

Try this:

mask = results_DF['type'].isnull()
# pass the column label so that only 'method' is overwritten in the matching rows
results_DF.loc[mask, 'method'] = 'BBBB'
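
Whether a column label is passed to .loc changes what gets overwritten: with only a row mask, every column in the matching rows receives the value. A minimal sketch, assuming a hypothetical frame whose type column holds strings (so that writing 'BBBB' into it does not involve a dtype change); the names df_all and df_col are illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical data: both columns hold strings, 'type' has two missing entries
df_all = pd.DataFrame({'method': ['a', 's', 'd'],
                       'type': [None, None, 'x']})
df_col = df_all.copy()

mask = df_all['type'].isna()

# Row mask only: all columns of the matching rows are set to 'BBBB'
df_all.loc[mask] = 'BBBB'

# Row mask plus column label: only 'method' is changed
df_col.loc[mask, 'method'] = 'BBBB'

print(df_all)
print(df_col)
```

In df_all the missing values in type are replaced as well, whereas df_col keeps them and only method changes.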
