Question

I have a DataFrame like this:

import pandas as pd

lis = [['a','b','c'],
       ['17','10','6'],
       ['5','30','x'],
       ['78','50','2'],
       ['4','58','x']]
# The first row holds the column names; the remaining rows are the data.
df = pd.DataFrame(lis[1:],columns=lis[0])

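How can I write a function that, wherever 'x' appears in column c, overwrites it with the corresponding value from column b? The result would be: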

[['a','b','c'],
['17','10','6'],
['5','30','30'],
['78','50','2'],
['4','58','58']]

Answer 1

Use np.where (a .loc-based assignment would also work):

import numpy as np
df['c'] = np.where(df['c'] == 'x', df['b'], df['c'])
df
Out[569]: 
    a   b   c
0  17  10   6
1   5  30  30
2  78  50   2
3   4  58  58
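
The answer mentions .loc but only shows np.where, so here is a minimal sketch of what a .loc variant might look like. It assumes the same df as in the question; the name `mask` is only illustrative and does not come from the original answer.

```python
import pandas as pd

lis = [['a', 'b', 'c'],
       ['17', '10', '6'],
       ['5', '30', 'x'],
       ['78', '50', '2'],
       ['4', '58', 'x']]
df = pd.DataFrame(lis[1:], columns=lis[0])

# Boolean mask marking the rows where column c holds the placeholder 'x'.
mask = df['c'] == 'x'

# Copy column b into column c, but only on the masked rows.
df.loc[mask, 'c'] = df.loc[mask, 'b']
```

Unlike np.where, which rebuilds the whole column, this only touches the rows selected by the mask.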

Answer 2

This should do the trick:

import numpy as np
df.c = np.where(df.c == 'x',df.b, df.c)
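
The question asks for a function, so one possible way to wrap the same idea is sketched below. The function name `fill_from_column` and its parameters are hypothetical, and it uses pandas' Series.mask instead of np.where, which should give the same result here.

```python
import pandas as pd

def fill_from_column(df, target='c', source='b', placeholder='x'):
    """Return a copy of df in which placeholder values in `target`
    are replaced by the value from `source` in the same row."""
    out = df.copy()
    # Series.mask replaces values where the condition is True.
    out[target] = out[target].mask(out[target] == placeholder, out[source])
    return out

df = pd.DataFrame({'a': ['17', '5', '78', '4'],
                   'b': ['10', '30', '50', '58'],
                   'c': ['6', 'x', '2', 'x']})
result = fill_from_column(df)
```

Working on a copy leaves the original df unchanged; assign the return value back to df if you want to overwrite it.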

Answer 3

I'm not a pandas user, but if you want to transform lis itself you could do this:

>>> [x if x[2] != "x" else [x[0], x[1], x[1]] for x in lis]
[['a','b','c'],
['17','10','6'],
['5','30','30'],
['78','50','2'],
['4','58','58']]
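
Note that the comprehension above builds a new list rather than changing lis in place. A slightly more general sketch, assuming the first row is always the header, looks up the column positions by name instead of hard-coding the indexes; the names `b_idx`, `c_idx`, and `fixed` are illustrative only.

```python
lis = [['a', 'b', 'c'],
       ['17', '10', '6'],
       ['5', '30', 'x'],
       ['78', '50', '2'],
       ['4', '58', 'x']]

header = lis[0]
b_idx = header.index('b')  # position of the source column
c_idx = header.index('c')  # position of the column to repair

fixed = [header]
for row in lis[1:]:
    new_row = list(row)  # copy so the original rows stay untouched
    if new_row[c_idx] == 'x':
        new_row[c_idx] = new_row[b_idx]
    fixed.append(new_row)
```

The result can then be passed to pd.DataFrame(fixed[1:], columns=fixed[0]) exactly as in the question.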

Overwriting values in a Pandas DataFrame by iteration?

3 answers: