Compared with string.replace(s, old, new[, maxreplace]), the pandas.DataFrame.replace() function seems to lack a parameter that limits the number of occurrences to replace.
For example:
import pandas as pd
df = pd.DataFrame({'col1': ['horse', 'dog', 'snake', 'dog'], 'col2': ['dog', 'snake', 'dog', 'cow']})
print(df)
$ python run.py
    col1   col2
0  horse    dog
1    dog  snake
2  snake    dog
3    dog    cow
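I want to replace n = 3 occurrences of dog in df with BEAR, counting across all columns and rows.
Desired output:
$ python run.py
    col1   col2
0  horse   BEAR
1   BEAR  snake
2  snake    dog
3   BEAR    cow
What is the best way to achieve this? I would like to avoid iterating over every cell of df.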
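To make the comparison concrete, here is a minimal sketch of the behaviour being asked for. In Python 3 the count-limited replacement is the str.replace method (the string.replace function mentioned above is the Python 2 spelling). The sample text is made up for illustration.

```python
import pandas as pd

# Plain strings: the optional third argument caps the number of replacements.
text = "dog cat dog dog dog"
limited = text.replace("dog", "BEAR", 3)
print(limited)  # BEAR cat BEAR BEAR dog

# DataFrame.replace is called here without any count: every match changes.
df = pd.DataFrame({'col1': ['horse', 'dog', 'snake', 'dog'],
                   'col2': ['dog', 'snake', 'dog', 'cow']})
everything = df.replace("dog", "BEAR")
print(everything)
```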
Answer 0 (score: 4)
One way is to unstack, then mask, then unstack again:
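n = 3
s = df.unstack()  # flatten column by column into a Series
c = s.eq("dog").groupby(s).cumsum()  # running count of "dog" cells; 0 for every other value
s.mask(c <= n, s.replace("dog", "BEAR")).unstack(0)  # replace the first n, then restore the shape
Another option, using numpy:
import numpy as np
n = 3
arr = np.cumsum(np.ravel(df.eq("dog").to_numpy(), 'F')).reshape(df.shape, order='F')
df[:] = np.where(arr <= n, df.replace("dog", "BEAR"), df)  # modifies df in place
print(df)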
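If the order in which occurrences are counted matters, the numpy idea above can be wrapped in a small helper. This is only a sketch: the name replace_first_n and its order parameter are made up for illustration and are not part of pandas. With order="F" it counts down each column in turn, as the numpy code above does; with order="C" it counts across each row instead.

```python
import numpy as np
import pandas as pd


def replace_first_n(df, old, new, n, order="F"):
    """Return a copy of df with only the first n occurrences of old replaced."""
    hits = df.eq(old).to_numpy()
    # Running count of matches in the chosen traversal order.
    running = np.cumsum(hits.ravel(order=order)).reshape(hits.shape, order=order)
    # Touch only cells that match and fall within the first n matches.
    return df.mask(hits & (running <= n), new)


df = pd.DataFrame({'col1': ['horse', 'dog', 'snake', 'dog'],
                   'col2': ['dog', 'snake', 'dog', 'cow']})
by_column = replace_first_n(df, "dog", "BEAR", n=3)
by_row = replace_first_n(df, "dog", "BEAR", n=3, order="C")
print(by_column)
print(by_row)
```

With the frame from the question, by_column matches the desired output, while by_row leaves the last dog in col1 untouched and replaces both dogs in col2. The original df is not modified in either case.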
Answer 1 (score: 3)
Use DataFrame.mask and DataFrame.fillna with the parameter limit=3, which fills only the first three NaN values:
df.mask(df.eq('dog')).unstack().fillna('BEAR', limit=3).fillna('dog').unstack(level=0)
    col1   col2
0  horse   BEAR
1   BEAR  snake
2  snake    dog
3   BEAR    cow
Or, as a more general function with parameters:
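def replace_n(data, to_replace, new, n):
    # Turn every match into NaN so that fillna can count them.
    data = data.mask(data.eq(to_replace))
    # Fill only the first n NaN values, going column by column.
    data = data.unstack().fillna(new, limit=n)
    # Restore the remaining matches and the original shape.
    data = data.fillna(to_replace).unstack(level=0)
    return data
replace_n(df, 'dog', 'BEAR', n=3)
    col1   col2
0  horse   BEAR
1   BEAR  snake
2  snake    dog
3   BEAR    cow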
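One thing to watch for with this approach, shown here as a small sketch: because matches are temporarily turned into NaN, values that were already missing cannot be told apart from matches. The frame df_nan below is a made-up example, not taken from the question.

```python
import numpy as np
import pandas as pd


def replace_n(data, to_replace, new, n):
    # Same steps as the function above.
    data = data.mask(data.eq(to_replace))
    data = data.unstack().fillna(new, limit=n)
    return data.fillna(to_replace).unstack(level=0)


# Hypothetical frame with one genuinely missing value in column "a".
df_nan = pd.DataFrame({"a": [np.nan, "dog"], "b": ["dog", "cat"]})
result = replace_n(df_nan, "dog", "BEAR", n=1)
print(result)
```

Here the single replacement is spent on the cell that was originally NaN, and both real dog cells are left as they were. If the data can contain missing values, a mask built from a running count of matches avoids this.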
Answer 2 (score: 0)
You can use this loop, which edits the dictionary before the DataFrame is built:
import pandas as pd
d = {'col1': ['horse', 'dog', 'snake', 'dog'], 'col2': ['dog', 'snake', 'dog', 'cow']}
n = 3
for k in d.keys():
    for i, s in enumerate(d[k]):
        if s == 'dog' and n > 0:
            d[k][i] = 'BEAR'  # overwrite in place instead of pop + insert
            n -= 1
df = pd.DataFrame(d)
print(df)
Output:
    col1   col2
0  horse   BEAR
1   BEAR  snake
2  snake    dog
3   BEAR    cow