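I have a DataFrame with a Comments column, and I use a regular expression to remove the digits from it. I just want to count how many rows were changed by this pattern, i.e. find out how many rows str.replace actually operated on.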
df['Comments'] = df['Comments'].str.replace(r'\d+', '', regex=True)
The output should look something like this:
Operated on 10 rows
Answer 0 (score: 0)
See if this helps:
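import re
op_regex = re.compile(r"\d+")
# Run this before the str.replace call, while the digits are still in the column.
# op_count holds the number of digit runs found in each comment.
df['op_count'] = df['Comments'].apply(lambda x: len(op_regex.findall(x)))
print(f"Operated on {len(df[df['op_count'] > 0])} rows")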
findall returns a list of the matched strings, so any row with a non-empty list is a row the replacement would change.
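If you would rather not go through apply, the same count can probably be obtained with the vectorized string methods. This is only a sketch: the sample DataFrame below is made up for illustration, and it assumes a pandas version where str.replace accepts the regex keyword.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame.
df = pd.DataFrame(
    {"Comments": ["call 911 now", "no digits here", "room 42, floor 7"]}
)

# True for every row containing at least one run of digits.
# na=False treats missing comments as "no match".
has_digits = df["Comments"].str.contains(r"\d+", regex=True, na=False)
changed_rows = int(has_digits.sum())

# Do the replacement itself. regex=True is passed explicitly because
# pandas 2.0+ treats the pattern as a literal string by default.
df["Comments"] = df["Comments"].str.replace(r"\d+", "", regex=True)

print(f"Operated on {changed_rows} rows")
```

The mask has to be computed before the replacement, since afterwards there are no digits left to detect.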
Answer 1 (score: 0)
The re.subn() method returns a tuple containing the new string and the number of substitutions made.
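To make that return value concrete, here is a minimal sketch on a single string (the sample text is invented for illustration):

```python
import re

# Hypothetical sample string, used only for illustration.
line = "you can make comments in line 200 and 300"

# subn returns a 2-tuple: (string after substitution, number of substitutions).
new_line, n_subs = re.subn(r"\d+", "", line)

print(repr(new_line))
print(n_subs)
```

Here n_subs counts matches, not rows, which is why the code further down only checks whether it is greater than zero.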
Example: text.txt contains the following.
No coments in the line 245
you can make colmments in line 200 and 300
Creating a list of lists with regular expressions in python ...Oct 28, 2018
re.sub on lists - python
Sample code:
import re

count = 0
with open('text.txt') as f:
    for line in f:
        # subn returns (new_string, number_of_substitutions)
        if re.subn(r'\d+', "", line)[1] > 0:
            count += 1
print("operated on {} rows".format(count))
For pandas:
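import re
import pandas as pd

# Build a DataFrame with one row per line of the file
with open('text.txt') as f:
    data = pd.DataFrame({'comments': f.read().splitlines()})
count = 0
for line in data['comments']:
    if re.subn(r'\d+', "", line)[1] > 0:
        count += 1
print("operated on {} rows".format(count))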
Output:
operated on 3 rows
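A loop-free variant, offered as a rough sketch rather than a definitive answer: do the replacement once and compare the result against the original column. The in-memory sample rows below are an assumption standing in for text.txt, and the comparison assumes the column has no missing values.

```python
import pandas as pd

# Hypothetical in-memory rows standing in for the contents of text.txt.
data = pd.DataFrame({"comments": [
    "No comments in the line 245",
    "you can make comments in line 200 and 300",
    "re.sub on lists - python",
]})

cleaned = data["comments"].str.replace(r"\d+", "", regex=True)

# A row counts as "operated on" only if its text actually changed.
# Assumes no missing values: NaN != NaN would be counted as a change.
count = int((cleaned != data["comments"]).sum())

data["comments"] = cleaned
print("operated on {} rows".format(count))
```

Comparing before and after also works for replacements other than digit removal, since it does not depend on the pattern itself.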