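Using re.findall(), I managed to return multiple matches of a regex in a string. However, the object returned is a list of the matches within the string, which is not what I want.
What I want is to replace all matches with something else. I tried to do this with syntax similar to what you would use in re.sub: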
import json
import re
regex = re.compile('([a-zA-Z]\"[a-zA-Z])', re.S)
filepath = "C:\\Python27\\Customer Stuff\\Austin Tweets.txt"
f = open(filepath, 'r')
myfile = re.findall(regex, '([a-zA-Z]\%[a-zA-Z])', f.read())
print myfile
However, this produces the following error:
Traceback (most recent call last):
File "C:/Python27/Customer Stuff/Austin's Script.py", line 9, in <module>
myfile = re.findall(regex, '([a-zA-Z]\%[a-zA-Z])', f.read())
File "C:\Python27\lib\re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
File "C:\Python27\lib\re.py", line 229, in _compile
bypass_cache = flags & DEBUG
TypeError: unsupported operand type(s) for &: 'str' and 'int'
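Can anyone help me with the last bit of syntax I need in order to replace all matches with something else in the original Python object?
EDIT
Following the comments and answers received, here is my attempt to substitute one regex with another: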
import json
import re
regex = re.compile('([a-zA-Z]\"[a-zA-Z])', re.S)
regex2 = re.compile('([a-zA-Z]%[a-zA-Z])', re.S)
filepath = "C:\\Python27\\Customer Stuff\\Austin Tweets.txt"
f = open(filepath, 'r')
myfile = f.read()
myfile2 = re.sub(regex, regex2, myfile)
print myfile
This now produces the following error:
Traceback (most recent call last):
File "C:/Python27/Customer Stuff/Austin's Script.py", line 11, in <module>
myfile2 = re.sub(regex, regex2, myfile)
File "C:\Python27\lib\re.py", line 151, in sub
return _compile(pattern, flags).sub(repl, string, count)
File "C:\Python27\lib\re.py", line 273, in _subx
template = _compile_repl(template, pattern)
File "C:\Python27\lib\re.py", line 258, in _compile_repl
p = sre_parse.parse_template(repl, pattern)
File "C:\Python27\lib\sre_parse.py", line 706, in parse_template
s = Tokenizer(source)
File "C:\Python27\lib\sre_parse.py", line 181, in __init__
self.__next()
File "C:\Python27\lib\sre_parse.py", line 183, in __next
if self.index >= len(self.string):
TypeError: object of type '_sre.SRE_Pattern' has no len()
Answer 0 (score: 11)
import re
regex = re.compile('([a-zA-Z]\"[a-zA-Z])', re.S)
myfile = 'foo"s bar'
myfile2 = regex.sub(lambda m: m.group().replace('"',"%",1), myfile)
print(myfile2)
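For context on the two tracebacks in the question: the third positional argument of re.findall() is flags, so passing the file contents there triggers the first TypeError, and the replacement passed to re.sub() must be a string or a function, not a compiled pattern, which triggers the second. As a minimal sketch of the same substitution done with a replacement string and backreferences instead of a lambda (the two-group pattern and the sample text are assumptions for illustration, not taken from the question):

```python
import re

# Assumed variant of the pattern: capture the letter on each side of the quote
regex = re.compile(r'([a-zA-Z])"([a-zA-Z])')
sample = 'foo"s bar and bar"s foo'  # hypothetical sample text

# \1 and \2 put the captured letters back around the new '%' character
result = regex.sub(r'\1%\2', sample)
print(result)
```

With this sample, every letter-quote-letter sequence has its quote swapped for a percent sign while the surrounding letters are kept.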
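Answer 1 (score: 2)
Answer 2 (score: 1)
I find it clearer to use a function rather than a lambda for this kind of substitution. It makes it easy to perform any number of transformations on the matched text before replacing it: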
import re
def replace_double_quote(match):
    text = match.group()
    return text.replace('"', '%')
regex = re.compile('([a-zA-Z]\"[a-zA-Z])')
myfile = 'foo"s bar and bar"s foo'
regex.sub(replace_double_quote, myfile)
returns foo%s bar and bar%s foo. Note that it replaces all matches.
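If replacing every match is not wanted, the optional count argument of sub() limits how many substitutions are made. A small sketch, reusing the function and sample string from this answer (the variable names first_only and all_matches are just for illustration):

```python
import re

def replace_double_quote(match):
    # Swap the double quote inside the matched text for a percent sign
    return match.group().replace('"', '%')

regex = re.compile(r'([a-zA-Z]"[a-zA-Z])')
myfile = 'foo"s bar and bar"s foo'

# count=1 stops after the first match; the default (count=0) replaces all
first_only = regex.sub(replace_double_quote, myfile, count=1)
all_matches = regex.sub(replace_double_quote, myfile)
print(first_only)
print(all_matches)
```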