I have lines like these in a file:
keyword = NORTH FACE
keyword = GUESS
keyword = DRESSES
keyword = RALPH LAUREN
My code is:
keyword = re.findall(r'ke\w+ = \S+', s)
It only prints:
NORTH GUESS DRESSES RALPH
But I need the regex to handle and print:
NORTH FACE GUESS DRESSES RALPH LAUREN
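Answer 0 (score: 3)
Your regex only consumes non-whitespace characters (\S). That is why it stops matching as soon as it reaches a space.
Replace \S+ with .* instead. It greedily matches every character except a newline (\n), so the match runs to the end of the line.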
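To make the difference concrete, here is a minimal sketch that runs both patterns over the same input. The sample string s is an assumption: the question's entries, one per line.

```python
import re

# Assumed sample input: the question's entries, one per line.
s = "keyword = NORTH FACE\nkeyword = GUESS\nkeyword = DRESSES\nkeyword = RALPH LAUREN\n"

# \S+ stops at the first whitespace character inside the value.
short = re.findall(r'ke\w+ = (\S+)', s)

# .* runs to the end of the line, because . does not match \n by default.
full = re.findall(r'ke\w+ = (.*)', s)

print(short)  # ['NORTH', 'GUESS', 'DRESSES', 'RALPH']
print(full)   # ['NORTH FACE', 'GUESS', 'DRESSES', 'RALPH LAUREN']
```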
Answer 1 (score: 1)
Try this:
re.findall(r'ke\w+ = .+$', s)
Or this, which captures only what comes after the equals sign:
re.findall(r'ke\w+ = (.+)$', s)
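One caveat worth checking: by default $ matches only at the very end of the string (or just before a trailing newline), so these patterns behave as intended when s holds a single line. If s holds the whole file, passing re.MULTILINE is probably what you want. A small sketch, assuming a two-line sample string:

```python
import re

# Assumed two-line sample; the real input may differ.
s = "keyword = NORTH FACE\nkeyword = RALPH LAUREN"

# Without re.MULTILINE, $ matches only at the end of the whole string,
# so only the last line is found.
last_only = re.findall(r'ke\w+ = (.+)$', s)

# With re.MULTILINE, $ also matches just before each newline.
every_line = re.findall(r'ke\w+ = (.+)$', s, flags=re.MULTILINE)

print(last_only)   # ['RALPH LAUREN']
print(every_line)  # ['NORTH FACE', 'RALPH LAUREN']
```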
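Answer 2 (score: 1)
You need keyword=re.findall(r'ke\w+ = \S.*',s) instead of keyword=re.findall(r'ke\w+ = \S+',s). Here \S.* requires the value to start with a non-whitespace character and then matches the rest of the line.
Also, I am not sure whether it fits your requirements, but going by your example you could also use re.split: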
>>> s = 'keyword = NORTH FACE'
>>> re.split(' = ', s)
['keyword', 'NORTH FACE']
>>>
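If the spacing around the equals sign is not always exactly one space, or the value itself may contain an equals sign, a slightly more tolerant variant of the same idea might look like the sketch below. The helper name split_key_value is hypothetical, and the second sample line is made up for illustration.

```python
import re

def split_key_value(line):
    """Split 'key = value' on the first '=' only, ignoring spaces around it.

    Hypothetical helper; raises ValueError if the line contains no '='.
    """
    key, value = re.split(r'\s*=\s*', line.strip(), maxsplit=1)
    return key, value

print(split_key_value('keyword = NORTH FACE'))  # ('keyword', 'NORTH FACE')
print(split_key_value('keyword=A = B'))         # ('keyword', 'A = B')
```

Limiting the split with maxsplit=1 keeps any later equals signs inside the value.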
Answer 3 (score: 1)
lines = '''\
keyword = NORTH FACE
keyword = GUESS
keyword = DRESSES
keyword = RALPH LAUREN
'''.splitlines()
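for line in lines:
    print(line.partition(' = ')[2])  # text after the first ' = '
print()  # blank line between the two runs
for line in lines:
    print(line.split(' = ')[1])  # same result using split
Output:
NORTH FACE
GUESS
DRESSES
RALPH LAUREN

NORTH FACE
GUESS
DRESSES
RALPH LAUREN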
Given the new information in the comments, and guessing at the data file format (please update the question with a REAL example!):
import re
data = '''\
keyword = NORTH FACE
score = 88466
normalizedKeyword = NORTH FACE
keyword = DRESSES
score = 79379
normalizedKeyword = DRESSES
'''
# Capture each keyword together with the score on the line right after it.
L = re.findall(r'keyword = (.*)\nscore = (.*)\n', data)
for i in L:
    print(','.join(i))
NORTH FACE,88466
DRESSES,79379
Answer 4 (score: 0)
Try:
>>> s="""
... keyword = NORTH FACE
... keyword = GUESS
... keyword = DRESSES
... keyword = RALPH LAUREN
... """
>>> re.findall(r'ke\w+ = .*',s)
['keyword = NORTH FACE', 'keyword = GUESS', 'keyword = DRESSES', 'keyword = RALPH LAUREN']
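Answer 5 (score: 0)
Not sure if this is what you are looking for...
Going by one of your comments, if you want the values from adjacent paired lines (keyword followed by score), and those pairs may be surrounded by unpaired lines, you have to do a bit more work.
Expanded regex: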
(?:^|\n) [^\S\n]*
(?:keyword) [^\S\n]* = [^\S\n]* (\w(?:[^\S\n]*\w+)*) [^\S\n]* \n
\s*
(?:score) [^\S\n]* = [^\S\n]* (\w(?:[^\S\n]*\w+)*) [^\S\n]*
(?=\n|$)
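The expanded layout above relies on free-spacing mode. In Python that would be re.VERBOSE, which ignores the whitespace and line breaks in the pattern. A minimal sketch follows; the name PAIR is hypothetical, and the sample data is assumed from the format guessed at in Answer 3.

```python
import re

# The pattern above, unchanged. re.VERBOSE makes Python ignore the
# whitespace and line breaks that are only there for layout.
PAIR = re.compile(r'''
    (?:^|\n) [^\S\n]*
    (?:keyword) [^\S\n]* = [^\S\n]* (\w(?:[^\S\n]*\w+)*) [^\S\n]* \n
    \s*
    (?:score) [^\S\n]* = [^\S\n]* (\w(?:[^\S\n]*\w+)*) [^\S\n]*
    (?=\n|$)
''', re.VERBOSE)

# Assumed sample data; the real file format is not shown in the question.
data = '''\
keyword = NORTH FACE
score = 88466
normalizedKeyword = NORTH FACE
keyword = DRESSES
score = 79379
normalizedKeyword = DRESSES
'''

pairs = PAIR.findall(data)
print(pairs)  # [('NORTH FACE', '88466'), ('DRESSES', '79379')]
```

Note that [^\S\n] means "whitespace other than a newline", which is what keeps each half of the pair on its own line.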
Answer 6 (score: 0)
import re
D = "keyword = RALPH LAUREN"
m = re.search(r'(?<== )(\w+\s*)*', D) # search for anything after '= '
m.group(0)
'RALPH LAUREN'
C = "keyword = GUESS"
m = re.search(r'(?<== )(\w+\s*)*', C) # search again on the new string
m.group(0)
'GUESS'
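Answer 7 (score: -1)
import re
with open('keywords.txt') as fh:  # hypothetical file name
    msg = fh.read()
output = re.findall(r'keyword = (.*)', msg)  # space after '=' stays out of the captured value
print(output)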