I have a file data.txt:
<tag,1>moon sunlightcream potato</tag>
<tag,2>dishes light jellybeans</tag>
and a Python file match.py:
for LINE in open("data.txt"):
    STRING = "light"
    if STRING in LINE:
        print(LINE)
The output is:
<tag,1>moon sunlightcream potato</tag>
<tag,2>dishes light jellybeans</tag>
I only want:
dishes light jellybeans
How can I do that?
The larger context is:
TAG = QUERY.split("&")[1]
LIST = []
for LINE in open(DATA):
    STRING = "<tag,"
    if STRING in LINE:
        if TAG in LINE:
            print(LINE)
So I can't just hard-code " light " with spaces around it, because "light" actually comes from a variable.
The regular-expression attempt was:
import re
def sub_list():
    TAG = "light"
    p_number = re.compile(r'<tag,.*,' + TAG + ',.*,>')
    for LINE in open(DATA):
        match = p_number.findall(LINE)
        if match:
            print(LINE)
But that did not help either.
It does work now, though, with:
import re
TAG = "light"
for LINE in open(DATA):
    STRING = "<tag,"
    if STRING in LINE:
        if re.search(r'\b{}\b'.format(TAG), LINE):
            print(LINE)
Answer 0 (score: 6)
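You can use the regular expression below. \b matches a word boundary, that is, only the position at the start or end of a word, so it will not match light when it is merely a substring of a longer word.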
import re
LINES = ['moon sunlightcream potato', 'dishes light jellybeans']
match_tag = 'light'
for LINE in LINES:
    # you could also use re.search(r'\b' + match_tag + r'\b', LINE)
    if re.search(r'\b{}\b'.format(match_tag), LINE):
        print(LINE)
# prints only 'dishes light jellybeans'
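Since the word to match comes from a variable, it may be worth escaping it before building the pattern. The following is only a sketch under a few assumptions: the helper name lines_with_word and the TAG_LINE pattern are made up for illustration, and the line format <tag,N>...</tag> is inferred from the sample data.txt in the question. It applies the same \b idea to the tagged lines and returns just the text inside the tag.

```python
import re

# Assumed line format, inferred from the sample data.txt in the question.
TAG_LINE = re.compile(r'<tag,\d+>(.*?)</tag>')

def lines_with_word(lines, word):
    """Return the inner text of each tagged line containing `word` as a whole word."""
    # re.escape keeps characters such as '.' or '+' in `word` literal.
    word_pattern = re.compile(r'\b{}\b'.format(re.escape(word)))
    matches = []
    for line in lines:
        tagged = TAG_LINE.search(line)
        if tagged and word_pattern.search(tagged.group(1)):
            matches.append(tagged.group(1))
    return matches

sample = [
    '<tag,1>moon sunlightcream potato</tag>',
    '<tag,2>dishes light jellybeans</tag>',
]
print(lines_with_word(sample, 'light'))
```

Note that \b only marks a boundary next to a word character (letter, digit or underscore), so a search word that begins or ends with punctuation would probably need a different boundary test.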