Question

I have a file, data.txt:

<tag,1>moon sunlightcream potato</tag>
<tag,2>dishes light jellybeans</tag>

and a Python file, match.py:

for LINE in open("data.txt"):
    STRING = "light"
    if STRING in LINE:
        print (LINE)

The output is:

<tag,1>moon sunlightcream potato</tag>

<tag,2>dishes light jellybeans</tag>

I only want:

dishes light jellybeans

How can I do this?

The larger context is:

TAG = QUERY.split("&")[1]
LIST = []
for LINE in open(DATA):
    STRING = "<tag,"
    if STRING in LINE:
        if TAG in LINE:
            print(LINE)

So I can't just hard-code " light " with spaces around it, because "light" is actually a variable.

My regex attempt is:

import re

def sub_list():
    TAG = "light"
    p_number = re.compile(r'<tag,.*,' + TAG + ',.*,>')
    for LINE in open(DATA):
        match = p_number.findall(LINE)
        if match:
            print(LINE)

But that doesn't help either.

But it now works with:

import re

TAG = "light"
for LINE in open(DATA):
    STRING = "<tag,"
    if STRING in LINE:
        if re.search(r'\b{}\b'.format(TAG), LINE):
            print (LINE)

Answer 1

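You can use the regular expression below. \b matches a word boundary, that is, the position at the start or end of a word, so light will not match when it is only a substring of a longer word such as sunlightcream.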

import re
LINES = ['moon sunlightcream potato', 'dishes light jellybeans']
match_tag = 'light'
for LINE in LINES:
  # you could also use re.search(r'\b' + match_tag + r'\b', LINE)
  if re.search(r'\b{}\b'.format(match_tag), LINE):
    print (LINE)
# prints only 'dishes light jellybeans'
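
The question also asks for the line without its surrounding tags, and the search word comes from a variable. A minimal sketch of one way to handle both, assuming every record has the form <tag,N>...</tag> on a single line (the helper name matching_lines and the wrapper pattern are illustrative, not from the original post):

```python
import re

def matching_lines(lines, tag):
    """Yield the text inside <tag,N>...</tag> when it contains `tag` as a whole word."""
    # re.escape keeps any regex metacharacters in the variable literal
    word = re.compile(r'\b{}\b'.format(re.escape(tag)))
    # Assumed record layout: <tag,NUMBER>content</tag>
    wrapper = re.compile(r'<tag,\d+>(.*?)</tag>')
    for line in lines:
        m = wrapper.search(line)
        # Search only the captured content, so the markup itself never matches
        if m and word.search(m.group(1)):
            yield m.group(1)

sample = [
    '<tag,1>moon sunlightcream potato</tag>\n',
    '<tag,2>dishes light jellybeans</tag>\n',
]
for text in matching_lines(sample, 'light'):
    print(text)
```

Because the captured group excludes the trailing newline, print does not emit the extra blank line seen in the original output. The same function should accept an open file object in place of sample, since iterating a file yields lines in the same way.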
