Question

I wrote this code to search for the exact word (%PDF-1.1) in a text:

import re
x = "%PDF-1.1 pdf file contains four parts one of them the header part which looks like "
s = re.compile(r"%PDF-\d\.\d[\b\s]")
match = s.search(x)
if match:
    print(match.group())
else:
    print("its not found")

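The problem is that when the text starts with "%PDF-1.1" it returns the result %PDF-1.1, but it goes wrong when x = "pdf file contains four parts one of them the header part which looks like %PDF-1.1": in that case it gives me nothing.

How can I search for the exact word?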

Answer 1

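Currently, you are searching for the word "%PDF-X.X" (where X is a digit) followed by one more character, without caring about what comes before it. If you only want to find this word at the beginning or end of the string, or when it stands alone as a word (I assume that means whitespace before and after it), you can try this: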

import re
x = "%PDF-1.1 pdf file contains four parts one of them the header part which looks like "
y = "pdf file contains four parts one of them the header part which looks like %PDF-1.1"
# Require start-of-string or whitespace before the word,
# and end-of-string or whitespace after it.
s = re.compile(r"(^|\s)(?P<myword>%PDF-\d\.\d)($|\s)")
match = s.search(x)
if match:
    print(match.group("myword"))
else:
    print("its not found")

match = s.search(y)
if match:
    print(match.group("myword"))
else:
    print("its not found")

# Output:
# %PDF-1.1
# %PDF-1.1
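
To see why the pattern from the question misses the second string, it may help to note that inside a character class `\b` means the backspace character rather than a word boundary, so `[\b\s]` always has to consume one real character. A minimal sketch of that behaviour follows; the variable names and the shortened sample strings are illustrative only:

```python
import re

# Pattern from the question: inside [...] \b means backspace (\x08),
# not a word boundary, so the class must consume one actual character.
question_pattern = re.compile(r"%PDF-\d\.\d[\b\s]")

at_start = "%PDF-1.1 pdf file contains four parts"
at_end = "the header part which looks like %PDF-1.1"

start_match = question_pattern.search(at_start)
end_match = question_pattern.search(at_end)

# The match at the start includes the trailing space that the class consumed.
print(repr(start_match.group()))  # '%PDF-1.1 '
# At the end of the string there is no character left to consume.
print(end_match)  # None
```

This is why the pattern above uses the alternation `($|\s)`: it accepts either the end of the string or a whitespace character after the word.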

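If you also want to find the word when it is followed by a symbol, you can do something like this, which accepts anything that is not a letter or a digit after the word: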

s = re.compile(r"(^|\s)(?P<myword>%PDF-\d\.\d)($|\s|[^a-zA-Z0-9])")
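
A quick check of this extended pattern, assuming that a trailing comma should end the word while an extra digit should not. The sample strings are made up for illustration, and the lookaround variant at the end is an alternative sketch that is not part of the answer:

```python
import re

extended = re.compile(r"(^|\s)(?P<myword>%PDF-\d\.\d)($|\s|[^a-zA-Z0-9])")

samples = [
    "the header looks like %PDF-1.1, followed by the body",  # trailing comma
    "the header looks like %PDF-1.1",                        # end of string
    "the header looks like %PDF-1.15",                       # extra digit
]

results = []
for text in samples:
    match = extended.search(text)
    results.append(match.group("myword") if match else None)

print(results)  # ['%PDF-1.1', '%PDF-1.1', None]

# Alternative sketch (an assumption, not from the answer): lookarounds
# assert the boundaries without consuming them, so the whole match is
# exactly the word and no named group is needed.
lookaround = re.compile(r"(?<!\S)%PDF-\d\.\d(?![a-zA-Z0-9])")

lookaround_results = []
for text in samples:
    match = lookaround.search(text)
    lookaround_results.append(match.group() if match else None)

print(lookaround_results)  # ['%PDF-1.1', '%PDF-1.1', None]
```

With the lookaround form, `match.group()` never includes the surrounding space or symbol, which can be convenient when the matched text is used directly.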
