Question

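I am using the following regular expression to search for three different string formats at once. I also pass re.IGNORECASE so that both uppercase and lowercase strings match. However, when I run a search (for example, for 'locality'), I also get matches for strings such as 'localit', 'locali', 'local', and so on. I want to match only the exact word (for example, 'locality').

Also, if there are spaces between the characters of the string (for example, 'l ocal i ty'), I would like to ignore them. I have not found an re method that lets me do this. I tried re.ASCII, but I get an error: "... ascii is invalid!" Any help is appreciated.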

elif searchType =='2':
  print "  Directory to be searched: c:\Python27 "
  directory = os.path.join("c:\\","Python27")
  userstring = raw_input("Enter a string name to search: ")
  userStrHEX = userstring.encode('hex')
  userStrASCII = ' '.join(str(ord(char)) for char in userstring)
  regex = re.compile(r"(%s|%s|%s)" % (re.escape(userstring), re.escape(userStrHEX), re.escape(userStrASCII)), re.IGNORECASE)
  for root,dirname, files in os.walk(directory):
     for file in files:
         if file.endswith(".log") or file.endswith(".txt"):
            f=open(os.path.join(root, file))
            for line in f.readlines():
               #if userstring in line:
               if regex.search(line):       
                  print "file: " + os.path.join(root,file)           
                  break
            else:
               #print "String NOT Found!"
               break
            f.close()

Answer 1

The re module has no such flag, so either:

construct a regex with explicit whitespace matching after each character:

r'\s*'.join(c for c in userStrASCII)

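This works: myre.findall(line) finds 'l Oc al i ty'.
Or (if you only need to detect that the pattern matches, without doing anything further with the matched text), use string.translate(,deleteChars) to strip whitespace from the line before matching. For example, run line.translate(None, ' \t\n\r').lower() before trying to match. (Keep a copy of the unmodified line.)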
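
A minimal sketch of both options, written for Python 3 (the question's code is Python 2, where line.translate(None, ...) is available; the str.split/join call below is a stand-in for it). The names spaced_pattern and strip_whitespace are hypothetical helpers, not anything from re. Escaping each character with re.escape and the optional \b anchors are additions of mine: the anchors are one possible way to get the "exact word" behaviour the question asks for, and are not part of the approach above.

```python
import re

def spaced_pattern(word, whole_word=False):
    """Option 1: regex that tolerates whitespace between the characters of `word`."""
    # Escape each character so regex metacharacters in the input stay literal.
    body = r"\s*".join(re.escape(c) for c in word)
    if whole_word:
        # Assumption: word-boundary anchors are acceptable for "exact word" matching.
        body = r"\b" + body + r"\b"
    return re.compile(body, re.IGNORECASE)

def strip_whitespace(line):
    """Option 2: drop all whitespace and lowercase the line before matching."""
    return "".join(line.split()).lower()

line = "found l Oc al i ty here"
print(spaced_pattern("locality").findall(line))
print("locality" in strip_whitespace(line))
```

Note that option 2 also removes the spaces between words, so word-boundary checks no longer mean anything on the stripped line; if you need whole-word matching, option 1 on the original line is probably the easier route.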

Regex case-insensitive search does not match the exact word
