Python regex - searching for a pattern in a file

Time: 2018-05-11 06:09:50

Tags: python regex

  

Write a program that outputs only the words in /usr/share/dict/words that begin with the letters "ply". It should output the words in order, each word on its own line.

import re

with open('words.txt', 'r') as words:

  pattern = re.compile(r'^ply.*')

  matches = pattern.match(words)

  for match in matches:
    print(match)

What am I doing wrong?

2 answers:

Answer 0: (score: 0)

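match() only succeeds if the RE matches at the beginning of the string, so you don't need the extra '^' in your RE. Assuming the words file is in the same folder as your code, the code below should work.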

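import re

# match() already anchors at the start of the string, so no '^' is needed
pattern = re.compile(r'ply.*')
with open('words.txt', 'r') as lines:
    for line in lines:
        if pattern.match(line):
            # each line keeps its trailing newline; strip it to avoid blank lines in the output
            print(line.rstrip('\n'))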
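To make the point about match() concrete, here is a minimal sketch comparing match() and search() on single words. The helper name starts_with_ply is hypothetical and only for illustration. It also hints at why the code in the question fails: pattern.match(words) is handed the file object rather than a string, which re rejects with a TypeError, so the pattern has to be applied to each line instead.

```python
import re

pattern = re.compile(r'ply.*')

def starts_with_ply(word):
    """Return True if word begins with 'ply' (case-sensitive). Hypothetical helper."""
    # match() only tries the pattern at position 0 of the string
    return pattern.match(word) is not None

print(starts_with_ply('plywood'))            # True
print(starts_with_ply('apply'))              # False: 'ply' is not at the start
# search() scans the whole string, so it finds 'ply' inside 'apply'
print(pattern.search('apply') is not None)   # True
```

With search(), the '^' anchor from the question would be needed to get the same behaviour as match().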

Answer 1: (score: -1)

If case does not matter, you can use the following regex:

(?i)\bply[^\s]*

It gives the following output on my distro's words file:

Plymouth
Plymouth's
ply
ply's
plying
plywood
plywood's

If case matters, use:

\bply[^\s]*

It gives the following output on my distro's words file:

ply
ply's
plying
plywood
plywood's
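As a rough illustration of the two patterns, the sketch below runs both against a small in-memory list. The list sample and the helper ply_words are assumptions made up for this example, not part of the words file or of the answer; re.IGNORECASE is used as the flag equivalent of the inline (?i) prefix.

```python
import re

# Hypothetical sample standing in for a few lines of the words file
sample = ["Plymouth", "apply", "ply", "ply's", "plywood", "reply"]

# re.IGNORECASE has the same effect as the inline (?i) prefix
insensitive = re.compile(r'\bply[^\s]*', re.IGNORECASE)
sensitive = re.compile(r'\bply[^\s]*')

def ply_words(pattern, words):
    """Collect every match of pattern across the given words (hypothetical helper)."""
    return [m for w in words for m in pattern.findall(w)]

print(ply_words(insensitive, sample))  # ['Plymouth', 'ply', "ply's", 'plywood']
print(ply_words(sensitive, sample))    # ['ply', "ply's", 'plywood']
```

One thing to keep in mind: \b is a word boundary, not a start-of-line anchor, so it also matches after a hyphen. An entry such as multi-ply would yield ply with either pattern, which may or may not be what you want.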

DISTRO:

Distributor ID: Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:        16.04
Codename:       xenial

Results may differ on your system.

You can add this regex to your Python code to get the following working example:

$ more plywords.py
import re

# open the file with its full path, in read-only mode
with open('/usr/share/dict/words', 'r') as file:
    # define the regex used to analyse the text
    pattern = re.compile(r'(?i)\bply[^\s]*')
    # for each line of the file, fetch all the matches and print each one
    for line in file:
        for m in pattern.findall(line):
            print(m)

And run it:

$ python plywords.py 
Plymouth
Plymouth's
ply
ply's
plying
plywood
plywood's