re.findall: search for variables that start with a pattern

Date: 2017-09-14 23:46:08

Tags: regex

I open a file and try to count how many times certain variables occur in it.

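import re

ATTRS = ['test1', 'test2', 'test3']
with open('_file_name', 'r') as fh:
    contents = fh.read()  # read the whole file into one string
for attr in ATTRS:
    # counts matches anywhere in the file, not only at line starts
    count = len(re.findall(attr, contents))
    print(count)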

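The code seems to work and finds the matching strings anywhere in the file. However, I want to count only the occurrences at the beginning of a line.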

1 answer:

Answer 0 (score: 0)

See if this works for you. It is a simple piece of Python 3 code.

import re

def counter(attrs):
    with open('dummy.txt', 'r') as f:
        contents = f.read()
    for attr in attrs:
        count = 0
        # optional indentation, then attr as a whole word; re.escape keeps attr literal
        rp = re.compile(r'^\s*' + re.escape(attr) + r'\b')
        for line in contents.split('\n'):
            if rp.match(line) is not None:
                count += 1
        print(count)

Consider a dummy file like this:

hello 67989 hello
hello 67989 hello hello 67989 hello
    hello 67989 hello
    hello 67989 hello
pssss hello

Testing the code with attrs = ['s', 'hello', 'pssss']:

In [12]: attrs = ['s', 'hello', 'pssss']
In [13]: counter(attrs)
       0
       4
       1

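This code counts the lines whose first word is the attribute, allowing leading indentation. If you want the match to be strictly at the start of the line, remove \s* from the regular expression.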
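As an alternative to splitting the text into lines, the same count can probably be obtained with re.findall and the re.MULTILINE flag, which makes ^ match at the start of every line. The following is a minimal sketch, not the code from the answer: the function name count_at_line_start and the allow_indent parameter are made up for illustration, and [ \t]* is used instead of \s* so that the indentation match cannot run across a newline.

```python
import re

def count_at_line_start(text, attr, allow_indent=True):
    """Count lines in text whose first word is attr (hypothetical helper)."""
    # [ \t]* allows spaces/tabs before attr; an empty string makes the match strict.
    indent = r'[ \t]*' if allow_indent else ''
    # re.MULTILINE lets ^ match at the start of each line, not just the string.
    pattern = re.compile('^' + indent + re.escape(attr) + r'\b', re.MULTILINE)
    return len(pattern.findall(text))

# Same content as the dummy file shown above.
dummy = (
    "hello 67989 hello\n"
    "hello 67989 hello hello 67989 hello\n"
    "    hello 67989 hello\n"
    "    hello 67989 hello\n"
    "pssss hello\n"
)

for attr in ['s', 'hello', 'pssss']:
    print(attr, count_at_line_start(dummy, attr),
          count_at_line_start(dummy, attr, allow_indent=False))
```

With allow_indent=False the two indented hello lines are no longer counted, which corresponds to removing \s* from the original pattern.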

Explanation:

^    -> Start of the string (each line is matched separately)
\s*  -> Zero or more whitespace characters (including tabs)
attr -> Dynamic attribute such as `hello`
\b   -> Word boundary (written `\\b` in a non-raw string literal), so that searching for `hello` does not match `hellohello`
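
To see what each part of the pattern contributes, here is a small sketch that applies it to a few sample lines. The sample strings are invented for illustration. Note that `_` and digits count as word characters, so `hello_1` is not treated as the word `hello`.

```python
import re

# Same pattern shape as in the answer, built for the attribute 'hello'.
rp = re.compile(r'^\s*' + re.escape('hello') + r'\b')

samples = ['hello world', '    hello', 'hellohello', 'hello_1', 'say hello']

# True where the line starts (after optional whitespace) with the whole word.
results = {s: rp.match(s) is not None for s in samples}
for s, ok in results.items():
    print(repr(s), ok)
```

Only the first two samples match: the third and fourth fail the word-boundary check, and the last fails because `hello` is not the first word on the line.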