Python: once a string is found, stop searching for it, move on to the next string, and output the strings that matched

Time: 2017-09-26 05:13:36

Tags: string python-3.x

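This code outputs a matched string every time it occurs in the searched file (so if a string repeats, I end up with a huge list). I only want to know whether the strings in my list matched, not how many times they matched. I do want to know which strings matched, so a true/false solution won't work, but I only want each one listed once if it matches. I don't really understand what the pattern = '|'.join(keywords) part is doing; I took it from someone else's code to get my file-to-file matching working, but I don't know whether I need it. Any help is much appreciated.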

# imports the necessary libraries
import os, time, re, smtplib
from stat import * # ST_SIZE etc

# declares the files used
filenames = ['//Katie/Users/kitka/Documents/appreport.txt', '//Dallin/Users/dallin/Documents/appreport.txt' ,
             '//Aidan/Users/aidan/Documents/appreport.txt']

# parses each file
for filename in filenames:

    # finds the time the file was last modified and error checks
    try:
        st = os.stat(filename)
    except IOError:
        print("failed to get information about", filename)
    else:
        # creates a list of words to search for
        keywords = ['LoL', 'javaw']
        pattern = '|'.join(keywords)

        # searches the file for the strings in the list, sorts them and returns results
        results = []
        with open(filename, 'r') as f:
            for line in f:
                matches = re.findall(pattern, line)
                if matches:
                    results.append((line, len(matches)))

        results = sorted(results)

        # appends results to the archive file
        with open("GameReport.txt", "a") as f:
            for line in results:
                f.write(filename + '\n')
                f.write(time.asctime(time.localtime(st[ST_MTIME])) + '\n')
                f.write(str(line)+ '\n')

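1 Answer:

Answer 0 (score: 0)

Untested, but this should work. Note that it only tracks which words were found, not which words were found in which files. I couldn't tell whether that is what you want.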

import fileinput

filenames = [...]

keywords = ['LoL', 'javaw']

# a set is like a list but with no duplicates, so even if a keyword
# is found multiple times, it will only appear once in the set
found = set()

# iterate over the lines of all the files
for line in fileinput.input(files=filenames):
    for keyword in keywords:
        if keyword in line:
            found.add(keyword)

print(found)
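
The loop above still tests every keyword on every line, even after that keyword has been found. Here is a minimal sketch of stopping early, as the question title asks. `find_keywords` is a hypothetical helper, not part of the original answer, and it is assumed to receive any iterable of lines:

```python
def find_keywords(lines, keywords):
    """Return the set of keywords that occur in at least one line.

    Hypothetical helper for illustration only.
    """
    remaining = set(keywords)  # keywords not seen yet
    found = set()
    for line in lines:
        # only test keywords that have not been found so far
        hits = {kw for kw in remaining if kw in line}
        found |= hits
        remaining -= hits
        if not remaining:  # every keyword has been found: stop reading
            break
    return found


# made-up sample lines standing in for the contents of a report file
sample = ["LoL.exe started", "LoL.exe still running", "javaw.exe started", "notepad.exe"]
print(sorted(find_keywords(sample, ["LoL", "javaw"])))
```

An open file object can be passed as `lines`, since iterating over a file yields its lines one at a time.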

Edit

If you want to track which keywords appear in which files, I suggest keeping a set of (filename, keyword) tuples:

filenames = [...]
keywords = ['LoL', 'javaw']
found = set()

for filename in filenames:
    with open(filename, 'rt') as f:
        for line in f:
            for keyword in keywords:
                if keyword in line:
                    found.add((filename, keyword))

for filename, keyword in found:
    print('Found the word "{}" in the file "{}"'.format(keyword, filename))
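
As for the `pattern = '|'.join(keywords)` line in the question: `str.join` glues the keywords together with `|` between them, giving `'LoL|javaw'`. In a regular expression `|` means "either alternative", so `re.findall` returns every occurrence of any keyword. Below is a rough sketch of keeping the regex approach while listing each match only once. `unique_matches` is a hypothetical helper, and `re.escape` is an addition not present in the original code, used so that keywords containing special characters are matched literally:

```python
import re

keywords = ['LoL', 'javaw']
# join the (escaped) keywords into one alternation pattern
pattern = '|'.join(re.escape(kw) for kw in keywords)


def unique_matches(lines, pattern):
    """Return the set of distinct strings in `lines` that match `pattern`.

    Hypothetical helper for illustration only.
    """
    regex = re.compile(pattern)
    found = set()
    for line in lines:
        # findall returns a list of matched strings; the set drops repeats
        found.update(regex.findall(line))
    return found


# made-up sample lines standing in for the contents of a report file
sample = ["LoL LoL javaw", "LoL again"]
print(pattern)
print(sorted(unique_matches(sample, pattern)))
```

With only plain-substring keywords like these, the `keyword in line` test from the answer does the same job without `re`; the pattern is only needed if the keywords are meant to be regular expressions.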