Question

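This code outputs a matching string once for every occurrence in the searched file (so if a string is repeated, I end up with a huge list). I only want to know whether the strings in my list match, not how many times they match. I do want to know which strings matched, so a true/false solution won't work, but I want each one listed only once if it matches. I also don't really understand what the pattern = '|'.join(keywords) part is doing. I took it from someone else's code to get my file-to-file matching working, but I don't know whether I need it. Any help is greatly appreciated.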

# imports the necessary libraries
import os, time, re, smtplib
from stat import * # ST_SIZE etc

# declares the files used
filenames = ['//Katie/Users/kitka/Documents/appreport.txt', '//Dallin/Users/dallin/Documents/appreport.txt' ,
             '//Aidan/Users/aidan/Documents/appreport.txt']

# parses each file
for filename in filenames:

    # finds the time the file was last modified and error checks
    try:
        st = os.stat(filename)
    except IOError:
        print("failed to get information about", filename)
    else:
        # creates a list of words to search for
        keywords = ['LoL', 'javaw']
        pattern = '|'.join(keywords)

        # searches the file for the strings in the list, sorts them and returns results
        results = []
        with open(filename, 'r') as f:
            for line in f:
                matches = re.findall(pattern, line)
                if matches:
                    results.append((line, len(matches)))

        results = sorted(results)

        # appends results to the archive file
        with open("GameReport.txt", "a") as f:
            for line in results:
                f.write(filename + '\n')
                f.write(time.asctime(time.localtime(st[ST_MTIME])) + '\n')
                f.write(str(line)+ '\n')

Answer 1

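Untested, but this should work. Note that this only tracks which words were found, not which words were found in which files. I couldn't work out whether that is what you want.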

import fileinput

filenames = [...]

keywords = ['LoL', 'javaw']

# a set is like a list but with no duplicates, so even if a keyword
# is found multiple times, it will only appear once in the set
found = set()

# iterate over the lines of all the files
for line in fileinput.input(files=filenames):
    for keyword in keywords:
        if keyword in line:
            found.add(keyword)

print(found)
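
On the side question about pattern = '|'.join(keywords): the sketch below shows what that line produces and why re.findall returns repeats. The sample line is made up for illustration, and the re.escape step is an extra precaution that is not in the original code; it only matters if a keyword contains regex metacharacters such as "." or "+".

```python
import re

keywords = ['LoL', 'javaw']

# '|'.join(...) glues the keywords together with "|", the regex "or" operator
pattern = '|'.join(keywords)
print(pattern)  # LoL|javaw

# Optional precaution (not in the original code): escape each keyword so
# that characters like "." are matched literally
safe_pattern = '|'.join(re.escape(k) for k in keywords)

# Hypothetical sample line, invented for this example
line = 'javaw.exe started, LoL running, javaw.exe again'

# findall returns one entry per occurrence, so repeats show up here
matches = re.findall(safe_pattern, line)

# putting the matches in a set keeps each keyword only once
unique = set(matches)
print(unique)
```

So the pattern is only needed if you want to search with a regex. With plain "keyword in line" checks, as in the code above, you can drop it.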

Edit

If you want to track which keywords are present in which files, then I suggest keeping a set of (filename, keyword) tuples:

filenames = [...]
keywords = ['LoL', 'javaw']
found = set()

for filename in filenames:
    with open(filename, 'rt') as f:
        for line in f:
            for keyword in keywords:
                if keyword in line:
                    found.add((filename, keyword))

for filename, keyword in found:
    print('Found the word "{}" in the file "{}"'.format(keyword, filename))
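
If you also want to stop testing a keyword once it has been seen, one possible approach is to keep a set of keywords still to be found and shrink it as you go. This is only a sketch: the function name find_keywords and the sample lines are made up for illustration, and it is checked against an in-memory list rather than your real files.

```python
def find_keywords(lines, keywords):
    """Return the set of keywords that occur in at least one line."""
    remaining = set(keywords)  # keywords not seen yet
    found = set()
    for line in lines:
        # only test the keywords that have not been found so far
        hits = {k for k in remaining if k in line}
        found |= hits
        remaining -= hits
        if not remaining:
            # every keyword has been seen, no need to read further
            break
    return found


# Hypothetical sample data standing in for the lines of one report file
sample = ['javaw.exe', 'LoL client', 'javaw.exe', 'notepad.exe']
print(sorted(find_keywords(sample, ['LoL', 'javaw'])))
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list.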

Python: if a string is found, stop searching for that string, search for the next string, and output the matched strings
