Question

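I came up with the code below, which finds a string in a line and copies that line to a new file. I want to replace Foo23 with something more dynamic (e.g. [0-9], etc.), but I can't get that, a variable, or a regular expression to work. It doesn't fail, but I don't get any results either. Help? Thanks.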

with open('C:/path/to/file/input.csv') as f:
    with open('C:/path/to/file/output.csv', "w") as f1:
        for line in f:
            if "Foo23" in line:
                f1.write(line)

Answer 1

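Based on your comment, you want to match a line whenever it contains three letters followed by two digits, e.g. foo12 and bar54. Use a regular expression!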

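import re
pattern = r'([a-zA-Z]{3}\d{2})\b'
with open('C:/path/to/file/input.csv') as f:
    with open('C:/path/to/file/output.csv', "w") as f1:
        for line in f:
            # re.search returns a match object (truthy) if the pattern occurs anywhere in the line
            if re.search(pattern, line):
                f1.write(line)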

This will match lines like 'some line foo12' and 'another foo54 line', but not lines like 'a third line foo' or 'something bar123'.
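As a quick check of that claim, here is a minimal sketch that runs the same pattern over the four sample lines mentioned above. The list name samples is only for illustration.

```python
import re

# Same pattern as in the answer: three letters, two digits, then a word boundary.
pattern = re.compile(r'([a-zA-Z]{3}\d{2})\b')

samples = [
    'some line foo12',
    'another foo54 line',
    'a third line foo',
    'something bar123',
]

# Keep only the lines in which the pattern occurs somewhere.
matching = [s for s in samples if pattern.search(s)]
print(matching)
```

'something bar123' is rejected because there is no word boundary between the 2 and the 3, so the \b after the two digits cannot match.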

Breaking it down:

pattern = re.compile(r'''
    (               # start capture group; not needed here, but handy if you want the actual match back
      [a-zA-Z]{3}   # any three letters in a row, any case
      \d{2}         # any two digits
    )               # end capture group
    \b              # word boundary (e.g. before whitespace, punctuation, or end of line)
''', re.VERBOSE)    # re.VERBOSE allows whitespace and comments inside the pattern

If all you really need is to write every match in the file to f1, you can use:

matches = re.findall(pattern, f.read())  # finds all matches in f
f1.write('\n'.join(matches))  # writes each match to a new line in f1
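To show what re.findall returns in this case, here is a small sketch that uses an in-memory string in place of f.read(). The sample text is made up for illustration. Because the pattern contains exactly one capture group, findall returns the captured strings rather than whole lines.

```python
import re

pattern = r'([a-zA-Z]{3}\d{2})\b'

# Stand-in for f.read(); the contents are illustrative only.
text = 'some line foo12\nanother foo54 line\na third line foo\n'

matches = re.findall(pattern, text)   # list of the captured substrings
output = '\n'.join(matches)           # one match per line, as written to f1
print(output)
```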

Answer 2

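Essentially, your question boils down to: "I want to determine whether a string matches pattern X and, if it does, output it to a file". The best way to do this is with a regular expression. In Python, the standard regex library is re. So,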

import re
matches = re.findall(r'([a-zA-Z]{3}\d{2})', line)

Combining this with the file IO operations, we get:

# read all lines of the input file into a list
with open('C:/path/to/file/input.csv', 'r') as f:
    data = list(f)

# keep only the lines that contain a match
data = [x for x in data if re.findall(r'([a-zA-Z]{3}\d{2})\b', x)]
with open('C:/path/to/file/output.csv', 'w') as f1:
    for line in data:
        f1.write(line)

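Note that I split your file IO operations apart to reduce nesting. I also moved the filtering out of your IO. In general, each part of your code should "do one thing", which makes it easier to test and maintain.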
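One way to take the "do one thing" idea a step further is to put the filtering in its own function, so it can be tested without touching any files. This is only a sketch: the names PATTERN, filter_lines and copy_matching_lines are hypothetical and do not come from the answers above.

```python
import re

# Hypothetical module-level pattern: three letters, two digits, word boundary.
PATTERN = re.compile(r'[a-zA-Z]{3}\d{2}\b')

def filter_lines(lines, pattern=PATTERN):
    """Return only the lines that contain a match for pattern."""
    return [line for line in lines if pattern.search(line)]

def copy_matching_lines(src_path, dst_path, pattern=PATTERN):
    """Read src_path, keep the matching lines, and write them to dst_path."""
    with open(src_path) as src:
        kept = filter_lines(src, pattern)
    with open(dst_path, 'w') as dst:
        dst.writelines(kept)
    return len(kept)  # number of lines written
```

With this split, filter_lines can be exercised on a plain list of strings, and copy_matching_lines only has to deal with opening and closing the files.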

Unable to replace a string with a variable
