Question

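I need to use the strings in one text file to search a second text file. Each time a string matches a line in the second file, I want to check that line for the word "word"; if it is there, I create a third text file containing a specific column from the second file. This is repeated for every string in the first text file.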

Example

Text file 1:

10.2.1.1
10.2.1.2
10.2.1.3

Text file 2:

IP=10.2.1.4 word=apple thing=car name=joe
IP=10.2.1.3 word=apple thing=car name=joe
IP=10.2.1.1 word=apple thing=car name=joe
IP=10.2.1.2 word=apple thing=car name=joe
IP=10.2.1.1 word=apple thing=car name=joe
IP=10.2.1.3 word=apple thing=car name=joe

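The result should be three separate text files (named after the strings in text file 1), each containing the third-column string, one per line:

Result: 10.2.1.3.txt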

thing=car
thing=car

etc.

My code so far is as follows:

with open(file_1) as list_file:
    for string in (line.strip() for line in list_file):
        if string in file_2:
            if "word" in file_2:            
                column2 = line.split()[2]
                x = open(line+".txt", "a")
                with x as new_file:
                    new_file.write(column2)

My question is: is this code the best way to do it? I feel there is an important "shortcut" that I am not aware of.

Final code, with help from Olafur Osvaldsson:

for line_1 in open(file_1):
    with open(line_1+'.txt', 'a') as my_file:
        for line_2 in open(file_2):
            line_2_split = line_2.split(' ')
            if "word" in line_2:
                if "word 2" in line_2:
                    my_file.write(line_2_split[2] + '\n')

Answer 1

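Here is an example; the input files are file1.txt and file2.txt. I cache the contents of file 1 in the dictionary 'files', mapping each entry to its output file handle, and close the handles after the main loop finishes.

In the main loop, I read each line of file2.txt, strip it, and tokenize it on spaces with the split method. I then extract the IP address from the first token and check whether it is in 'files'. If so, I write the third column to the corresponding output file.

The final loop closes the output file handles.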

with open('file1.txt') as file1:
    files = {ip:open(ip + '.txt', 'w') for ip in [line.strip() for line in file1]}

with open('file2.txt') as file2:
    for line in file2:
        tokens = line.strip().split(' ')
        ip = tokens[0][3:]
        if ip in files:
            # text mode already translates '\n' to the platform line ending
            files[ip].write(tokens[2] + '\n')

for f in files.values():
    f.close()
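
The explicit closing loop can be avoided. What follows is a minimal sketch, not the answer's own method: it assumes Python 3 and uses contextlib.ExitStack from the standard library to close every output file automatically. The function name split_by_ip and its parameters are hypothetical, and the check for "word" is added here because the question asks for it.

```python
import os
import tempfile
from contextlib import ExitStack


def split_by_ip(ip_list_path, log_path, out_dir, word="word"):
    """Write the third column of each matching log line to <out_dir>/<ip>.txt."""
    with ExitStack() as stack:
        # Open one output file per IP; ExitStack closes them all on exit.
        with open(ip_list_path) as ip_file:
            files = {
                ip: stack.enter_context(open(os.path.join(out_dir, ip + ".txt"), "w"))
                for ip in (line.strip() for line in ip_file)
                if ip  # skip blank lines
            }
        with open(log_path) as log_file:
            for line in log_file:
                tokens = line.split()
                if len(tokens) < 3:
                    continue  # skip blank or short lines
                ip = tokens[0][3:]  # drop the leading "IP="
                if ip in files and word in line:
                    files[ip].write(tokens[2] + "\n")


# Demo on the sample data from the question, in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    ip_list = os.path.join(tmp, "file1.txt")
    log = os.path.join(tmp, "file2.txt")
    with open(ip_list, "w") as f:
        f.write("10.2.1.1\n10.2.1.2\n10.2.1.3\n")
    with open(log, "w") as f:
        f.write(
            "IP=10.2.1.4 word=apple thing=car name=joe\n"
            "IP=10.2.1.3 word=apple thing=car name=joe\n"
            "IP=10.2.1.1 word=apple thing=car name=joe\n"
            "IP=10.2.1.2 word=apple thing=car name=joe\n"
            "IP=10.2.1.1 word=apple thing=car name=joe\n"
            "IP=10.2.1.3 word=apple thing=car name=joe\n"
        )
    split_by_ip(ip_list, log, tmp)
    with open(os.path.join(tmp, "10.2.1.3.txt")) as f:
        result = f.read()

print(result)
```

As in the answer, file2.txt is read only once, so the cost does not grow with the number of IPs beyond keeping one open handle per IP.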

Answer 2

I think the following code does what you are asking for:

file_1='file1.txt'
file_2='file2.txt'

my_string = 'word'

for line_1 in [l.rstrip() for l in open(file_1)]:
    with open(line_1+'.txt', 'a') as my_file:
        for line_2 in open(file_2):
            line_2_split = line_2.split(' ')
            if line_1 == line_2_split[0][3:]:
                if my_string in line_2:
                    my_file.write(line_2_split[2] + '\n')

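If you intend to use the last field on a file_2 line, make sure to strip the trailing newline from it, as is done for the first file with rstrip(). I left the newline in place on the file_2 lines.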
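
To illustrate the point about the trailing newline, here is a small sketch using one sample line from the question. The variable names raw_last and clean_last are only for illustration.

```python
line_2 = "IP=10.2.1.3 word=apple thing=car name=joe\n"

# Splitting the raw line leaves the newline attached to the last field.
raw_last = line_2.split(' ')[-1]

# Stripping the line first gives a clean last field.
clean_last = line_2.rstrip().split(' ')[-1]

print(repr(raw_last))    # 'name=joe\n'
print(repr(clean_last))  # 'name=joe'
```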

Answer 3

# define files
file1 = "file1.txt"
file2 = "file2.txt"

ip_patterns = set() # I assume that all patterns fit in memory

# filling ip_patterns
with open(file1) as fp:
    for line in fp: 
        ip_patterns.add(line.strip()) # adding pattern to the set


word_to_match = "apple" # pattern for the "word" field
wanted_fields = ['name', 'thing'] # fields to write

with open(file2) as fp:
    for line in fp:
        values = dict(map(lambda x: x.split('='), line.split()))
        if values['IP'] in ip_patterns and values['word'] == word_to_match:
            out = open(values['IP'] + '.txt', 'a')
            for k in wanted_fields:
                out.write("%s=%s\n" % (k, values[k])) # writing to file
            out.close()
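
The key step in the code above is turning each line into a dictionary of fields. As a rough sketch of what that produces for one sample line from the question: note that split("=", 1) is a small variation on the answer's split('='), assumed here so that a value containing "=" would not break the dict() call.

```python
line = "IP=10.2.1.3 word=apple thing=car name=joe\n"

# line.split() breaks on whitespace and drops the trailing newline;
# each "key=value" token is then split once on "=" into a pair.
values = dict(token.split("=", 1) for token in line.split())

print(values)
# {'IP': '10.2.1.3', 'word': 'apple', 'thing': 'car', 'name': 'joe'}
```

With the fields keyed by name, the lookups values['IP'] and values['word'] no longer depend on column order.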

How can I use strings from one text file to search another text file, and create new text files using the other text file?
