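How to write all matching lines to a file

Time: 2014-07-22 13:43:35

Tags: regex python-2.7

I want to go through every file in a directory and, if there is a match (a specific IP in the file contents), write the matching lines to a new file in another directory, doing this for each file in the directory. So far I have this, but it only writes one line to the new file. Can you help?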

import os, re

wanted = ['10.10.10.10']
dir_list = os.listdir('D:\\path\\07')

for i in dir_list:
    n = open('D:\\path\\07\\'+i,'r')
    m=n.readlines()
    for line in m:
        if  any(wanted_word in line for wanted_word in wanted):
            with open('Z:\\PYTHON\\Filtered-'+i,'w') as filtered_log:
                filtered_log.write(line)

I also tried this, and got nothing: no errors, and no results either.

import re, os

regex = "(.*)10.10.10.10(.*)"

dir_list = os.listdir('D:\\path\\07')

for i in dir_list:
    n = open('D:\\path\\07\\'+i,'r')
    for line in n:
        if re.match(regex, line):
            with open('Z:\\PYTHON\\Filtered_'+i,'w') as filtered_log:
                filtered_log.write(lines)

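2 Answers:

Answer 0: (score: 1)

. is a special character in regular expressions (it matches any character), so you need to make sure it is escaped: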

>>> import re
>>> e = r'10\.10\.10\.10'
>>> s = "There are some IP addresses like 127.0.0.1 and 192.168.1.1, but the one I want is 10.10.10.10 and nothing else"
>>> s2 = "I only contain 192.168.0.1"
>>> re.search(e, s)
<_sre.SRE_Match object at 0x7f501fb771d0>
>>> re.search(e, s2)
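
To see why the escaping matters, here is a minimal sketch. The sample strings are made up for illustration; re.escape is the standard-library helper that adds the backslashes for you.

```python
import re

unescaped = '10.10.10.10'
# re.escape puts a backslash before each '.', giving r'10\.10\.10\.10'.
escaped = re.escape(unescaped)

# Hypothetical line that does NOT contain the IP. With the unescaped
# pattern, each '.' matches any character, so it matches anyway.
lookalike = 'id=10x10y10z10'

unescaped_hit = re.search(unescaped, lookalike) is not None
escaped_hit = re.search(escaped, lookalike) is not None

# A line that really contains the IP still matches the escaped pattern.
real_hit = re.search(escaped, 'src=10.10.10.10 dst=192.168.1.1') is not None
```

Here unescaped_hit is True while escaped_hit is False, which is the kind of false positive the escaping prevents.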

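What happens in your code is that every time a line matches, you open the file again in write mode, which erases its contents; the net result is that only the last matching line is left in the file.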
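
The truncation is easy to reproduce in isolation. The following is a small sketch that uses a throwaway file in a temporary directory (the file name and lines are invented for the demo):

```python
import os
import tempfile

# Throwaway file in a temporary directory; the path is only for this demo.
path = os.path.join(tempfile.mkdtemp(), 'demo.log')
lines = ['first\n', 'second\n', 'third\n']

for line in lines:
    # Reopening with 'w' truncates the file on every iteration.
    with open(path, 'w') as out:
        out.write(line)

with open(path) as f:
    reopened_each_time = f.read()

# Opening once and writing inside the loop keeps every line.
with open(path, 'w') as out:
    for line in lines:
        out.write(line)

with open(path) as f:
    opened_once = f.read()
```

After the first loop the file holds only the last line written; after the second it holds all three.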

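You need to make sure that you open each output file for writing only once, and close it once you have finished filtering the corresponding input file: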

import os
import re

e = r'10\.10\.10\.10'

base_directory = r'D:/path/07'
base_dir_out = r'Z:/Python/'

for f in os.listdir(base_directory):
    # Open the input and its output once per file; both close when the block ends.
    with open(os.path.join(base_directory, f), 'r') as in_file, \
         open(os.path.join(base_dir_out, 'Filtered-{}'.format(f)), 'w') as out:
        for line in in_file:
            if re.search(e, line):
                out.write(line)

Note the following:

  1. You can use / even on Windows
  2. When building file paths, you should always use os.path.join
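
As a small illustration of the second point, here is a sketch of building the output path with os.path.join. The helper name filtered_path and the file name access.log are hypothetical, not part of the original code.

```python
import os

def filtered_path(out_dir, name):
    # Hypothetical helper: builds the output path for one input file name.
    return os.path.join(out_dir, 'Filtered-{}'.format(name))

# Without a trailing slash, os.path.join inserts the separator for you.
path = filtered_path('Z:/Python', 'access.log')

# With a trailing slash, no second separator is added.
path_with_slash = filtered_path('Z:/Python/', 'access.log')
```

Plain string concatenation would need you to remember the separator yourself, which is how paths like Z:/PythonFiltered-access.log creep in.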

Answer 1: (score: 1)

You are opening the file with w, which truncates the existing file; that is why you only see one line in the new file.

import os
from os.path import exists

wanted = ['10.10.10.10']
dir_list = os.listdir('D:\\path\\07')

for i in dir_list:
    n = open('D:\\path\\07\\'+i,'r')
    m=n.readlines()
    n.close()
    for line in m:
        if  any(wanted_word in line for wanted_word in wanted):
            tempFile = 'Z:\\PYTHON\\Filtered-' + i
            # Append if the output file already exists, otherwise create it.
            if exists(tempFile):
                with open(tempFile,'a') as filtered_log:
                    filtered_log.write(line)
            else:
                with open(tempFile,'w') as filtered_log:
                    filtered_log.write(line)
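
An alternative that keeps the substring test from the question but avoids reopening the output for every matching line could look like the sketch below. The function name filter_file and the sample log lines are made up for illustration, and the demo runs on a temporary directory rather than the real paths.

```python
import os
import tempfile

def filter_file(src, dst, wanted):
    """Copy lines of src that contain any string in wanted to dst; return the count."""
    count = 0
    # 'w' is safe here because dst is opened only once per source file.
    with open(src, 'r') as in_file, open(dst, 'w') as out:
        for line in in_file:
            if any(word in line for word in wanted):
                out.write(line)
                count += 1
    return count

# Demo on a throwaway file with made-up log lines.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, 'sample.log')
dst = os.path.join(tmp, 'Filtered-sample.log')
with open(src, 'w') as f:
    f.write('10.10.10.10 GET /\n192.168.0.1 GET /\n10.10.10.10 POST /login\n')

matched = filter_file(src, dst, ['10.10.10.10'])
with open(dst) as f:
    result = f.read()
```

Unlike the append-based version, running this twice does not duplicate lines in the output. One difference to be aware of: it creates an empty output file even when nothing in the source matches.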