Question

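So I'm trying to create a small script to process some logs. I'm just learning Python, but I know about loops and so on from other languages. It seems I don't quite understand how loops work in Python.

I have a raw log from which I'm trying to isolate only the external IP addresses. An example line:

05/09/2011 17:00:18 192.168.111.26 192.168.111.255 Broadcast packet dropped udp/netbios-ns 0 0 X0 0 0 N/A

Here is the code I have so far: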

import os,glob,fileinput,re

def parseips():
    f = open("126logs.txt",'rb')
    r = open("rawips.txt",'r+',os.O_NONBLOCK)

    for line in f:
        rf = open("rawips.txt",'r+',os.O_NONBLOCK)
        ip = line.split()[3]
        res=re.search('192.168.',ip)
        if not res:
            rf.flush()
            for line2 in rf:
                if ip not in line2:
                    r.write(ip+'\n')
                    print 'else write'
                else:
                    print "no"
    f.close()
    r.close()
    rf.close()  

parseips()

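Parsing out the external IPs works just fine. But, thinking like a ninja, I thought: how cool would it be to handle dupes as well? The idea was to check the file the IPs are being written to against the current line, and if there is a match, not write it. But this produces many times more dupes than before :) I could probably use something else, but I like Python and it makes me look busy.

Thanks for any insider info.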

Answer 1

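DISCLAIMER: Since you are new to Python, I'm going to show off a little so that you can look up some interesting "Python things".

I'm going to print all the IPs to the console: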

def parseips():
    # 'with' closes the file automatically when the block ends.
    with open("126logs.txt",'r') as f:
        for line in f:
            ip = line.split()[3]  # fourth whitespace-separated field
            if ip.startswith('192.168.'):
                print(ip)

You might also want to look at:

with open("126logs.txt",'r') as f:
    # List comprehension: keep the fourth field of every line that starts with the prefix.
    IPs = [line.split()[3] for line in f if line.split()[3].startswith('192.168.')]

Hope this helps, and enjoy Python!

Answer 2

Something along these lines might do the trick:

import os

def parseips():
    prefix = '192.168.'
    # Preload partial IPs (the part after the prefix) from the existing file.
    if os.path.exists('rawips.txt'):
        with open('rawips.txt', 'rt') as f:
            # strip() drops the trailing newline so later membership tests can match.
            partial_ips = set(ip.strip()[len(prefix):] for ip in f)
    else:
        partial_ips = set()

    # A single 'with' statement can manage both files.
    with open('126logs.txt', 'rt') as infile, open('rawips.txt', 'at') as outfile:
        for line in infile:
            ip = line.split()[3]
            if ip.startswith(prefix) and ip[len(prefix):] not in partial_ips:
                partial_ips.add(ip[len(prefix):])
                outfile.write(ip + '\n')

parseips()

Answer 3

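You could try using a set instead of iterating over the file you are writing to. It may consume more memory, but your code will be much nicer, so it is probably worth it unless you run into an actual memory constraint.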
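
A minimal sketch of that idea, assuming (as in the question) that anything not starting with 192.168. counts as external and that the IP of interest is the fourth whitespace-separated field. The function names and the sample lines are made up for illustration; the sample addresses come from the reserved documentation ranges.

```python
def unique_external_ips(lines, internal_prefix="192.168.", field=3):
    """Yield each external IP once, in the order it is first seen."""
    seen = set()  # IPs already emitted; membership tests are O(1) on average
    for line in lines:
        parts = line.split()
        if len(parts) <= field:
            continue  # skip blank or malformed lines
        ip = parts[field]
        if not ip.startswith(internal_prefix) and ip not in seen:
            seen.add(ip)
            yield ip


def write_unique_ips(log_path, out_path):
    """Read the log once and write every distinct external IP on its own line."""
    with open(log_path) as log, open(out_path, "w") as out:
        for ip in unique_external_ips(log):
            out.write(ip + "\n")


# Hypothetical sample lines, shaped like the example in the question.
sample = [
    "05/09/2011 17:00:18 192.168.111.26 192.168.111.255 Broadcast packet dropped",
    "05/09/2011 17:00:19 192.168.111.26 203.0.113.5 Web request",
    "05/09/2011 17:00:20 192.168.111.30 198.51.100.7 Web request",
    "05/09/2011 17:00:21 192.168.111.26 203.0.113.5 Web request",
]
result = list(unique_external_ips(sample))
```

Because the set lives in memory, the output file is only ever written, never re-read, which avoids the problem of reading from and appending to the same file inside the loop.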

Answer 4

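Assuming you just want to avoid duplicate external IPs, consider creating an additional data structure to keep track of which IPs have already been written. Since they are in string format, a dictionary works well for this.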

externalIPDict = {}
# code to detect external IPs goes here; when you get one:
if externalIPString in externalIPDict:
    pass  # do nothing, you found a dupe
else:
    externalIPDict[externalIPString] = 1
    # your code to add the external IP to your file goes here
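
As a variation on the snippet above, the dictionary value could hold a running count instead of a constant 1, so the same structure also tells you how often each external IP appeared. This is only a sketch: the function name, the 192.168. prefix test, the field index, and the sample lines are assumptions for illustration.

```python
def count_external_ips(lines, internal_prefix="192.168.", field=3):
    """Return a dict mapping each external IP to the number of lines it appears in."""
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) <= field:
            continue  # skip blank or malformed lines
        ip = parts[field]
        if ip.startswith(internal_prefix):
            continue  # internal address, not of interest
        # An IP is new exactly when it is not yet a key; get() covers both cases.
        counts[ip] = counts.get(ip, 0) + 1
    return counts


# Hypothetical sample lines, shaped like the example in the question.
sample = [
    "05/09/2011 17:00:19 192.168.111.26 203.0.113.5 Web request",
    "05/09/2011 17:00:20 192.168.111.30 198.51.100.7 Web request",
    "05/09/2011 17:00:21 192.168.111.26 203.0.113.5 Web request",
    "05/09/2011 17:00:22 192.168.111.26 192.168.111.255 Broadcast packet dropped",
]
counts = count_external_ips(sample)
```

The keys of the returned dictionary are the distinct external IPs, so writing them out one per line gives the de-duplicated file.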
