Python: find the number of attacks per IP per day

Date: 2012-03-11 23:09:41

Tags: python logging

Hey, I'm trying to find out how many attacks were logged per IP per day. I'm reading from a syslog file.

Here are a couple of the lines I'm reading:

Jan 10 09:32:09 j4-be03 sshd[3876]: Failed password for root from 218.241.173.35 port 47084 ssh2
Jan 10 09:32:19 j4-be03 sshd[3879]: Failed password for root from 218.241.173.35 port 47901 ssh2
Feb 7 17:19:16 j4-be03 sshd[10736]: Failed password for root from 89.249.209.92 port 46139 ssh2 

Here is my code:

desc_date = {}     
count_date = 0
desc_ip = {}
count_ip = 0

for line in myfile:
    if 'Failed password for' in line:     
        line_of_list = line.split()     
        #working together
        date_port = ' '.join(line_of_list[0:2])
        date_list = date_port.split(':')
        date = date_list[0]
        if desc_date.has_key(date):
            count_date = desc_date[date]
            count_date = count_date +1
            desc_date[date] = count_date
            #zero out the temporary counter as a precaution
            count_date =0
        else:
            desc_date[date] = 1

        ip_port = line_of_list[-4]
        ip_list = ip_port.split(':')
        ip_address = ip_list[0]
        if desc_ip.has_key(ip_address):
            count_ip = desc_ip[ip_address]
            count_ip = count_ip +1
            desc_ip[ip_address] = count_ip
            #zero out the temporary counter as a precaution
            count_ip =0
        else:
            desc_ip[ip_address] = 1

        resulting = dict(desc_date.items() + desc_ip.items())
        for result in resulting:
            print result,' has', resulting[result] , ' attacks'

It currently gives me these incorrect results:

Feb 8 has 33 attacks
218.241.173.35 has 15 attacks
72.153.93.203 has 14 attacks
213.251.192.26 has 13 attacks
66.30.90.148 has 14 attacks
Feb 7 has 15 attacks
92.152.92.123 has 5 attacks
Jan 10 has 28 attacks
89.249.209.92 has 15 attacks 

The counts for the IP addresses are wrong, and I'm not sure where my code goes wrong. I hope someone can help.

3 answers:

Answer 0 (score: 4)

Try this solution. I tested it with the sample input from the question and it works fine:

import re
from collections import defaultdict
pattern = re.compile(r'(\w{3}\s+\d{1,2}).+Failed password for .+? from (\S+)')

def attack_dict(myfile):
    attacks = defaultdict(lambda: defaultdict(int))
    for line in myfile:
        found = pattern.match(line)
        if found:
            date, ip = found.groups()
            attacks[date][ip] += 1
    return attacks

def report(myfile):
    for date, ips in attack_dict(myfile).iteritems():
        print '{0} has {1} attacks'.format(date, sum(ips.itervalues()))
        for ip, n in ips.iteritems():
            print '\t{0} has {1} attacks'.format(ip, n)

Run it like this:

report(myfile) # myfile is the opened file with the log
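
The answer above is written for Python 2 (`iteritems`, `print` statement). As a rough sketch, the same approach might look like this on Python 3, which is an assumption about the reader's interpreter and not part of the original answer. `SAMPLE` is a hypothetical stand-in for the real log file, built from the three lines quoted in the question.

```python
import io
import re
from collections import defaultdict

# Same pattern as in the answer: capture "Mon DD" and the source address.
pattern = re.compile(r'(\w{3}\s+\d{1,2}).+Failed password for .+? from (\S+)')

def attack_dict(lines):
    """Return {date: {ip: count}} for every failed-password line."""
    attacks = defaultdict(lambda: defaultdict(int))
    for line in lines:
        found = pattern.match(line)
        if found:
            date, ip = found.groups()
            attacks[date][ip] += 1
    return attacks

def report(lines):
    # Python 3: items()/values() replace iteritems()/itervalues().
    for date, ips in attack_dict(lines).items():
        print('{0} has {1} attacks'.format(date, sum(ips.values())))
        for ip, n in ips.items():
            print('\t{0} has {1} attacks'.format(ip, n))

# Hypothetical sample input: the three log lines from the question.
SAMPLE = (
    "Jan 10 09:32:09 j4-be03 sshd[3876]: Failed password for root from 218.241.173.35 port 47084 ssh2\n"
    "Jan 10 09:32:19 j4-be03 sshd[3879]: Failed password for root from 218.241.173.35 port 47901 ssh2\n"
    "Feb 7 17:19:16 j4-be03 sshd[10736]: Failed password for root from 89.249.209.92 port 46139 ssh2\n"
)

# io.StringIO behaves like an opened text file for iteration purposes.
report(io.StringIO(SAMPLE))
```

With the sample above, each IP is listed under the date on which its attacks were logged, so the per-day totals and the per-IP counts can no longer be confused with each other.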

Answer 1 (score: 2)

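I see two problems. 1) You are counting attacks by day, and by IP and port, separately; there is no correlation between the attacks from a given IP and the date of those attacks. 2) Iterating over the items in a dictionary, as in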
resulting = dict(desc_date.items() + desc_ip.items())
for result in resulting:
    print result,' has', resulting[result] , ' attacks'

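will give you the accumulated attack counts in essentially random order, with the per-day counts freely intermixed with the per-IP counts. The fact that you see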

Feb 8 has 33 attacks

...followed by

218.241.173.35 has 15 attacks
72.153.93.203 has 14 attacks
213.251.192.26 has 13 attacks
66.30.90.148 has 14 attacks

...does not mean that the attacks from those IPs occurred on Feb 8.

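The 15 attacks from 218.241.173.35 are the total number of attacks from that IP over the whole period covered by the log file. The line for 218.241.173.35 just happens to come after Feb 8, rather than before it or after some other date.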
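
One way to tie each IP to its date, sketched here under assumptions, is to count on a `(date, ip)` pair instead of keeping two unrelated dictionaries. The function name `count_by_date_and_ip` and the list `sample_lines` are hypothetical, the parsing reuses the `split()` indices from the question, and the code targets Python 3 rather than the Python 2 used in the question.

```python
from collections import Counter

def count_by_date_and_ip(lines):
    """Count failed-password lines per (date, ip) pair."""
    counts = Counter()
    for line in lines:
        if 'Failed password for' in line:
            fields = line.split()
            date = ' '.join(fields[0:2])  # month and day, e.g. "Jan 10"
            ip = fields[-4]               # the token right after "from"
            counts[(date, ip)] += 1
    return counts

# Hypothetical sample input: the three log lines from the question.
sample_lines = [
    "Jan 10 09:32:09 j4-be03 sshd[3876]: Failed password for root from 218.241.173.35 port 47084 ssh2",
    "Jan 10 09:32:19 j4-be03 sshd[3879]: Failed password for root from 218.241.173.35 port 47901 ssh2",
    "Feb 7 17:19:16 j4-be03 sshd[10736]: Failed password for root from 89.249.209.92 port 46139 ssh2",
]

counts = count_by_date_and_ip(sample_lines)

# Sorting the keys makes the output order deterministic
# (alphabetical by date string, not chronological).
for (date, ip), n in sorted(counts.items()):
    print('{0}: {1} has {2} attacks'.format(date, ip, n))
```

Because every count is keyed on both values, a line of output can only refer to attacks from that IP on that date, whatever order the lines are printed in.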

Answer 2 (score: 0)

Warning: untested code.

import re

attacks = {}

# count the attacks, grouped by date and then by IP
# (myfile is the opened log file, as in the question)
for line in myfile:
    if 'Failed password for' in line:
        # the date is the month abbreviation and day at the start of the line
        date = re.match(r'^(\w{3}\s+\d{1,2})\b', line).group(1)
        attacks_date = attacks.get(date, {})
        # the IP is not at the start of the line, so search instead of match
        ip = re.search(r'\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b', line).group(1)
        attacks_date[ip] = 1 + attacks_date.get(ip, 0)
        attacks[date] = attacks_date

# output results
for item in attacks.items():
    date, attacks_date = item
    print date, 'has', sum(attacks_date.values()), 'attacks'
    for attack_item in attacks_date.items():
        ip, n = attack_item
        print ip, 'has', n, 'attacks'