Sorting an Apache log with Python

Date: 2016-08-09 10:41:48

Tags: python apache python-2.7

For example, I have this simple Apache log:

192.168.1.1 GET /index.php
192.168.1.1 GET /pilt.png
192.168.1.1 GET /index.php
192.168.1.5 GET /index.php
192.168.1.5 GET /pilt.png
192.168.1.7 GET /index.php
192.168.1.7 GET /index.php
192.168.1.7 GET /index.php
192.168.1.7 GET /kaust/index.php
192.168.1.7 GET /index.php

How can I write Python code that sorts it so that all identical IP addresses are grouped together, and counts how many IP addresses there are?

with open("C:\\Users\\xxx\\Desktop\\test.txt", "r") as w:
    for i in w:
        log = i.split(' ')
        print(log[0])  # the IP address is the first field

I have gotten this far, but I can't work out how to take the code any further.

Thanks!

2 answers:

Answer 0: (score: 0)

Here is how it can be done:

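from __future__ import print_function  # lets print() behave the same on Python 2.7
from itertools import groupby
from operator import itemgetter

# Read the log and split every non-empty line into its fields.
with open('PATH_TO_FILE') as f:
    x = [line.split(' ') for line in f.read().split('\n') if line]

# groupby only merges adjacent items, so sort by IP (field 0) first.
x.sort(key=itemgetter(0))

j = 0  # number of unique IP addresses

for elt, items in groupby(x, itemgetter(0)):
    j += 1
    k = 0  # number of requests from this IP
    for i in items:
        k += 1
        print(i)
    print('Total count for IP', elt, 'is:', k)

print('Total unique IP addresses:', j)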
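
One caveat with groupby: it only merges adjacent items, so the totals are right only when all lines for an IP sit next to each other, as they happen to in the sample log. Below is a minimal sketch of sorting first, under the assumption that the IP is always the first whitespace-separated field. The helper names `ip_key` and `group_by_ip` and the interleaved `sample` list are made up for illustration. Sorting on the octets as integers, rather than on the raw string, places 192.168.1.10 after 192.168.1.5.

```python
from itertools import groupby

def ip_key(line):
    """Return the IP in the first field as a tuple of ints, e.g. (192, 168, 1, 7)."""
    return tuple(int(part) for part in line.split()[0].split('.'))

def group_by_ip(lines):
    """Sort lines by numeric IP, then return a list of (ip, [lines]) pairs."""
    rows = sorted((ln for ln in lines if ln.strip()), key=ip_key)
    return [(ip, list(group))
            for ip, group in groupby(rows, key=lambda ln: ln.split()[0])]

# Hypothetical interleaved sample, not the log from the question.
sample = [
    "192.168.1.10 GET /index.php",
    "192.168.1.5 GET /pilt.png",
    "192.168.1.10 GET /pilt.png",
    "192.168.1.5 GET /index.php",
]

for ip, entries in group_by_ip(sample):
    print("%s %d" % (ip, len(entries)))
```

Because Python's sort is stable, the lines within each IP keep their original order.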

Answer 1: (score: 0)

You can use defaultdict(int) for this:

from collections import defaultdict

my_dict = defaultdict(int)  # missing keys start at 0
with open("C:\\Users\\xxx\\Desktop\\test.txt", "r") as w:
    for line in w:
        fields = line.split()
        if fields:  # skip blank lines
            my_dict[fields[0]] += 1

my_dict  # defaultdict(<class 'int'>, {'192.168.1.7': 5, '192.168.1.1': 3, '192.168.1.5': 2}); key order may vary
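
If you also want the result ordered by how often each IP appears, collections.Counter (available since Python 2.7) does the same counting and adds most_common(). This is a rough sketch rather than a drop-in replacement: the helper name `count_ips` is made up, and the log is inlined as a string here instead of being read from the file.

```python
from collections import Counter

def count_ips(lines):
    """Count how often each IP (the first whitespace-separated field) occurs."""
    return Counter(line.split()[0] for line in lines if line.strip())

# The sample log from the question, inlined for illustration.
log = """\
192.168.1.1 GET /index.php
192.168.1.1 GET /pilt.png
192.168.1.1 GET /index.php
192.168.1.5 GET /index.php
192.168.1.5 GET /pilt.png
192.168.1.7 GET /index.php
192.168.1.7 GET /index.php
192.168.1.7 GET /index.php
192.168.1.7 GET /kaust/index.php
192.168.1.7 GET /index.php
""".splitlines()

counts = count_ips(log)
for ip, n in counts.most_common():  # highest count first
    print("%s %d" % (ip, n))
print("Unique IP addresses: %d" % len(counts))
```

For the sample log this lists 192.168.1.7 (5 requests) first, then 192.168.1.1 (3) and 192.168.1.5 (2), and reports 3 unique addresses.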