I have a log file in plain-text format. It looks like this:
220.227.40.118 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
220.227.40.118 - - [06/Mar/2012:00:00:00 -0800] "GET /hrefadd.xml HTTP/1.1" 204 214 - -
59.95.13.217 - - [06/Mar/2012:00:00:00 -0800] "GET /dbupdates2.xml HTTP/1.1" 404 0 - -
111.92.9.222 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
120.56.236.46 - - [06/Mar/2012:00:00:00 -0800] "GET /hrefadd.xml HTTP/1.1" 204 214 - -
49.138.106.21 - - [06/Mar/2012:00:00:00 -0800] "GET /add.txt HTTP/1.1" 204 214 - -
117.195.185.130 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
122.160.166.220 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
117.214.20.28 - - [06/Mar/2012:00:00:00 -0800] "GET /welcome.html HTTP/1.1" 204 212 - -
117.18.231.5 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
I want to use Python to find every unique IP address in the log file.
Answer 0 (score: 2)
How about:
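def get_ips(logfile):
    # Yield the first whitespace-separated field (the client IP) of each line.
    with open(logfile, 'r') as f:
        for line in f:
            if line.strip():  # skip blank lines
                yield line.split()[0]

def main():
    # set() drops duplicate addresses.
    for ip in set(get_ips('log.txt')):
        print(ip)

if __name__ == '__main__':
    main()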
Answer 1 (score: 1)
Here you go:
def unique_ips():
    # Collect the first field (the client IP) of every line into a set.
    ips = set()
    with open('log_file.txt', 'r') as f:
        for line in f:
            ip = line.split()[0]
            ips.add(ip)
    return ips

if __name__ == '__main__':
    print(unique_ips())
This works with Python 2.6.
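For anyone who wants to try the idea without a file on disk, here is a minimal sketch of the same approach. It assumes, as both answers do, that the IP is the first whitespace-separated field of each line. The signature that takes any iterable of lines and the `sample` list are my own illustration and do not come from either answer.

```python
def unique_ips(lines):
    """Return the set of first fields (assumed to be client IPs) in `lines`."""
    ips = set()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines instead of raising IndexError
            ips.add(fields[0])
    return ips


# Hypothetical sample built from the log lines in the question.
sample = [
    '220.227.40.118 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -',
    '220.227.40.118 - - [06/Mar/2012:00:00:00 -0800] "GET /hrefadd.xml HTTP/1.1" 204 214 - -',
    '',
    '59.95.13.217 - - [06/Mar/2012:00:00:00 -0800] "GET /dbupdates2.xml HTTP/1.1" 404 0 - -',
]

print(sorted(unique_ips(sample)))
# ['220.227.40.118', '59.95.13.217']
```

An open file object is also an iterable of lines, so the function should work unchanged on a real log, e.g. `with open('log.txt') as f: ips = unique_ips(f)`. Note that `sorted` orders the addresses as strings, not numerically.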