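I have 243607 IPs in a log file. My function prints the unique IPs one after another on a single line, so I can't check whether the IPs in the output really are unique. I would like each IP to be printed on its own line. I'm new to Python and can't figure it out. Is there a way to do this?
I would also like to print the number of IPs.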
def unique_ips():
    f = open('epiclogs.txt', 'r')
    ips = set()
    for line in f:
        if not line.isspace():
            ip = line.split()[0]
            ips.add(ip)
    return ips

if __name__ == '__main__':
    # Prints the whole set on one line, which is the problem described above.
    print(unique_ips())
Answer 0 (score: 2)
The requirements are incomplete, so here is my assumption.
I assume the log file looks like this:
10.1.10.190 http://example.com/t1 404
10.1.10.171 http://example.com/t1 404
10.1.10.180 http://example.com/t2 200
10.1.10.190 http://example.com/t1 404
10.1.11.180 http://example.com/t3 302
#!/usr/bin/env python
#
# Counts the IP addresses of a log file.
#
# Assumption: the IP address is logged in the first column.
# Example line: 10.1.10.190 http://example.com/t1 404
#
import sys

def extract_ip(line):
    '''Extracts the IP address from the line.
    Currently it is assumed that the IP address is logged in
    the first column and the columns are space separated.'''
    return line.split()[0]

def increase_count(ip_dict, ip_addr):
    '''Increases the count of the IP address.
    If an IP address is not in the given dictionary,
    it is initially created and the count is set to 1.'''
    if ip_addr in ip_dict:
        ip_dict[ip_addr] += 1
    else:
        ip_dict[ip_addr] = 1

def read_ips(infilename):
    '''Read the IP addresses from the file and store (count)
    them in a dictionary - returns the dictionary.'''
    res_dict = {}
    # open() instead of file(): file() only exists in Python 2.
    with open(infilename) as log_file:
        for line in log_file:
            if line.isspace():
                continue
            ip_addr = extract_ip(line)
            increase_count(res_dict, ip_addr)
    return res_dict

def write_ips(outfilename, ip_dict):
    '''Write out the count and the IP addresses, one per line.'''
    with open(outfilename, "w") as out_file:
        # items() instead of iteritems(): works in Python 2 and 3.
        for ip_addr, count in ip_dict.items():
            out_file.write("%5d\t%s\n" % (count, ip_addr))

def parse_cmd_line_args():
    '''Return the in and out file name.
    If there are more or less than two parameters,
    a usage message is printed and the program exits.'''
    if len(sys.argv) != 3:
        print("Usage: %s [infilename] [outfilename]" % sys.argv[0])
        sys.exit(1)
    return sys.argv[1], sys.argv[2]

def main():
    infilename, outfilename = parse_cmd_line_args()
    ip_dict = read_ips(infilename)
    write_ips(outfilename, ip_dict)

if __name__ == "__main__":
    main()
I like small functions, each of which does exactly one thing. IMHO this makes the program easier to understand.
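For comparison, the same per-address counting can be sketched more compactly with `collections.Counter` from the standard library. This is only a sketch under the same assumption as above (the IP address is the first whitespace-separated column); the helper name `count_ips` and the in-memory `sample` list are made up for illustration and stand in for the real log file.

```python
from collections import Counter


def count_ips(lines):
    """Count how often each IP address (first column) appears.

    `lines` is any iterable of log lines, e.g. an open file object.
    Blank lines are skipped.
    """
    return Counter(line.split()[0] for line in lines if line.strip())


# Hypothetical sample data, mirroring the assumed log format.
sample = [
    "10.1.10.190 http://example.com/t1 404\n",
    "10.1.10.171 http://example.com/t1 404\n",
    "10.1.10.180 http://example.com/t2 200\n",
    "10.1.10.190 http://example.com/t1 404\n",
    "10.1.11.180 http://example.com/t3 302\n",
]

counts = count_ips(sample)

# One address per line, most frequent first, then the number of unique IPs.
for ip, count in counts.most_common():
    print("%5d\t%s" % (count, ip))
print("Unique IPs: %d" % len(counts))
```

To run it against a real log, passing an open file object (`with open(infilename) as f: counts = count_ips(f)`) should work, since a file iterates line by line.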
Answer 1 (score: 0)
I haven't checked whether your code works, but I have added a few lines that should do what you want.
Try this:
def unique_ips():
    f = open('epiclogs.txt', 'r')
    fout = open('uniqueip.txt', 'w')  # Added
    ips = set()
    for line in f:
        if not line.isspace():
            ip = line.split()[0]
            # Write each IP only the first time it is seen,
            # so the output file contains no duplicates.
            if ip not in ips:
                ips.add(ip)
                fout.write("%s\n" % ip)  # Added
    f.close()  # Added
    fout.flush()  # Added
    fout.close()  # Added
    return ips

if __name__ == '__main__':
    print(unique_ips())
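Answer 2 (score: 0)
unique_ips() returns a set, which means each IP address appears in it only once. If you want to see the addresses line by line in a file, you can change the line that prints the result of unique_ips() to: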
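if __name__ == '__main__':
    # open() instead of file(), which only exists in Python 2;
    # the with block closes the file once everything is written.
    with open('ip_addresses', 'w') as f:
        for ip in unique_ips():
            f.write(ip + '\n')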
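If you would rather see the addresses on the screen, together with the count the question asks for, something like the following might do. It is a minimal sketch: `unique_ips_from_lines` and `format_report` are hypothetical helpers (the first is a variant of `unique_ips()` that takes the lines as an argument so the example runs without the log file), and the sorting is plain string sorting, not numeric IP order.

```python
def unique_ips_from_lines(lines):
    """Return the set of first-column values, skipping blank lines."""
    return set(line.split()[0] for line in lines if line.strip())


def format_report(ips):
    """One IP per line (sorted as strings), followed by the total."""
    report_lines = sorted(ips)
    report_lines.append("Total unique IPs: %d" % len(ips))
    return "\n".join(report_lines)


# Hypothetical sample lines standing in for the real log file.
sample = [
    "10.1.10.190 http://example.com/t1 404\n",
    "10.1.10.171 http://example.com/t1 404\n",
    "10.1.10.190 http://example.com/t1 404\n",
]

print(format_report(unique_ips_from_lines(sample)))
```

Because a set has no duplicates by construction, `len(ips)` is the number of unique addresses.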