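My code is meant to geolocate IP addresses read from a text file. I'm having trouble with the last section. When I run the code, I get a complaint on the map_ip.update line: socket.error: illegal IP address string passed to inet_pton
When I troubleshoot with a print statement, I get the following format: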
['$ ip address']
['$ ip address']
['$ ip address']
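How do I get country_name_by_addr() to read each IP address in the right format? It looks like my IP addresses are being stored as lists of strings rather than as plain strings.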
# script that geo-locates IP addresses from a consolidated dictionary
import pygeoip
import itertools
import re
# initialize dictionary for IP addresses
count = {}
"""
This loop reads text file line-by-line and
returns one-to-one key:value pairs of IP addresses.
"""
with open('$short_logins.txt path') as f:
    for cnt, line in enumerate(f):
        ip = re.findall(r'[0-9]+(?:\.[0-9]+){3}', line)
        count.update({cnt: ip})
        cnt += 1
"""
This line consolidates unique IP addresses. Keys represent how
many times each unique IP address occurs in the text file.
"""
con_count = [(k, len(list(v))) for k, v in itertools.groupby(sorted(count.values()))]
"""
Country lookup:
This section passes each unique IP address from con_count
through country name database. These IP address are not required
to come from con_count.
"""
map_ip = {}
gi = pygeoip.GeoIP('$GeoIP.dat path')
for i in count.itervalues():
    map_ip.update({i: gi.country_name_by_addr(i)})
print map_ip
Answer 0 (score: 0)
So I solved this dilemma yesterday by dropping the regular expression:
ip = re.findall(r'[0-9]+(?:\.[0-9]+){3}', line)
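I found a simpler solution: split each line on whitespace and check whether the IP address has already been counted. The IP addresses are all in the third column, hence the index [2]: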
ip = line.split()[2]
if ip in count:
    count[ip] += 1
else:
    count.update({ip: 1})
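The same tally can also be written with collections.Counter from the standard library. This is only a sketch: the helper name count_ips and the sample log lines are made up for illustration, and it assumes, as above, that the IP address is the third whitespace-separated field on each line.

```python
from collections import Counter


def count_ips(lines, column=2):
    """Count how often each IP address appears in the given column.

    count_ips and its parameters are hypothetical names for this sketch.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank or short lines instead of raising IndexError.
        if len(fields) > column:
            counts[fields[column]] += 1
    return counts


# Made-up sample lines; the real input would be the open file object.
sample = [
    "user1 pts/0 192.0.2.10 Mon",
    "user2 pts/1 198.51.100.7 Mon",
    "user1 pts/2 192.0.2.10 Tue",
]
counts = count_ips(sample)
```

Because a Counter is a dict subclass, its keys are the unique IP strings and its values are the occurrence counts, so it can stand in for the count dictionary used here.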
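I also removed the con_count line. The pygeoip lookup works now because each address is a plain string: re.findall returns a list for every line, even when there is only one match, whereas line.split()[2] yields a single string that country_name_by_addr() can accept.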
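For what it's worth, the cause of the original error can be seen without pygeoip at all. re.findall always returns a list, so each value stored in count looked like ['192.0.2.10'] instead of '192.0.2.10'. Below is a rough sketch of keeping the regex but flattening its results first; the name extract_ips and the sample lines are assumptions made for illustration, not part of the original script.

```python
import re

# Same pattern as in the question, compiled once for reuse.
IP_PATTERN = re.compile(r'[0-9]+(?:\.[0-9]+){3}')


def extract_ips(lines):
    """Return a flat list of IP address strings found in the lines.

    extract_ips is a hypothetical helper name used only in this sketch.
    """
    ips = []
    for line in lines:
        # findall returns a list per line; extend() adds the strings
        # themselves instead of nesting that list.
        ips.extend(IP_PATTERN.findall(line))
    return ips


# Made-up sample lines standing in for the log file.
sample = [
    "login from 192.0.2.10 ok",
    "no address here",
    "login from 198.51.100.7 ok",
]
matches = IP_PATTERN.findall(sample[0])  # a list, even for one match
ips = extract_ips(sample)                # plain strings
```

Each element of ips is then a string that can be passed to gi.country_name_by_addr(ip) one at a time.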