Question

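My code is meant to geolocate IP addresses read from a text file. I'm having trouble with the last section. When I run the code, I get a complaint from the map_ip.update line: socket.error: illegal IP address string passed to inet_pton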

When I troubleshoot with a print statement, I get output in this format:

['$ ip address']
['$ ip address']
['$ ip address']

How can I get country_name_by_addr() to read each IP address in the right format? It seems my IP addresses are formatted as lists of strings inside a single list.

# script that geo-locates IP addresses from a consolidated dictionary

    import pygeoip
    import itertools
    import re

    # initialize dictionary for IP addresses
    count = {}

    """
    This loop reads text file line-by-line and
    returns one-to-one key:value pairs of IP addresses.
    """
    with open('$short_logins.txt path') as f:
      for cnt, line in enumerate(f):
        ip = re.findall(r'[0-9]+(?:\.[0-9]+){3}', line)
        count.update({cnt: ip})
        cnt += 1

    """
    This line consolidates unique IP addresses.  Keys represent how 
    many times each unique IP address occurs in the text file.
    """
    con_count = [(k, len(list(v))) for k, v in itertools.groupby(sorted(count.values()))]


    """
    Country lookup:
    This section passes each unique IP address from con_count 
    through country name database.  These IP address are not required
    to come from con_count.
    """
    map_ip = {}
    gi = pygeoip.GeoIP('$GeoIP.dat path')

    for i in count.itervalues():
      map_ip.update({i: gi.country_name_by_addr(i)})

    print map_ip

Answer 1

I resolved this yesterday by dropping the regular expression:

    ip = re.findall(r'[0-9]+(?:\.[0-9]+){3}', line)
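For context, here is a minimal sketch of why the findall version trips up the lookup, and one way the regex could have been kept. re.findall() always returns a list of matches, so each value stored in count is a one-element list such as ['192.0.2.10'] rather than a string. The helper name extract_ip and the sample log line are hypothetical, made up for illustration.

```python
import re

# Same pattern as in the question, compiled once.
IP_PATTERN = re.compile(r'[0-9]+(?:\.[0-9]+){3}')


def extract_ip(line):
    """Return the first IPv4-looking substring of line as a str, or None."""
    match = IP_PATTERN.search(line)
    return match.group(0) if match else None


# Hypothetical log line; the real file layout may differ.
sample = "alice pts/0 192.0.2.10 Mon Jan 1"

as_list = IP_PATTERN.findall(sample)  # a list: ['192.0.2.10']
as_str = extract_ip(sample)           # a plain string: '192.0.2.10'

print(as_list)
print(as_str)
```

Passing as_str (not as_list) to country_name_by_addr() should avoid the inet_pton error, since that function expects a single address string.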

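I found a simpler solution: split each line on whitespace and check whether the IP address has already been counted. The IP addresses are all in the third column, hence the index [2]: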

    ip = line.split()[2]
    if ip in count:
      count[ip] += 1
    else:
      count.update({ip: 1})

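I also removed the con_count line. The pygeoip function expects a single IP address string, and line.split()[2] produces exactly that, whereas re.findall() returns a list of matches, which is what triggered the error.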
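Putting the pieces together, the approach described above could be sketched roughly as follows. This is an illustration under assumptions, not the exact script from the answer: the function names count_ips and map_countries, the sample lines, and the stub lookup table are all hypothetical, and collections.Counter is swapped in for the manual if/else counting.

```python
from collections import Counter


def count_ips(lines, column=2):
    """Count how often each IP string appears in the given column."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > column:  # skip blank or short lines
            counts[fields[column]] += 1
    return counts


def map_countries(ips, lookup):
    """Map each IP string to whatever lookup(ip) returns."""
    return {ip: lookup(ip) for ip in ips}


# Hypothetical sample data standing in for the log file.
lines = [
    "alice pts/0 192.0.2.10 Mon",
    "bob pts/1 198.51.100.7 Tue",
    "alice pts/2 192.0.2.10 Wed",
    "",
]

# Stub lookup so the sketch runs without a GeoIP database.
fake_db = {"192.0.2.10": "Exampleland", "198.51.100.7": "Testonia"}

counts = count_ips(lines)
map_ip = map_countries(counts, fake_db.get)

print(counts["192.0.2.10"])
print(map_ip["198.51.100.7"])
```

With pygeoip installed, the stub could be replaced by the bound method from the question, for example map_countries(counts, gi.country_name_by_addr) where gi = pygeoip.GeoIP('$GeoIP.dat path'). Each key of counts is already a plain string, so no list unwrapping is needed.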

Python: passing a list of IP addresses as a list of strings
