Python: passing a list of IP addresses as a list of strings

Time: 2017-10-24 19:48:37

Tags: python string list geolocation ip-address

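My code is meant to geolocate IP addresses read from a text file. I'm having trouble with the last section. When I run the code, I get a complaint on the map_ip.update line: socket.error: illegal IP address string passed to inet_pton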

When I troubleshoot with print statements, I get output in the following format:

['$ ip address']
['$ ip address']
['$ ip address']

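How can I get country_name_by_addr() to read each IP address in the correct format? It looks like each of my IP addresses is wrapped in its own list, so I end up with lists of strings rather than plain strings.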

    # script that geo-locates IP addresses from a consolidated dictionary

    import pygeoip
    import itertools
    import re

    # initialize dictionary for IP addresses
    count = {}

    """
    This loop reads text file line-by-line and
    returns one-to-one key:value pairs of IP addresses.
    """
    with open('$short_logins.txt path') as f:
      for cnt, line in enumerate(f):
        ip = re.findall(r'[0-9]+(?:\.[0-9]+){3}', line)
        count.update({cnt: ip})
        cnt += 1

    """
    This line consolidates unique IP addresses.  Keys represent how 
    many times each unique IP address occurs in the text file.
    """
    con_count = [(k, len(list(v))) for k, v in itertools.groupby(sorted(count.values()))]


    """
    Country lookup:
    This section passes each unique IP address from con_count
    through the country name database.  These IP addresses are not
    required to come from con_count.
    """
    map_ip = {}
    gi = pygeoip.GeoIP('$GeoIP.dat path')

    for i in count.itervalues():
      map_ip.update({i: gi.country_name_by_addr(i)})

    print map_ip

1 answer:

Answer 0 (score: 0)

So I solved this dilemma yesterday by dropping the regular expression:

    ip = re.findall(r'[0-9]+(?:\.[0-9]+){3}', line)

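I found a simpler solution: split each line on whitespace and check whether the IP address has already been counted. The IP addresses are all in the third column, hence the [2]: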

    ip = line.split()[2]
    if ip in count:
      count[ip] += 1
    else:
      count.update({ip: 1})
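
The same counting step could probably also be written with collections.Counter from the standard library. This is only a minimal sketch in Python 3: the count_ips helper, its column parameter, and the sample log lines are hypothetical and only assume that the address sits in the third whitespace-separated column, as described above.

```python
from collections import Counter


def count_ips(lines, column=2):
    """Count how often each value appears in the given whitespace-separated column."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > column:  # skip blank or short lines instead of raising IndexError
            counts[fields[column]] += 1
    return counts


# Hypothetical log lines with the address in the third column
sample = [
    "alice pts/0 192.0.2.10 Tue Oct 24",
    "bob pts/1 198.51.100.7 Tue Oct 24",
    "alice pts/2 192.0.2.10 Tue Oct 24",
]
counts = count_ips(sample)
print(counts.most_common())
```

Each key of the resulting Counter is a plain string, so it can be handed to a lookup function one address at a time.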

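I also removed the con_count line. The underlying issue was that re.findall() returns a list of matches for every line, whereas country_name_by_addr() expects a single IP address string, so passing it one plain string at a time works.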
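
If the regular expression is still wanted (for example, when the address is not always in the same column), one possible approach is to unpack the list that re.findall() returns and pass on only plain strings. The sketch below is written for Python 3 and is not the code from the question: extract_ips, map_countries, and fake_lookup are hypothetical names, and validating with the standard ipaddress module is an added assumption meant to filter out matches such as 999.1.1.1 that the pattern accepts but that are not valid addresses.

```python
import ipaddress
import re

IP_PATTERN = re.compile(r'[0-9]+(?:\.[0-9]+){3}')


def extract_ips(lines):
    """Yield every valid IPv4 address found in the lines as a plain string."""
    for line in lines:
        # findall returns a list of strings, so iterate over it
        for candidate in IP_PATTERN.findall(line):
            try:
                ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # e.g. '999.1.1.1' matches the pattern but is not an address
            yield candidate


def map_countries(ips, lookup):
    """Map each unique address to whatever the lookup callable returns for it."""
    return {ip: lookup(ip) for ip in set(ips)}


def fake_lookup(ip):
    """Hypothetical stand-in for a real lookup such as gi.country_name_by_addr."""
    return "Example Country"


# Hypothetical log lines
sample = ["login from 192.0.2.10 ok", "bad entry 999.1.1.1", "no address here"]
print(map_countries(extract_ips(sample), fake_lookup))
```

With pygeoip installed, the stand-in would presumably be replaced by passing gi.country_name_by_addr as the lookup argument, where gi is the pygeoip.GeoIP object from the question.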