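Fetching IPs from a text file using Python

Time: 2015-11-28 05:00:58

Tags: python regex file

I am writing custom code to fetch IPs and some other required details from a text file.

Consider a text file with the following content, where each line holds a tenant_id and an IP: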

cbdf25542c194a069464f69efff4859a 45.45.45.45
cbdf25542c194a069464f69efff4859a 1.6.7.3
cbdf25542c194a069464f69efff4859a 1.7.6.2
1235b3a73ad24b9c86cf301525310b24 2.3.7.5
1235b3a73ad24b9c86cf301525310b24 6.5.2.1

So far I have written code that fetches the IPs and the tenants separately.

The code is as follows:

import re

files = open("/root/flattext", "r")

# create empty lists
ips = []
tenants = []

# read through the file
for text in files.readlines():

    # strip off the \n
    text = text.rstrip()

    # IP and Tenant Fetch
    # note: re.findall returns a (possibly empty) list, never None
    regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$', text)
    regex1 = re.findall(r'[0-9A-Za-z]{32}', text)

    if regex and regex not in ips:
        ips.append(regex)

    if regex1 and regex1 not in tenants:
        tenants.append(regex1)

ip_valuess = [''.join(ip) for ip in ips if ip]
tenant_ids = [''.join(tenant) for tenant in tenants if tenant]

# cleanup and close files
files.close()

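This gives the IPs and the tenant IDs as two separate lists.

What I need is to fetch the IPs that belong to a particular tenant ID.

For example, given 1235b3a73ad24b9c86cf301525310b24 as the tenant_id, the result should be 2.3.7.5,6.5.2.1.

Could someone please take a look and suggest a better way to do this?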

2 answers:

Answer 0: (score: 2)

Why use only regex? With split and a defaultdict this is simple -

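from collections import defaultdict

# map each tenant_id to the list of IPs seen for it
data = defaultdict(list)
with open(r"D:\ip.txt", "r") as fl:
    for line in fl:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        tenant_id, ip = parts
        data[tenant_id].append(ip)
# sorted() keeps the printed order deterministic
print(sorted(data.items()))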

Output -

[('1235b3a73ad24b9c86cf301525310b24', ['2.3.7.5', '6.5.2.1']), ('cbdf25542c194a069464f69efff4859a', ['45.45.45.45', '1.6.7.3', '1.7.6.2'])]
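To get from this mapping to what the question asks for (the IPs of one tenant, comma-separated), a lookup along these lines should be enough. This is only a sketch: the sample text is copied from the question, io.StringIO stands in for the real file, and the names SAMPLE, wanted and missing are made up for illustration.

```python
from collections import defaultdict
import io

# Sample data copied from the question; io.StringIO stands in for the real file.
SAMPLE = """\
cbdf25542c194a069464f69efff4859a 45.45.45.45
cbdf25542c194a069464f69efff4859a 1.6.7.3
cbdf25542c194a069464f69efff4859a 1.7.6.2
1235b3a73ad24b9c86cf301525310b24 2.3.7.5
1235b3a73ad24b9c86cf301525310b24 6.5.2.1
"""

data = defaultdict(list)
for line in io.StringIO(SAMPLE):
    parts = line.split()
    if len(parts) == 2:
        tenant_id, ip = parts
        data[tenant_id].append(ip)

wanted = "1235b3a73ad24b9c86cf301525310b24"
# .get() does not create an entry for an unknown tenant; data[key] would.
ips = data.get(wanted, [])
print(",".join(ips))  # prints 2.3.7.5,6.5.2.1

missing = data.get("no-such-tenant", [])
print(missing)  # prints []
```

Using .get() rather than data[key] matters with a defaultdict, because indexing a missing key silently adds an empty list for it.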

If your file is not structured and has no whitespace to split on, try regex -

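import re
from collections import defaultdict

data = defaultdict(list)
pattern_ip = r'([\d]{1,3}(?=\.|$))'  # 1-3 digits followed by a dot or the end of the line
pattern_tenat = r'^[a-z0-9]{32}'     # 32-character ID at the start of the line

with open(r"D:\ip.txt", "r") as fl:
    for i in fl:
        i = i.strip()
        if not i:
            continue  # skip blank lines
        ip = '.'.join(re.findall(pattern_ip, i))
        tent = ''.join(re.findall(pattern_tenat, i))
        data[tent].append(ip)
# sorted() keeps the printed order deterministic
print(sorted(data.items()))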

Output -

[('1235b3a73ad24b9c86cf301525310b24', ['2.3.7.5', '6.5.2.1']), ('cbdf25542c194a069464f69efff4859a', ['45.45.45.45', '1.6.7.3', '1.7.6.2'])]

See the live regex demos: tenant pattern, IP pattern.
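As a variation on the regex approach, both fields can be captured with a single anchored pattern, and the candidate address can then be checked with the standard-library ipaddress module so that something like 999.1.1.1 is rejected. Treat this as a rough sketch rather than a drop-in replacement: LINE_RE and parse_lines are names invented here, and it assumes the tenant ID is always exactly 32 alphanumeric characters at the start of the line.

```python
import ipaddress
import re
from collections import defaultdict

# Hypothetical combined pattern: a 32-character ID, optional whitespace,
# then a dotted quad that runs to the end of the line.
LINE_RE = re.compile(
    r'^(?P<tenant>[0-9A-Za-z]{32})\s*(?P<ip>[0-9]{1,3}(?:\.[0-9]{1,3}){3})$'
)

def parse_lines(lines):
    """Group valid IPv4 addresses by tenant ID; skip lines that do not match."""
    data = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not "<32-char id><ip>"
        try:
            # Raises AddressValueError for out-of-range octets such as 999
            ipaddress.IPv4Address(match.group("ip"))
        except ipaddress.AddressValueError:
            continue
        data[match.group("tenant")].append(match.group("ip"))
    return data
```

Because the ID length is fixed at 32, the pattern also handles lines where the ID and the IP run together with no separator. With a real file it could be called as parse_lines(open(path)), where path is whatever file you are reading.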

Answer 1: (score: 1)

Use split and a defaultdict:

from collections import defaultdict

results = defaultdict(list)

with open('flattext', 'r') as f:
    for row in f:
        # skip blank lines
        if row.strip() != "":
            tenant_id, ip = row.split()
            results[tenant_id].append(ip)

# IPs for a single tenant (None if the tenant is absent)
print(results.get('1235b3a73ad24b9c86cf301525310b24', None))
# sorted() keeps the printed order deterministic
print(sorted(results.items()))

Output:

['2.3.7.5', '6.5.2.1']

Contents of results:

[
  ('1235b3a73ad24b9c86cf301525310b24', ['2.3.7.5', '6.5.2.1']),
  ('cbdf25542c194a069464f69efff4859a', ['45.45.45.45', '1.6.7.3', '1.7.6.2'])
]
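If only one tenant is ever needed, building the whole mapping is optional. The sketch below filters while reading and also drops repeated IPs, which the question's code tried to do with its "not in" checks. The function name ips_for_tenant and the sample_lines list are assumptions made for this example; lines can be any iterable of strings, including an open file object.

```python
def ips_for_tenant(lines, tenant_id):
    """Return the unique IPs listed for tenant_id, in first-seen order."""
    seen = set()
    ips = []
    for line in lines:
        parts = line.split()
        # Skip blank or malformed lines and lines for other tenants
        if len(parts) != 2 or parts[0] != tenant_id:
            continue
        ip = parts[1]
        if ip not in seen:
            seen.add(ip)
            ips.append(ip)
    return ips

# Sample lines copied from the question, plus one repeated entry
sample_lines = [
    "cbdf25542c194a069464f69efff4859a 45.45.45.45\n",
    "1235b3a73ad24b9c86cf301525310b24 2.3.7.5\n",
    "1235b3a73ad24b9c86cf301525310b24 6.5.2.1\n",
    "1235b3a73ad24b9c86cf301525310b24 2.3.7.5\n",
]

found = ips_for_tenant(sample_lines, "1235b3a73ad24b9c86cf301525310b24")
print(",".join(found))  # prints 2.3.7.5,6.5.2.1
```

With the real file this would be called inside a with open('flattext') as f: block, passing f as lines.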