I am trying to extract only the IPs from a file, sort them numerically, and write the result to another file.
The data looks like this:
The Spammer (and all his/her info):
Username: user
User ID Number: 0
User Registration IP Address: 77.123.134.132
User IP Address for Selected Post: 177.43.168.35
User Email: email@address.com
This is my code; it does not sort the IPs correctly (i.e. it lists 177.43.168.35 before 77.123.134.132):
import re
spammers = open('spammers.txt', "r")
ips = []
for text in spammers.readlines():
    text = text.rstrip()
    print text
    regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$',text)
    if regex is not None and regex not in ips:
        ips.append(regex)
for ip in ips:
    OrganizedIPs = open("Organized IPs.txt", "a")
    addy = "".join(ip)
    if addy is not '':
        print "IP: %s" % (addy)
        OrganizedIPs.write(addy)
        OrganizedIPs.write("\n")
spammers.close()
OrganizedIPs.close()
organize = open("Organized IPs.txt", "r")
ips = organize.readlines()
ips = list(set(ips))
print ips
for i in range(len(ips)):
    ips[i] = ips[i].replace('\n', '')
print ips
ips.sort()
finish = open('organized IPs.txt', 'w')
finish.write('\n'.join(ips))
finish.close()
clean = open('spammers.txt', 'w')
clean.close()
I tried using this IP sorter code, but it needs a string, whereas the regex returns a list.
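To illustrate the mismatch described above: `re.findall` returns a list of matches for each line (an empty list when nothing matches, never `None`), so appending that result builds a list of lists, while a sort key that splits on `.` needs plain strings. The following is only a minimal sketch, assuming Python 3; the helper name `extract_ips` and the sample lines are made up for illustration.

```python
import re

# Same pattern as in the question, anchored to the end of the line.
IP_AT_END = re.compile(r'(?:\d{1,3})\.(?:\d{1,3})\.(?:\d{1,3})\.(?:\d{1,3})$')

def extract_ips(lines):
    """Return a flat list of IP strings found at the end of each line."""
    ips = []
    for line in lines:
        # findall returns a list of strings, so extend() adds the strings
        # themselves instead of nesting one list inside another.
        ips.extend(IP_AT_END.findall(line.rstrip()))
    return ips

sample = [
    "Username: user",
    "User Registration IP Address: 77.123.134.132",
    "User IP Address for Selected Post: 177.43.168.35",
]
found = extract_ips(sample)
print(found)          # a flat list of strings
print(sorted(found))  # plain string sort: '177...' still comes before '77...'
```

With a flat list of strings, a string-based sort key can be applied directly; the remaining ordering problem is that the default sort compares text, not numbers.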
Answer 0 (score: 3)
Or this (it saves the cost of string formatting):
def ipsort(ip):
    return tuple(int(t) for t in ip.split('.'))
ips = ['1.2.3.4', '100.2.3.4', '62.1.2.3', '62.1.22.4']
print(sorted(ips, key=ipsort))
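To see why a tuple key fixes the ordering from the question: a plain string sort compares character by character, so '177...' sorts before '77...', whereas tuples of integers compare octet by octet. A short sketch, assuming Python 3; `octets` is an illustrative name that does the same job as `ipsort` above.

```python
def octets(ip):
    """Convert '77.123.134.132' into (77, 123, 134, 132) for numeric comparison."""
    return tuple(int(part) for part in ip.split('.'))

ips = ['177.43.168.35', '77.123.134.132', '62.1.22.4', '62.1.2.3']

by_text = sorted(ips)               # lexicographic: compares characters
by_value = sorted(ips, key=octets)  # numeric: compares integers per octet

print(by_text)
print(by_value)
```

The key function is called once per element, so each address is split and converted only once regardless of how many comparisons the sort performs.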
Answer 1 (score: 0)
Try this:
sorted_ips = sorted(ips, key=lambda x: '.'.join(["{:>03}".format(octet) for octet in x.split(".")]))
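The idea behind the one-liner above is to zero-pad every octet to three digits so that ordinary string comparison agrees with numeric order. The sketch below spells that out, assuming Python 3; it swaps in `str.zfill` for the format spec (intended to produce the same padded form), and the name `padded` is only illustrative.

```python
def padded(ip):
    """Zero-pad every octet to three digits, e.g. '77.1.2.3' -> '077.001.002.003'."""
    return '.'.join(octet.zfill(3) for octet in ip.split('.'))

ips = ['177.43.168.35', '77.123.134.132', '62.1.22.4']

print([padded(ip) for ip in ips])  # the keys that are actually compared
print(sorted(ips, key=padded))     # original strings, in numeric order
```

Because the padded strings are used only as sort keys, the output keeps the addresses in their original, unpadded form.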
Answer 2 (score: 0)
import re
LOG = "spammers.txt"
IPV4 = re.compile(r"(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})")
RESULT = "organized_ips.txt"
def get_ips(fname):
    with open(fname) as inf:
        return IPV4.findall(inf.read())
def numeric_ip(ip):
    return [int(i) for i in ip.split(".")]
def write_to(fname, iterable, fmt):
    with open(fname, "w") as outf:
        for i in iterable:
            outf.write(fmt.format(i))
def main():
    ips = get_ips(LOG)
    ips = list(set(ips))  # uniquify
    ips.sort(key=numeric_ip)
    write_to(RESULT, ips, "IP: {}\n")
if __name__=="__main__":
    main()
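One caveat with the pattern above is that it also matches strings such as 999.1.1.1, which are not valid addresses. As a possible variation (not part of the answer itself), the standard-library `ipaddress` module, available from Python 3.3, can both validate the candidates and provide the numeric ordering. A minimal sketch, where `sorted_unique_ips` and the sample log text are made up for illustration:

```python
import ipaddress
import re

IPV4 = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

def sorted_unique_ips(text):
    """Return the distinct valid IPv4 addresses in text, in numeric order."""
    valid = set()
    for candidate in IPV4.findall(text):
        try:
            valid.add(ipaddress.IPv4Address(candidate))
        except ValueError:
            # Matches the regex but is not a real address (e.g. an octet > 255).
            continue
    # IPv4Address objects compare numerically, so no custom key is needed.
    return [str(ip) for ip in sorted(valid)]

log = """User Registration IP Address: 77.123.134.132
User IP Address for Selected Post: 177.43.168.35
Bogus: 999.1.1.1
Repeat: 77.123.134.132
"""
result = sorted_unique_ips(log)
print(result)
```

The set removes duplicates before sorting, which mirrors the `list(set(ips))` step in `main()` above.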