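I have this string in a log file: "IP 1.2.3.4 is currently trusted in the white list, but it is now using a new trusted certificate." What I need to do is find this message and extract the IP address (1.2.3.4) from the log file.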
import os
import shutil
import optparse
import sys

def main():
    # Read the whole log file into memory
    with open("messages", "r") as f:
        log_data = f.read()
    search_str = "is currently trusted in the white list, but it is now using a new trusted certificate."
    # find() returns the offset of the first occurrence, or -1 if there is none
    index = log_data.find(search_str)
    print(index)

if __name__ == '__main__':
    main()
How can I extract the IP address? Thanks for your replies.
Answer 0: (score: 5)
The answer is quite simple:
msg = "IP 1.2.3.4 is currently trusted in the white list, but it is now using a new trusted certificate."
# Split on the first two spaces only: ["IP", "1.2.3.4", "is currently ..."]
parts = msg.split(' ', 2)
print(parts[1])
Result:
1.2.3.4
You could also use a regular expression if you like, but for something this simple...
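To tie this back to the log file in the question, here is a minimal sketch that applies the same split to every line containing the message. It assumes each matching line really starts with "IP <address> ..."; if your log lines carry a prefix such as a timestamp, the second token will not be the address and a regular expression is the safer choice. The function name extract_ips_by_split and the sample lines are made up for illustration.

```python
SEARCH_STR = ("is currently trusted in the white list, "
              "but it is now using a new trusted certificate.")

def extract_ips_by_split(lines):
    """Return the second space-separated token of every line containing SEARCH_STR."""
    ips = []
    for line in lines:
        if SEARCH_STR in line:
            # "IP 1.2.3.4 is ..." -> ["IP", "1.2.3.4", "is ..."]
            parts = line.split(' ', 2)
            if len(parts) >= 2:
                ips.append(parts[1])
    return ips

# Hypothetical sample data standing in for the lines of the log file
sample = [
    "IP 1.2.3.4 is currently trusted in the white list, but it is now using a new trusted certificate.",
    "some unrelated line",
]
print(extract_ips_by_split(sample))  # ['1.2.3.4']
```

With a real file you could pass the open file object directly, since iterating over it yields lines.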
Answer 1: (score: 2)
There are many possible approaches, each with pros and cons that depend on the details of your log file. One example, using the re module:
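import re
x = "IP 1.2.3.4 is currently trusted in the white list, but it is now using a new trusted certificate."
# Raw string, so the regex reaches the re module exactly as written
pattern = r"IP ([0-9.]+) is currently trusted in the white list"
# re.match() anchors at the start of the string, which works here because x begins with "IP"
m = re.match(pattern, x)
for ip in m.groups():
    print(ip)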
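One detail that depends on your log file: the character class [0-9.]+ accepts anything made of digits and dots, and re.match() returns None when the line does not match. A rough sketch of how you might guard against both, assuming Python 3 (where the ipaddress module is part of the standard library); the helper name extract_valid_ip is hypothetical:

```python
import ipaddress
import re

PATTERN = re.compile(r"IP ([0-9.]+) is currently trusted in the white list")

def extract_valid_ip(line):
    """Return the captured address as a string, or None if absent or malformed."""
    m = PATTERN.match(line)
    if m is None:
        return None
    try:
        # ip_address() raises ValueError for text that is not a valid IP address
        return str(ipaddress.ip_address(m.group(1)))
    except ValueError:
        return None

print(extract_valid_ip("IP 1.2.3.4 is currently trusted in the white list, but ..."))  # 1.2.3.4
print(extract_valid_ip("IP 999.1.1 is currently trusted in the white list, but ..."))  # None
print(extract_valid_ip("no address here"))                                             # None
```

Whether the extra validation is worth it depends on how much you trust whatever writes the log.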
If you want to print every instance of that string in the log file, you can do the following:
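import re
# log_data is assumed to hold the full text of the log file
pattern = r"(IP [0-9.]+ is currently trusted in the white list, but it is now using a new trusted certificate\.)"
# findall() scans the whole text; re.match() would only test the very beginning
for match in re.findall(pattern, log_data):
    print(match)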
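If you only want the addresses rather than the whole message, you can move the capture group around the IP part. A small sketch along those lines; the log lines below, including the timestamp prefix, are invented sample data, and find_trusted_ips is a hypothetical helper name:

```python
import re

PATTERN = re.compile(
    r"IP ([0-9.]+) is currently trusted in the white list, "
    r"but it is now using a new trusted certificate\."
)

def find_trusted_ips(log_data):
    """Return every captured address found anywhere in log_data."""
    # With exactly one group in the pattern, findall() returns the group's text
    return PATTERN.findall(log_data)

# Made-up log content; real files will look different
log_data = (
    "Jan 1 00:00:01 host app: IP 1.2.3.4 is currently trusted in the white list, "
    "but it is now using a new trusted certificate.\n"
    "Jan 1 00:00:02 host app: unrelated message\n"
    "Jan 1 00:00:03 host app: IP 10.11.6.224 is currently trusted in the white list, "
    "but it is now using a new trusted certificate.\n"
)
print(find_trusted_ips(log_data))  # ['1.2.3.4', '10.11.6.224']
```

Because findall() searches instead of anchoring at the start, it does not matter what comes before "IP" on each line.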
Answer 2: (score: 1)
Use a regular expression.
Code like this:
import re
compiled = re.compile(r"""
    .*?                                  # Leading junk
    (?P<ipaddress>\d+\.\d+\.\d+\.\d+)    # IP address
    .*?                                  # Trailing junk
    """, re.VERBOSE)
# Named msg rather than str, to avoid shadowing the built-in type
msg = "IP 1.2.3.4 is currently trusted in the white list, but it is now using a new trusted certificate."
m = compiled.match(msg)
print(m.group("ipaddress"))
You get:
>>> import re
>>>
>>> compiled = re.compile(r"""
...     .*?                                  # Leading junk
...     (?P<ipaddress>\d+\.\d+\.\d+\.\d+)    # IP address
...     .*?                                  # Trailing junk
...     """, re.VERBOSE)
>>> msg = "IP 1.2.3.4 is currently trusted in the white list, but it is now using a new trusted certificate."
>>> m = compiled.match(msg)
>>> print(m.group("ipaddress"))
1.2.3.4
Also, I learned along the way that there is a dictionary of the named matches, groupdict():
>>> msg = "Peer 10.11.6.224 is currently trusted in the white list, but it is now using a new trusted certificate. Consider removing its likely outdated white list entry."
>>> m = compiled.match(msg)
>>> print(m.groupdict())
{'ipaddress': '10.11.6.224'}
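Later: fixed. The initial '.*' was eating the first character of your match, so I changed it to be non-greedy. For consistency (not out of necessity), I changed the trailing match as well.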
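To see the difference for yourself, here is a small comparison of the greedy and non-greedy versions of the leading junk, using a shortened copy of the "Peer" message. The variable names are just for illustration:

```python
import re

msg = "Peer 10.11.6.224 is currently trusted in the white list"
ip = r"(?P<ipaddress>\d+\.\d+\.\d+\.\d+)"

# Greedy: ".*" grabs as much as it can and only gives back what the IP group strictly needs
greedy = re.match(r".*" + ip, msg)
# Non-greedy: ".*?" stops at the first position where the IP group can match
lazy = re.match(r".*?" + ip, msg)

print(greedy.group("ipaddress"))  # 0.11.6.224
print(lazy.group("ipaddress"))    # 10.11.6.224
```

The greedy version still produces a match, which is why the bug is easy to miss: the address just comes back with its leading digit missing.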
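Answer 3: (score: 1)
Regular expressions are the way to go. But if you are not comfortable writing them, you can try a small parser I wrote (https://github.com/hgrecco/stringparser). It converts a string format into a regular expression. In your case, you would do the following: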
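from stringparser import Parser
parser = Parser("IP {} is currently trusted in the white list, but it is now using a new trusted certificate.")
# text is the log message to parse
text = "IP 1.2.3.4 is currently trusted in the white list, but it is now using a new trusted certificate."
ip = parser(text)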
If you have a file with multiple lines, you can replace the last line with:
with open("log.txt", "r") as fp:
    ips = [parser(line) for line in fp]
Good luck.