Question

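I'm trying to extract the IDs of certain network interfaces from one long line that contains multiple IDs. I tried using split, without success. Any help would be appreciated.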

Here is a sample of the input; keep in mind that it is all on a single line of text.

"Authentication success on Interface Gi1/0/20 AuditSessionID 0000000XXXXXXXXXX, Authentication success on Interface Gi1/0/24 AuditSessionID 0000000XXXXXXXXXX, Authentication not succeed on Interface Fi1/0/10 AuditSessionID 0000000XXXXXXXXXX"

I expect the output Gi1/0/20 Gi1/0/24 Fi1/0/10

Answer 1

A regular expression is a good fit for this task:

import re

text = 'Authentication success on Interface Gi1/0/20 AuditSessionID 0000000XXXXXXXXXX, Authentication success on Interface Gi1/0/24 AuditSessionID 0000000XXXXXXXXXX, Authentication not succeed on Interface Fi1/0/10 AuditSessionID 0000000XXXXXXXXXX'
re.findall('Interface (.*?) ', text)

re.findall() returns a list containing what you want:

['Gi1/0/20', 'Gi1/0/24', 'Fi1/0/10']

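The pattern 'Interface (.*?) ' matches everything that starts with the word "Interface", followed by a space, then something or nothing, then another space. The (.*?) stands for that something or nothing: it captures (i.e., adds to the output of re.findall()) whatever matches .*?, which is any character (.), any number of times (*), as few as possible to still get a match (?). You can experiment with regular expressions on sites such as https://regex101.com/, which lets you run Python regular expressions and explains them (better than I can).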
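
To make the role of the ? more concrete, here is a minimal sketch comparing the lazy pattern with a greedy one. The sample string is shortened and made up for illustration. It also shows \S+ as a possible alternative, which assumes interface names never contain whitespace; unlike the pattern above, it does not need a trailing space, so it still matches when the name is the last thing on the line.

```python
import re

# Shortened, made-up sample in the same shape as the question's input.
# Note that the second interface name is at the very end, with no trailing space.
text = 'ok on Interface Gi1/0/20 AuditSessionID 0000, failed on Interface Fi1/0/10'

# Lazy: stops at the first space after "Interface ", but needs that space to exist.
lazy = re.findall(r'Interface (.*?) ', text)

# Greedy: runs to the last space on the line, swallowing far too much.
greedy = re.findall(r'Interface (.*) ', text)

# Alternative (assumption: names contain no whitespace): one or more non-space characters.
non_space = re.findall(r'Interface (\S+)', text)

print(lazy)       # ['Gi1/0/20']
print(greedy)     # ['Gi1/0/20 AuditSessionID 0000, failed on Interface']
print(non_space)  # ['Gi1/0/20', 'Fi1/0/10']
```

On the input from the question every interface name is followed by a space, so the lazy pattern and the \S+ variant should give the same list there.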

Answer 2

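It isn't clear which property defines the pattern you want to extract, but here is a strict regular expression that matches an uppercase letter, then a lowercase letter, a digit, a slash, another digit, then a slash and two digits. If the input strings can contain repetitions or other characters, it can easily be extended to allow for them.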

import re

s = "Authentication success on Interface Gi1/0/20 AuditSessionID 0000000XXXXXXXXXX, Authentication success on Interface Gi1/0/24 AuditSessionID 0000000XXXXXXXXXX, Authentication not succeed on Interface Fi1/0/10 AuditSessionID 0000000XXXXXXXXXX"

print(re.findall(r"[A-Z][a-z]\d/\d/\d\d", s))

Output:

['Gi1/0/20', 'Gi1/0/24', 'Fi1/0/10']
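
As a rough sketch of what such an extension might look like, the version below allows several lowercase letters, multi-digit numbers, and any number of slash-separated parts. The sample string and the names Te10/1/2 and Fa0/1 are hypothetical, chosen only to exercise the looser pattern; adjust it to whatever your real interface names look like.

```python
import re

# Hypothetical input with interface names of varying shapes.
s = "Interface Gi1/0/20 up, Interface Te10/1/2 up, Interface Fa0/1 down"

# One uppercase letter, one or more lowercase letters, a number,
# then one or more "/number" parts. The group is non-capturing so that
# findall returns the whole match rather than just the last part.
pattern = re.compile(r"[A-Z][a-z]+\d+(?:/\d+)+")

interfaces = pattern.findall(s)
print(interfaces)  # ['Gi1/0/20', 'Te10/1/2', 'Fa0/1']
```

The word "Interface" itself is not picked up, because the pattern requires a digit right after the letters.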

Extracting specific strings from one long line
