Regex for getting an INT from a SUBSTRING

Time: 2017-10-16 19:33:00

Tags: regex python-3.x

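This code needs to grab the 2 IPv4 addresses and the packet length from a tcpdump line and turn them into a 2-D list. So far, the first half of my regex works for the IPv4 addresses. My problem comes down to grabbing the length. Here is the code:

import re

data = []
tcp_dump = "17:18:38.877517 IP 192.168.0.15.43471 > 23.195.155.202.443: Flags [.], ack 1623866279, win 245, options [nop,nop,TS val 43001536 ecr 287517202], length 0"
regex = r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(^length (\d+))'
data_ready = re.findall(regex, tcp_dump)
print(data_ready)
data.append(data_ready)
print(data)

The first print gives the output below, with the length missing, instead of the desired output: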

[('192.168.0.15', '', ''), ('23.195.155.202', '', '')]

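Is there any way to fix the regex?

Edit

So it turns out that the two halves of the regex work separately (just the first half or just the second half), but I can't seem to combine them.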
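
One likely reason the combined pattern never returns the length is the `^` anchor in front of `length`. As a minimal sketch (the variable names `line`, `anchored` and `unanchored` are illustrative, not from the question), the same sub-pattern can be tried with and without the anchor:

```python
import re

line = ("17:18:38.877517 IP 192.168.0.15.43471 > 23.195.155.202.443: "
        "Flags [.], ack 1623866279, win 245, "
        "options [nop,nop,TS val 43001536 ecr 287517202], length 0")

# With '^', "length" would have to sit at the very start of the string,
# so this pattern cannot match a tcpdump line that starts with a timestamp.
anchored = re.findall(r'^length (\d+)', line)

# Without the anchor, the same pattern finds the trailing length field.
unanchored = re.findall(r'length (\d+)', line)

print(anchored)    # []
print(unanchored)  # ['0']
```

Note also that `re.findall` returns one tuple per match with one slot per capturing group, which is why the unmatched groups show up as empty strings.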

2 answers:

Answer 0: (score: 0)

This should do it. You just need to make some of the parentheses non-capturing and do a bit of data cleaning.

import re
data = []

tcp_dump = "17:18:38.877517 IP 192.168.0.15.43471 > 23.195.155.202.443: Flags [.], ack 1623866279, win 245, options [nop,nop,TS val 43001536 ecr 287517202], length 0"

regex = r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(?:length (\d+))'

# Split the returned tuples into two sequences, one holding the IPs and the
# other holding the lengths. Finally, filter out the empty strings.
ips, lengths = zip(*re.findall(regex, tcp_dump))
list_data = [value for value in list(ips) + list(lengths) if value != '']
print(list_data)
data.append(list_data)

print(data)

Output:

['192.168.0.15', '23.195.155.202', '0']
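
As an alternative to unzipping and filtering the tuples, the matches can be collected one at a time. This is only a sketch, not the answerer's method: the group names `ip` and `length` and the variable `fields` are made up for illustration. It relies on `re.finditer` and on `Match.lastgroup`, which holds the name of the named group that took part in the match.

```python
import re

tcp_dump = ("17:18:38.877517 IP 192.168.0.15.43471 > 23.195.155.202.443: "
            "Flags [.], ack 1623866279, win 245, "
            "options [nop,nop,TS val 43001536 ecr 287517202], length 0")

# Two alternatives, each with exactly one named capturing group.
pattern = re.compile(r'(?P<ip>\d{1,3}(?:\.\d{1,3}){3})|length (?P<length>\d+)')

# For every match, lastgroup names the alternative that matched,
# so group(lastgroup) is the captured text with no empty strings to filter.
fields = [m.group(m.lastgroup) for m in pattern.finditer(tcp_dump)]

print(fields)  # ['192.168.0.15', '23.195.155.202', '0']
```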

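Answer 1: (score: 0)

I wouldn't call this IP address matching (since 192.168.0.15.43471 is not a valid IP address), but rather text parsing/processing.
An optimized solution using the re.search() function: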

import re

tcp_dump = "17:18:38.877517 IP 192.168.0.15.43471 > 23.195.155.202.443: Flags [.], ack 1623866279, win 245, options [nop,nop,TS val 43001536 ecr 287517202], length 0"
result = re.search(r'((?:\d{1,3}\.){3}\d{1,3})(?:\.\d+) > ((?:\d{1,3}\.){3}\d{1,3})(?:\.\d+).*length (\d+)$', tcp_dump)
result = list(result.groups())

print(result)

Output:

['192.168.0.15', '23.195.155.202', '0']
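
To get the 2-D list the question asks for, the same idea can be wrapped in a small helper and applied line by line. The sketch below makes several assumptions: `LINE_RE`, `parse_line` and the group names are hypothetical, the second sample line is invented for illustration, and converting the length to `int` is a choice rather than something either answer does. Matching on the literal `length ` keeps a greedy `.*` from swallowing all but the last digit of a multi-digit length, and the `None` check avoids an `AttributeError` on lines that do not match.

```python
import re

LINE_RE = re.compile(
    r'(?P<src>(?:\d{1,3}\.){3}\d{1,3})\.\d+ > '   # source IP, then its port
    r'(?P<dst>(?:\d{1,3}\.){3}\d{1,3})\.\d+:'     # destination IP, then its port
    r'.*length (?P<length>\d+)$'                  # trailing packet length
)

def parse_line(line):
    """Return [src_ip, dst_ip, length] for one line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return [match.group('src'), match.group('dst'), int(match.group('length'))]

lines = [
    "17:18:38.877517 IP 192.168.0.15.43471 > 23.195.155.202.443: Flags [.], "
    "ack 1623866279, win 245, options [nop,nop,TS val 43001536 ecr 287517202], length 0",
    # Hypothetical second line, invented for illustration only.
    "17:18:38.901234 IP 23.195.155.202.443 > 192.168.0.15.43471: Flags [P.], length 1448",
    "not a tcpdump line",
]

# Keep only the lines that parsed, giving one [src, dst, length] row per packet.
rows = [row for row in map(parse_line, lines) if row is not None]
print(rows)
```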