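I'm trying to write a script that goes through a text file, checks for specific content, and assigns it to a variable.
For example:
Text file contents: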
eth0 Link encap:Ethernet HWaddr 08:ee:27:ff:b3:d7
inet addr:10.0.2.45 Bcast:10.3.2.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe00:b3d7/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:16178 errors:0 dropped:0 overruns:0 frame:0
TX packets:8559 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14045795 (14.0 MB) TX bytes:1355632 (1.3 MB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:666 errors:0 dropped:0 overruns:0 frame:0
TX packets:666 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:72748 (72.7 KB) TX bytes:72748 (72.7 KB)
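I want to look up the value of 'RX packets' for interface eth0 and assign that value, '16178', to a variable. I need to be able to extract it from the specific 'eth0' block, not from any other interface.
Could you advise where to start?
Thanks.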
Answer 0 (score: 1)
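This is easy to do with a regex, as shown below. eth0.*? restricts the match to the data that follows eth0, RX packets: pins down the text that comes right before the number to extract, and (\d+) captures that number as a group. The re.DOTALL flag is needed so that . also matches newlines and the pattern can span several lines.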
>>> import re
>>> a="""eth0 Link encap:Ethernet HWaddr 08:ee:27:ff:b3:d7
... inet addr:10.0.2.45 Bcast:10.3.2.255 Mask:255.255.255.0
... inet6 addr: fe80::a00:27ff:fe00:b3d7/64 Scope:Link
... UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
... RX packets:16178 errors:0 dropped:0 overruns:0 frame:0
... TX packets:8559 errors:0 dropped:0 overruns:0 carrier:0
... collisions:0 txqueuelen:1000
... RX bytes:14045795 (14.0 MB) TX bytes:1355632 (1.3 MB)
...
... lo Link encap:Local Loopback
... inet addr:127.0.0.1 Mask:255.0.0.0
... inet6 addr: ::1/128 Scope:Host
... UP LOOPBACK RUNNING MTU:65536 Metric:1
... RX packets:666 errors:0 dropped:0 overruns:0 frame:0
... TX packets:666 errors:0 dropped:0 overruns:0 carrier:0
... collisions:0 txqueuelen:0
... RX bytes:72748 (72.7 KB) TX bytes:72748 (72.7 KB)"""
>>> re.search(r'eth0.*?RX packets:(\d+)',a,re.DOTALL).group(1)
'16178'
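As a minimal sketch of how the one-liner above could be wrapped for reuse: the helper name rx_packets and the shortened SAMPLE text are hypothetical and not part of the original answer. It also adds two things the answer does not have, both assumptions: the interface name is anchored to the start of a line with ^ and re.MULTILINE, and a missing match returns None instead of raising AttributeError.

```python
import re

# Shortened sample in the same format as the question's text file.
SAMPLE = """\
eth0 Link encap:Ethernet HWaddr 08:ee:27:ff:b3:d7
inet addr:10.0.2.45 Bcast:10.3.2.255 Mask:255.255.255.0
RX packets:16178 errors:0 dropped:0 overruns:0 frame:0
TX packets:8559 errors:0 dropped:0 overruns:0 carrier:0

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
RX packets:666 errors:0 dropped:0 overruns:0 frame:0
TX packets:666 errors:0 dropped:0 overruns:0 carrier:0
"""

def rx_packets(text, iface):
    """Return the RX packet count for iface as an int, or None if not found."""
    # ^ with re.MULTILINE pins the interface name to the start of a line;
    # re.DOTALL lets .*? run across newlines up to the first "RX packets:".
    pattern = r'^' + re.escape(iface) + r'\b.*?RX packets:(\d+)'
    m = re.search(pattern, text, re.DOTALL | re.MULTILINE)
    return int(m.group(1)) if m else None

print(rx_packets(SAMPLE, "eth0"))   # 16178
print(rx_packets(SAMPLE, "lo"))     # 666
print(rx_packets(SAMPLE, "wlan0"))  # None
```

One caveat that applies to this sketch and to the original pattern alike: if the eth0 block had no RX packets line, the lazy .*? would keep going and pick up the value from the next block.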
Answer 1 (score: 0)
You can use a regular expression to extract the value.
Try a pattern like this:
m = re.match(r"\W*RX packets[^:]*:(\d+)", line)
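In the regex, \d means a digit and + means one or more, so \d+ matches one or more digits in the text. The parentheses capture the digits that are found, and those digits must come right after the specific text RX packets:.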
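To make the role of the capture group concrete, here is a small sketch that applies the same pattern to single lines. The variable names and the leading spaces in the sample line are assumptions for illustration.

```python
import re

PATTERN = r"\W*RX packets[^:]*:(\d+)"

# A line in the shape of the question's data, with some leading whitespace.
line = "   RX packets:16178 errors:0 dropped:0 overruns:0 frame:0"
m = re.match(PATTERN, line)

print(repr(m.group(0)))  # whole match: '   RX packets:16178'
print(repr(m.group(1)))  # only the captured digits: '16178'

# Lines that do not start with "RX packets" give no match at all.
other = re.match(PATTERN, "TX packets:8559 errors:0 dropped:0")
print(other)  # None
```

Note that group(1) is a string; wrap it in int() if a number is needed.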
You can find more details about regular expressions in the official doc.
Your code would look like this:
data= """
eth0 Link encap:Ethernet HWaddr 08:ee:27:ff:b3:d7
inet addr:10.0.2.45 Bcast:10.3.2.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe00:b3d7/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:16178 errors:0 dropped:0 overruns:0 frame:0
TX packets:8559 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14045795 (14.0 MB) TX bytes:1355632 (1.3 MB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:666 errors:0 dropped:0 overruns:0 frame:0
TX packets:666 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:72748 (72.7 KB) TX bytes:72748 (72.7 KB)"""
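import re

def findSeq(block, data):
    # Becomes True once a line naming the requested block has been seen.
    isInRightBlock = False
    for line in data.splitlines():
        if block in line:
            isInRightBlock = True
        # Raw string so that \W and \d reach the regex engine unchanged.
        m = re.match(r"\W*RX packets[^:]*:(\d+)", line)
        if m and isInRightBlock:
            return m.group(1)
    return None  # block or RX packets line not found

res = findSeq("eth0", data)
print(res)  # Your value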
Output:
16178
Benchmark
from datetime import datetime
start_time_1 = datetime.now()
res= findSeq("eth0",data)
print('Duration: {}'.format(datetime.now() - start_time_1))
from datetime import datetime
start_time_2 = datetime.now()
re.search(r'eth0.*?RX packets:(\d+)',data,re.DOTALL).group(1)
print('Duration: {}'.format(datetime.now() - start_time_2))
Output:
Duration: 0:00:00.000547
Duration: 0:00:00.000344
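A single datetime-based measurement of a call that takes well under a millisecond is quite noisy, so the two numbers above should be read with caution. A hedged alternative is to repeat each call many times with the standard timeit module. The names find_seq, one_regex, and SAMPLE below are hypothetical stand-ins for the two approaches and the data; no timings are quoted because they depend on the machine.

```python
import re
import timeit

SAMPLE = """\
eth0 Link encap:Ethernet HWaddr 08:ee:27:ff:b3:d7
RX packets:16178 errors:0 dropped:0 overruns:0 frame:0
TX packets:8559 errors:0 dropped:0 overruns:0 carrier:0

lo Link encap:Local Loopback
RX packets:666 errors:0 dropped:0 overruns:0 frame:0
TX packets:666 errors:0 dropped:0 overruns:0 carrier:0
"""

def find_seq(block, data):
    # Line-by-line approach, same logic as findSeq above.
    in_block = False
    for line in data.splitlines():
        if block in line:
            in_block = True
        m = re.match(r"\W*RX packets[^:]*:(\d+)", line)
        if m and in_block:
            return m.group(1)
    return None

def one_regex(data):
    # Single multi-line regex, as in the other answers.
    return re.search(r'eth0.*?RX packets:(\d+)', data, re.DOTALL).group(1)

runs = 2000
t_loop = timeit.timeit(lambda: find_seq("eth0", SAMPLE), number=runs)
t_regex = timeit.timeit(lambda: one_regex(SAMPLE), number=runs)
print('line loop:    {:.2f} us per call'.format(t_loop / runs * 1e6))
print('single regex: {:.2f} us per call'.format(t_regex / runs * 1e6))
```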
Note: the way the function checks whether it is in the right block could be improved.
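One possible way to tighten the block check, as a sketch only: treat every line containing "Link encap:" as the header of a new block and compare its first token with the requested interface name. That header rule is an assumption based on the sample data, and find_rx, RX_RE, and the sample strings are hypothetical names. Unlike a plain substring test, the flag is reset when another block starts, so a block without an RX packets line does not pick up the next block's value.

```python
import re

RX_RE = re.compile(r"\s*RX packets:(\d+)")

SAMPLE = """\
eth0 Link encap:Ethernet HWaddr 08:ee:27:ff:b3:d7
RX packets:16178 errors:0 dropped:0 overruns:0 frame:0

lo Link encap:Local Loopback
RX packets:666 errors:0 dropped:0 overruns:0 frame:0
"""

def find_rx(block, data):
    """Return the RX packet count of the named block as an int, else None."""
    in_block = False
    for line in data.splitlines():
        if "Link encap:" in line:
            # Header line: the first token is the interface name.
            in_block = line.split()[0] == block
        elif in_block:
            m = RX_RE.match(line)
            if m:
                return int(m.group(1))
    return None

print(find_rx("eth0", SAMPLE))  # 16178
print(find_rx("lo", SAMPLE))    # 666

# eth0 has no RX packets line here, so the result is None rather than 666.
NO_RX = "eth0 Link encap:Ethernet\nlo Link encap:Local Loopback\nRX packets:666 errors:0\n"
print(find_rx("eth0", NO_RX))   # None
```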
Answer 2 (score: 0)
>>> re.findall(r'eth0.*?RX packets:(\d+)',x,re.DOTALL)
['16178']
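Since re.findall returns every match, the same idea can be stretched to collect the RX packet count of each interface in one pass. This is a sketch under the assumption that every block begins with a line of the form "name Link encap:..."; the names SAMPLE, pairs, and rx_by_iface are hypothetical.

```python
import re

SAMPLE = """\
eth0 Link encap:Ethernet HWaddr 08:ee:27:ff:b3:d7
RX packets:16178 errors:0 dropped:0 overruns:0 frame:0
TX packets:8559 errors:0 dropped:0 overruns:0 carrier:0

lo Link encap:Local Loopback
RX packets:666 errors:0 dropped:0 overruns:0 frame:0
TX packets:666 errors:0 dropped:0 overruns:0 carrier:0
"""

# Group 1 is the interface name at the start of a header line,
# group 2 is the first RX packets count that follows it.
pairs = re.findall(r'^(\S+) +Link encap:.*?RX packets:(\d+)',
                   SAMPLE, re.DOTALL | re.MULTILINE)
rx_by_iface = {name: int(count) for name, count in pairs}

print(rx_by_iface)          # {'eth0': 16178, 'lo': 666}
print(rx_by_iface["eth0"])  # 16178
```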