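I need to use regular expressions, re.findall() or the re.MULTILINE flag, to parse any numbers in this inventory log. Any suggestions? What I have so far is shown after the log sample.
Here is a sample of the inventory log: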
Processor : Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz (24 cores/threads)
Memory : 65493MB
Controller Slot : 0
BIOS : 3.0b 05/06/2014 3.2
mpt2sas0: LSISAS2308: FWVersion(16.00.01.00), ChipRevision(0x05), BiosVersion(07.33.00.00)
mpt2sas1: LSISAS2308: FWVersion(17.00.01.00), ChipRevision(0x05), BiosVersion(07.33.00.00)
compute node vpd: NA;NA;NA;
IPMI FW rev : 2.29
Chassis Type : Other
Chassis Part Number : CSE-927ETS-R000NDBP
Chassis Serial : C92700325A00092
Board Mfg Date : Sun Dec 31 19:00:00 1995
Board Mfg : Supermicro
Board Product : IPMI 2.0
Board Serial : OM13BS013020
Board Part Number : X9DBS-F(-2U)
Product Manufacturer : Supermicro
Product Name : IPMI 2.0
Product Part Number : SSG-2027B-DE2R24L
Product Version :
Product Serial : S13592923B11809
PCI Riser Card:
81:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2308 PCI-Express
Fusion-MPT SAS-2 (rev 05)
83:00.0 Ethernet controller: Chelsio Communications Inc T420-BT Unified Wire Ethernet
Controller
83:00.1 Ethernet controller: Chelsio Communications Inc T420-BT Unified Wire Ethernet
Controller
83:00.2 Ethernet controller: Chelsio Communications Inc T420-BT Unified Wire Ethernet
Controller
83:00.3 Ethernet controller: Chelsio Communications Inc T420-BT Unified Wire Ethernet
Controller
83:00.4 Ethernet controller: Chelsio Communications Inc T420-BT Unified Wire Ethernet
Controller
83:00.5 SCSI storage controller: Chelsio Communications Inc T420-BT Unified Wire Storage
Controller
83:00.6 Fibre Channel: Chelsio Communications Inc T420-BT Unified Wire Storage Controller
83:00.7 Ethernet controller: Chelsio Communications Inc Device 0000
-Hardware information
+Ethernet configuration
Chelsio T420-BT Card
version: 2.8.0.0
firmware-version: 1.9.23.0
plxnic0 00:10:b5:87:b0:01
eth3 00:07:43:15:fb:68
eth2 00:07:43:15:fb:60
eth1 00:25:90:8c:3a:23
eth0 00:25:90:8c:3a:22
bmc1 00:25:90:8c:15:2d 10.40.32.36
-Ethernet configuration
+Firmware Versions
, you are running a release image
This sdi release build was done by build on Sun Jul 13 2014 16:51:58
From /slave/jenkins/workspace/sdi_rls/dgcode
With git rev: e7fc81503edb567205e284cadef35f6bc5d0b7e6
import re
import sys
import warnings

# Warnings are suppressed only inside this block; filters are restored on exit.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")

sys.path.append("/home/build/sars")


def rescanips():
    data = {}
    # Split each line at its first colon into a key and a value.
    with open(sys.argv[1], 'r') as file_obj:
        for line in file_obj:
            if ':' in line:
                pos = line.index(':')
                data[line[:pos].strip()] = line[pos + 1:].strip()
    for key in data:
        print("%s : %s" % (key, data[key]))
        if key == "Processor":
            if data[key] != "Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz (24 cores/threads)":
                sys.stderr.write("Should be Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz (24 cores/threads) but it is " + data[key] + "\n")
        if key == "Memory":
            if data[key] != "81877MB":
                sys.stderr.write("Error in memory: should be 81877MB it is currently " + data[key] + "\n")
        if key == "Controller Slot":
            # The original test (!= "0" or "1") was always true.
            if data[key] not in ("0", "1"):
                sys.stderr.write("Invalid controller slot either should be 0 or 1 it is " + data[key] + "\n")
        if key == "BIOS":
            if data[key] != "3.0b 5/6/14 3.1":
                sys.stderr.write("The BIOS must be updated to 3.0b 5/6/14 3.1 it is currently " + data[key] + "\n")
        if key == "Canister Firmware":
            if data[key] != "3.5.0.20":
                sys.stderr.write("The Canister Firmware must be updated to 3.5.0.20 it is currently " + data[key] + "\n")


if __name__ == "__main__":
    rescanips()
Answer 0 (score: 0)
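Don't use regular expressions for parsing. Outside of the PCRE-style variants they cope poorly with context, and they are also slow when used like this. Have you considered using Ply (Python Lex-Yacc)? You are working your way toward rolling your own state machine, and therein lies pain.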
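I have a sample ply parser you can use to get started. Notice how the lexer.py file uses simple regular expressions to define tokens, which are then passed to the parser.py file. The parser defines the valid orderings of tokens and what to do with each set of tokens.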
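To make the lexer half concrete without depending on Ply itself, here is a minimal sketch that uses only the standard-library re module as a stand-in. The token names (NUMBER, WORD, COLON, SKIP, OTHER) and the tokenize helper are my own assumptions for illustration, not anything taken from the sample parser; the shape follows the tokenizer example in the Python re documentation.

```python
import re

# Hypothetical token names; each one is defined by a single simple regex.
# Order matters: earlier alternatives win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)*"),  # 65493, 1.90, 16.00.01.00
    ("WORD", r"[A-Za-z]+"),        # Memory, MB, GHz
    ("COLON", r":"),
    ("SKIP", r"\s+"),              # whitespace, dropped below
    ("OTHER", r"."),               # any other single character
]
TOKEN_RE = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(line):
    """Yield (token_type, text) pairs for one log line, skipping whitespace."""
    for match in TOKEN_RE.finditer(line):
        if match.lastgroup != "SKIP":
            yield match.lastgroup, match.group()


if __name__ == "__main__":
    for token in tokenize("Memory : 65493MB"):
        print(token)
```

A Ply lexer expresses the same idea with t_NAME rules, so a table like this should carry over fairly directly, though the details will differ.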
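This lets you run code both when the parser finds a match and when the lexer determines context. For example, the lexer finds a number and passes it to the parser. The parser sees that the number token follows a processor token and records it as relevant data.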
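The context rule in that example can be sketched by hand as follows. This is not a Ply grammar, only a plain loop that shows the idea of attaching a number to the key token that precedes it. The token stream, the token type names (KEY, WORD, NUMBER, NEWLINE), and the collect_numbers function are all hypothetical.

```python
# Hypothetical token stream, roughly what a lexer might emit for two log lines.
TOKENS = [
    ("KEY", "Processor"), ("WORD", "Intel"), ("NUMBER", "1.90"),
    ("NUMBER", "24"), ("NEWLINE", ""),
    ("KEY", "Memory"), ("NUMBER", "65493"), ("NEWLINE", ""),
]


def collect_numbers(tokens):
    """Group NUMBER tokens under the most recent KEY token on the same line."""
    result = {}
    current_key = None
    for kind, text in tokens:
        if kind == "KEY":
            current_key = text
            result.setdefault(current_key, [])
        elif kind == "NEWLINE":
            current_key = None  # context ends with the line
        elif kind == "NUMBER" and current_key is not None:
            result[current_key].append(text)
    return result


if __name__ == "__main__":
    print(collect_numbers(TOKENS))
```

In a real Ply parser this bookkeeping would live in grammar rules rather than in a loop, which is the part that gets painful to maintain by hand as the log format grows.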