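I have an XML document that is quite complex, at least for me, and it holds some information I need. I tried to work out how to read it with the lxml library but got stuck.
The XML document looks very much like this: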
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
<measCollecFile
xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
<fileHeader fileFormatVersion="32.435 V8.0.0"
vendorName="Nokia">
<fileSender
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
elementType="pgw instance 1" />
<measCollec beginTime="2019-05-14T12:00:01-03:00" />
</fileHeader>
<measData>
<managedElement
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
swVersion="C-10.0.R9" />
<measInfo measInfoId="KPISystemCP-ISA">
<granPeriod duration="PT300S" endTime="2019-05-14T12:05:01-03:00" />
<measType p="1">VS.avgCpuUtilization</measType>
<measType p="2">VS.avgMemoryUtilization</measType>
<measType p="3">VS.avgMemoryUtilization1M</measType>
<measType p="4">VS.SDFsFpUtilization</measType>
<measType p="5">VS.SDFsLcpUtilization</measType>
<measType p="6">VS.avgVmFpCpuNicUsage</measType>
<measType p="7">VS.avgVmFpCpuWorkerUsage</measType>
<measType p="8">VS.avgVmFpCpuSchedulerUsage</measType>
<measType p="9">VS.avgVmFpCpuCollapsedUsage</measType>
<measType p="10">VS.avgVmFpCpuCombinedUsage</measType>
<measType p="11">VS.hwCfgBitsInfo</measType>
<measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
<r p="1">1</r>
<r p="2">72</r>
<r p="3">72</r>
<r p="4">0.00</r>
<r p="5">0.00</r>
<r p="6">0.00</r>
<r p="7">0.05</r>
<r p="8">0.00</r>
<r p="9">0.00</r>
<r p="10">0.00</r>
<r p="11">4</r>
<suspect>false</suspect>
</measValue>
</measInfo>
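I would like to know how to access the value of VS.avgMemoryUtilization1M using Python.
By inspection I can see that the value of VS.avgMemoryUtilization1M is 72, but how can I read it from Python with the lxml library?
Answer 0 (score: 0)
You can use BeautifulSoup to parse the XML data (the advantages are that you can use CSS selectors, that it tolerates XML that is not well-formed, and so on):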
from bs4 import BeautifulSoup
data = ''' <?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
<measCollecFile
xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
<fileHeader fileFormatVersion="32.435 V8.0.0"
vendorName="Nokia">
<fileSender
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
elementType="pgw instance 1" />
<measCollec beginTime="2019-05-14T12:00:01-03:00" />
</fileHeader>
<measData>
<managedElement
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
swVersion="C-10.0.R9" />
<measInfo measInfoId="KPISystemCP-ISA">
<granPeriod duration="PT300S" endTime="2019-05-14T12:05:01-03:00" />
<measType p="1">VS.avgCpuUtilization</measType>
<measType p="2">VS.avgMemoryUtilization</measType>
<measType p="3">VS.avgMemoryUtilization1M</measType>
<measType p="4">VS.SDFsFpUtilization</measType>
<measType p="5">VS.SDFsLcpUtilization</measType>
<measType p="6">VS.avgVmFpCpuNicUsage</measType>
<measType p="7">VS.avgVmFpCpuWorkerUsage</measType>
<measType p="8">VS.avgVmFpCpuSchedulerUsage</measType>
<measType p="9">VS.avgVmFpCpuCollapsedUsage</measType>
<measType p="10">VS.avgVmFpCpuCombinedUsage</measType>
<measType p="11">VS.hwCfgBitsInfo</measType>
<measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
<r p="1">1</r>
<r p="2">72</r>
<r p="3">72</r>
<r p="4">0.00</r>
<r p="5">0.00</r>
<r p="6">0.00</r>
<r p="7">0.05</r>
<r p="8">0.00</r>
<r p="9">0.00</r>
<r p="10">0.00</r>
<r p="11">4</r>
<suspect>false</suspect>
</measValue>
</measInfo>'''
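soup = BeautifulSoup(data, 'xml')
# Find the measType whose text contains the counter name and read its index p.
# (:-soup-contains replaces the deprecated :contains in soupsieve 2.1+.)
p = soup.select_one('measType[p]:-soup-contains("VS.avgMemoryUtilization1M")')['p']
# The <r> element with the same p holds the value.
print('Value of `VS.avgMemoryUtilization1M`={}'.format(soup.select_one('r[p="{}"]'.format(p)).text))
Prints: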
Value of `VS.avgMemoryUtilization1M`=72
Answer 1 (score: 0)
Using Python's xml.etree.ElementTree:
import xml.etree.ElementTree as ET
import re
data = '''<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
<measCollecFile
xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
<fileHeader fileFormatVersion="32.435 V8.0.0"
vendorName="Nokia">
<fileSender
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
elementType="pgw instance 1" />
<measCollec beginTime="2019-05-14T12:00:01-03:00" />
</fileHeader>
<measData>
<managedElement
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
swVersion="C-10.0.R9" />
<measInfo measInfoId="KPISystemCP-ISA">
<granPeriod duration="PT300S" endTime="2019-05-14T12:05:01-03:00" />
<measType p="1">VS.avgCpuUtilization</measType>
<measType p="2">VS.avgMemoryUtilization</measType>
<measType p="3">VS.avgMemoryUtilization1M</measType>
<measType p="4">VS.SDFsFpUtilization</measType>
<measType p="5">VS.SDFsLcpUtilization</measType>
<measType p="6">VS.avgVmFpCpuNicUsage</measType>
<measType p="7">VS.avgVmFpCpuWorkerUsage</measType>
<measType p="8">VS.avgVmFpCpuSchedulerUsage</measType>
<measType p="9">VS.avgVmFpCpuCollapsedUsage</measType>
<measType p="10">VS.avgVmFpCpuCombinedUsage</measType>
<measType p="11">VS.hwCfgBitsInfo</measType>
<measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
<r p="1">1</r>
<r p="2">72</r>
<r p="3">72</r>
<r p="4">0.00</r>
<r p="5">0.00</r>
<r p="6">0.00</r>
<r p="7">0.05</r>
<r p="8">0.00</r>
<r p="9">0.00</r>
<r p="10">0.00</r>
<r p="11">4</r>
<suspect>false</suspect>
</measValue>
</measInfo>
</measData>
</measCollecFile>
'''
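# Strip the default namespace so the paths below can use bare tag names.
data = re.sub(r'\sxmlns="[^"]+"', '', data, count=1)
root = ET.fromstring(data)
# Find the measType by its text (Python 3.7+) and take its p value.
p_val = root.find(".//measType[.='VS.avgMemoryUtilization1M']").attrib['p']
# The <r> element with the same p holds the value.
print(root.find(".//r[@p='{}']".format(p_val)).text)
Output: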
72
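Since the question asks about lxml specifically, here is a minimal sketch of the same lookup with lxml's XPath support, keeping the namespace instead of stripping it. The helper name counter_values, the prefix "m", and the trimmed XML sample are assumptions made for illustration; only the namespace URI and the element names come from the file above. It also assumes that each r belongs to the measType with the same p inside the same measInfo, which is what the sample suggests.

```python
from lxml import etree

# Trimmed copy of the sample file. It is passed as bytes because lxml
# rejects str input that carries an encoding declaration.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<measCollecFile xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
  <measData>
    <measInfo measInfoId="KPISystemCP-ISA">
      <measType p="2">VS.avgMemoryUtilization</measType>
      <measType p="3">VS.avgMemoryUtilization1M</measType>
      <measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
        <r p="2">72</r>
        <r p="3">72</r>
      </measValue>
    </measInfo>
  </measData>
</measCollecFile>"""

# "m" is an arbitrary prefix chosen here; only the URI has to match the file.
NS = {"m": "http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec"}


def counter_values(root, counter):
    """Return {measObjLdn: value text} for the counter named `counter`."""
    result = {}
    # $counter and $p are XPath variables, filled from the keyword arguments.
    for meas_type in root.xpath(
        "//m:measInfo/m:measType[text()=$counter]", namespaces=NS, counter=counter
    ):
        meas_info = meas_type.getparent()
        # Only look at <r> elements inside the same measInfo block.
        for r in meas_info.xpath(
            "m:measValue/m:r[@p=$p]", namespaces=NS, p=meas_type.get("p")
        ):
            result[r.getparent().get("measObjLdn")] = r.text
    return result


root = etree.fromstring(XML)
values = counter_values(root, "VS.avgMemoryUtilization1M")
print(values)
```

Scoping the second query to the parent measInfo matters once a file contains several measInfo blocks, because the p indices start again from 1 in each of them. For a file on disk, etree.parse(path).getroot() can take the place of etree.fromstring(XML).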