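I have an XML file, shown below, that I am trying to parse with Python's minidom, but I am running into problems. I want to extract attribute values, such as the string attribute of <ManagedElementId> or of <associatedSite string = "Site=site00972"/>, but with no luck. I followed the Python minidom tutorials I found online and still could not get it to work, so I would appreciate help with how to do this. Here is my attempt: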
#!/usr/bin/python
import os
import xml.dom.minidom
from xml.dom import minidom
from xml.dom.minidom import parseString,parse
from xml.dom.minidom import Node
xmldoc = minidom.parse("proba.xml")
model= xmldoc.getElementsByTagName('ManagedElementId string = ')
for node in model:
    print(node.firstChild.nodeValue)
I expect to get the value between the quotation marks.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE Model SYSTEM "/opt/ericsson/arne/etc/arne12_2.dtd">
<Model version = "1" importVersion = "12.2">
<!--Validate: /opt/ericsson/arne/bin/import.sh -f 4_siu_create.xml \ -val:rall -->
<Create>
<SubNetwork userLabel = "ZLNOUR_SIU" networkType = "IPRAN">
<ManagedElement sourceType = "SIU">
<ManagedElementId string = "siu009722"/>
<primaryType type = "STN"/>
<managedElementType types = ""/>
<associatedSite string = "Site=site00972"/>
<nodeVersion string = "T11A"/>
<platformVersion string = ""/>
<swVersion string = ""/>
<vendorName string = ""/>
<userDefinedState string = ""/>
<managedServiceAvailability int = "1"/>
<isManaged boolean = "true"/>
<connectionStatus string = "OFF"/>
<Connectivity>
<DEFAULT>
<emUrl url = "http://10.131.203.117:80/"/>
<ipAddress string = "10.131.203.117"/>
<oldIpAddress string = "int dummy=0"/>
<hostname string = ""/>
<nodeSecurityState state = "ON"/>
<boardId string = ""/>
<Protocol number = "0">
<protocolType string = "SNMP"/>
<port int = "161"/>
<protocolVersion string = "v2c"/>
<securityName string = ""/>
<authenticationMethod string = ""/>
<encryptionMethod string = ""/>
<communityString string = "public"/>
<context string = ""/>
<namingUrl string = ""/>
<namingPort int = ""/>
<notificationIRPAgentVersion string = ""/>
<alarmIRPAgentVersion string = ""/>
<notificationIRPNamingContext context = ""/>
<alarmIRPNamingContext context = ""/>
</Protocol>
<Protocol number = "1">
<protocolType string = "SSH"/>
<port int = "22"/>
<protocolVersion string = ""/>
<securityName string = ""/>
<authenticationMethod string = ""/>
<encryptionMethod string = ""/>
<communityString string = ""/>
<context string = ""/>
<namingUrl string = ""/>
<namingPort int = ""/>
<notificationIRPAgentVersion string = ""/>
<alarmIRPAgentVersion string = ""/>
<notificationIRPNamingContext context = ""/>
<alarmIRPNamingContext context = ""/>
</Protocol>
<Browser>
<browser string = ""/>
<browserURL string = ""/>
<bookname string = ""/>
</Browser>
</DEFAULT>
</Connectivity>
<Tss>
<Entry>
<System string = "siu009722"/>
<Type string = "NORMAL"/>
<User string = "admin"/>
<Password string = "siu009722"/>
</Entry>
<Entry>
<System string = "siu009722"/>
<Type string = "SECURE"/>
<User string = "admin"/>
<Password string = "siu009722"/>
</Entry>
</Tss>
<Relationship>
<AssociableNode TO_FDN = "FtpServer=SMRSSLAVE-rtwaned1o,FtpService=swstore-rtwaned1o" AssociationType = "ManagedElement_to_ftpSwStore"/>
<AssociableNode TO_FDN = "FtpServer=SMRSSLAVE-rtwaned1o,FtpService=cmdown-rtwaned1o" AssociationType = "ManagedElement_to_neTransientCmDown"/>
<AssociableNode TO_FDN = "FtpServer=SMRSSLAVE-rtwaned1o,FtpService=cmup-rtwaned1o" AssociationType = "ManagedElement_to_neTransientCmUp"/>
<AssociableNode TO_FDN = "FtpServer=SMRSSLAVE-rtwaned1o,FtpService=pmup-rtwaned1o" AssociationType = "ManagedElement_to_neTransientPm"/>
<AssociableNode TO_FDN = "ManagementNode=ONRM" AssociationType = "MgmtAssociation"/>
<AssociableNode TO_FDN = "SubNetwork=ZLNOUR3,MeContext=rbs009721,ManagedElement=1,NodeBFunction=1" FROM_FDN = "SubNetwork=ZLNOUR_SIU,ManagedElement=siu009722,StnFunction=STN_ManagedFunction" AssociationType = "StnFunction_to_NodeBFunction"/>
</Relationship>
</ManagedElement>
</SubNetwork>
</Create>
</Model>
Answer 0 (score: 2)
You are including the attribute name in the tag name:
model= xmldoc.getElementsByTagName('ManagedElementId string = ')
The string = part is not part of the tag name, and your document contains no tag with that name. Remove the string = part:
>>> from xml.dom import minidom
>>> tree = minidom.parseString(sample)
>>> tree.getElementsByTagName('ManagedElementId')
[<DOM Element: ManagedElementId at 0x1080baef0>]
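To see why the original loop printed nothing, here is a minimal sketch. The inline sample string is a trimmed stand-in for proba.xml (an assumption for illustration, not the full file). getElementsByTagName matches on element names only, so a name that contains attribute text matches nothing and the for loop body never runs.

```python
from xml.dom import minidom

# Trimmed stand-in for proba.xml (assumption: only the relevant elements are kept).
sample = """<Model version="1">
  <ManagedElement sourceType="SIU">
    <ManagedElementId string="siu009722"/>
    <associatedSite string="Site=site00972"/>
  </ManagedElement>
</Model>"""

tree = minidom.parseString(sample)

# The tag name from the question: no element is called this, so nothing matches.
wrong = tree.getElementsByTagName('ManagedElementId string = ')

# The bare element name matches the single <ManagedElementId/> element.
right = tree.getElementsByTagName('ManagedElementId')

print(len(wrong), len(right))
```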
This element has no child nodes; it only carries an attribute value:
>>> node = tree.getElementsByTagName('ManagedElementId')[0]
>>> node.firstChild is None
True
>>> node.getAttribute('string')
u'siu009722'
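Put together as a script, the same idea might look like the sketch below. The helper name attribute_values and the inline sample are hypothetical, introduced only for illustration; for the real file you would presumably call minidom.parse("proba.xml") instead of parseString.

```python
from xml.dom import minidom

# Trimmed stand-in for proba.xml (assumption: only the relevant elements are kept).
sample = """<Model version="1">
  <ManagedElement sourceType="SIU">
    <ManagedElementId string="siu009722"/>
    <associatedSite string="Site=site00972"/>
  </ManagedElement>
</Model>"""


def attribute_values(doc, tag_name, attr_name):
    """Hypothetical helper: collect attr_name from every <tag_name> element."""
    return [element.getAttribute(attr_name)
            for element in doc.getElementsByTagName(tag_name)]


doc = minidom.parseString(sample)  # for the file: minidom.parse("proba.xml")

element_ids = attribute_values(doc, 'ManagedElementId', 'string')
sites = attribute_values(doc, 'associatedSite', 'string')

print(element_ids)
print(sites)
```

Note that getAttribute returns an empty string when the attribute is missing, so an empty result does not distinguish string = "" from no string attribute at all; hasAttribute can be used if that difference matters.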
That said, I strongly recommend moving away from the XML DOM; you are better off with the simpler ElementTree API:
>>> from xml.etree import ElementTree as ET
>>> tree = ET.fromstring(sample)
>>> tree.find('.//ManagedElementId')
<Element 'ManagedElementId' at 0x1080af950>
>>> tree.find('.//ManagedElementId').get('string')
'siu009722'
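As a slightly fuller sketch of the ElementTree approach, the snippet below reads both attributes from the question and also walks the Protocol elements. The inline sample is a trimmed stand-in for proba.xml (an assumption for illustration); for the real file, ET.parse("proba.xml").getroot() should give the same kind of root element.

```python
from xml.etree import ElementTree as ET

# Trimmed stand-in for proba.xml (assumption: only the relevant elements are kept).
sample = """<Model version="1">
  <ManagedElement sourceType="SIU">
    <ManagedElementId string="siu009722"/>
    <associatedSite string="Site=site00972"/>
    <Connectivity>
      <DEFAULT>
        <Protocol number="0"><protocolType string="SNMP"/></Protocol>
        <Protocol number="1"><protocolType string="SSH"/></Protocol>
      </DEFAULT>
    </Connectivity>
  </ManagedElement>
</Model>"""

root = ET.fromstring(sample)

# './/name' searches all descendants; .get() reads an attribute (None if absent).
element_id = root.find('.//ManagedElementId').get('string')
site = root.find('.//associatedSite').get('string')

# Map each Protocol's number attribute to the string attribute of its protocolType child.
protocols = {
    protocol.get('number'): protocol.find('protocolType').get('string')
    for protocol in root.iter('Protocol')
}

print(element_id, site, protocols)
```

find returns None when nothing matches, so real code would likely check the result before calling .get on it.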