I have an XML file. I am trying to read it the usual way, as shown below:
from xml.dom import minidom

def xmlfilereadread(self, path):
    doc = minidom.parse(path)
    Account = doc.getElementsByTagName("sf:ReceiverSet")[0]
    num = Account.getAttribute('totalNo')
    aList = []
    for i in range(int(num)):
        print(i)
        AccountReference = doc.getElementsByTagName("sf:Receiver")[i]
But I need to use pandas instead of this code. How can I read the data? My sample XML is:
<?xml version="1.0" encoding="UTF-8"?>
<sf:IFile xmlns:sf="http://www.canadapost.ca/smartflow" sequenceNo="10">
  <sf:ReceiverSet documentTypes="TAXBILL" organization="lincolntax" totalNo="3">
    <sf:Receiver sequenceNo="1" correlationID="1114567890123456789">
      <sf:AccountReference>11145678901234567891111</sf:AccountReference>
      <sf:SubscriptionAuth>
        <sf:ParamSet>
          <sf:Param name="auth1">1114567890123456789</sf:Param>
          <sf:Param name="auth2">CARTER, JOE</sf:Param>
        </sf:ParamSet>
      </sf:SubscriptionAuth>
    </sf:Receiver>
    <sf:Receiver sequenceNo="2" correlationID="2224567890123456789">
      <sf:AccountReference>22245678901234567892222</sf:AccountReference>
      <sf:SubscriptionAuth>
        <sf:ParamSet>
          <sf:Param name="auth1">2224567890123456789</sf:Param>
          <sf:Param name="auth2">DOE, JANE</sf:Param>
        </sf:ParamSet>
      </sf:SubscriptionAuth>
    </sf:Receiver>
    <sf:Receiver sequenceNo="3" correlationID="3334567890123456789">
      <sf:AccountReference>33345678901234567893333</sf:AccountReference>
      <sf:SubscriptionAuth>
        <sf:ParamSet>
          <sf:Param name="auth1">3334567890123456789</sf:Param>
          <sf:Param name="auth2">SOZE, KEYSER</sf:Param>
        </sf:ParamSet>
      </sf:SubscriptionAuth>
    </sf:Receiver>
  </sf:ReceiverSet>
</sf:IFile>
Answer 0 (score: 0)
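XML is an inherently hierarchical data format, and the most natural way to represent it is with a tree. ET (xml.etree.ElementTree) has two classes for this purpose: ElementTree represents the whole XML document as a tree, and Element represents a single node in this tree. Interactions with the whole document (reading from and writing to files) are usually done at the ElementTree level. Interactions with a single XML element and its sub-elements are done at the Element level.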
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
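Since the question asks for pandas, here is a minimal sketch of how the Element-level API could turn the sample file into one flat record per receiver. The helper name `receivers_to_rows` and the `SAMPLE` string (the question's XML trimmed to two receivers) are illustrative assumptions, not part of the original answer; the namespace URI is copied from the sample's root element.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the xmlns:sf attribute in the sample file.
NS = {"sf": "http://www.canadapost.ca/smartflow"}

# Trimmed copy of the question's sample (assumption: two receivers only).
SAMPLE = """<sf:IFile xmlns:sf="http://www.canadapost.ca/smartflow" sequenceNo="10">
  <sf:ReceiverSet documentTypes="TAXBILL" organization="lincolntax" totalNo="2">
    <sf:Receiver sequenceNo="1" correlationID="1114567890123456789">
      <sf:AccountReference>11145678901234567891111</sf:AccountReference>
      <sf:SubscriptionAuth><sf:ParamSet>
        <sf:Param name="auth1">1114567890123456789</sf:Param>
        <sf:Param name="auth2">CARTER, JOE</sf:Param>
      </sf:ParamSet></sf:SubscriptionAuth>
    </sf:Receiver>
    <sf:Receiver sequenceNo="2" correlationID="2224567890123456789">
      <sf:AccountReference>22245678901234567892222</sf:AccountReference>
      <sf:SubscriptionAuth><sf:ParamSet>
        <sf:Param name="auth1">2224567890123456789</sf:Param>
        <sf:Param name="auth2">DOE, JANE</sf:Param>
      </sf:ParamSet></sf:SubscriptionAuth>
    </sf:Receiver>
  </sf:ReceiverSet>
</sf:IFile>"""


def receivers_to_rows(root):
    """Hypothetical helper: return one dict per <sf:Receiver> under root."""
    rows = []
    for receiver in root.iterfind(".//sf:Receiver", NS):
        # Start with the element's attributes (sequenceNo, correlationID).
        row = dict(receiver.attrib)
        row["AccountReference"] = receiver.findtext(
            "sf:AccountReference", namespaces=NS
        )
        # Flatten each <sf:Param name="..."> into its own column.
        for param in receiver.iterfind(".//sf:Param", NS):
            row[param.get("name")] = param.text
        rows.append(row)
    return rows


# For a file on disk, use ET.parse(path).getroot() instead of fromstring().
root = ET.fromstring(SAMPLE)
rows = receivers_to_rows(root)
print(rows[0])
```

A list of dicts like `rows` can be passed straight to `pandas.DataFrame(rows)`, which should give one row per receiver with the dict keys as columns. Newer pandas versions (1.3 and later) also provide `pandas.read_xml`, which accepts `xpath` and `namespaces` arguments, though it may not flatten the nested `sf:Param` elements the way this sketch does.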
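Alternatively, you can use lxml:
from lxml import etree
# parse() returns an ElementTree; call tree.getroot() to get the root element
tree = etree.parse(r'local-path-to-.xml')
print(etree.tostring(tree))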