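First of all, I'm an R programmer. My team needs my R script translated to Python, so that it extracts some data from an XML file and converts it to JSON.
Following the documentation, and in particular this answer:
I tried the following:
Option 1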
import xml.etree.ElementTree
e = xml.etree.ElementTree.parse('boleta1A.xml').getroot()
for atype in e.findall('cbc:ID'):
    print(atype.text)
This prints nothing; I get no results.
Option 2
import xml.etree.ElementTree as ET
tree = ET.parse('boleta1A.xml')
root = tree.getroot()
root.findall("./sac:AdditionalMonetaryTotal/cbc:ID").text
Here, I'm getting:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-29-2eefd6f96456>", line 5, in <module>
root.findall("./sac:AdditionalMonetaryTotal/cbc:ID").text
File "C:\ProgramData\Anaconda3\lib\xml\etree\ElementPath.py", line 304, in findall
return list(iterfind(elem, path, namespaces))
File "C:\ProgramData\Anaconda3\lib\xml\etree\ElementPath.py", line 283, in iterfind
token = next()
File "C:\ProgramData\Anaconda3\lib\xml\etree\ElementPath.py", line 83, in xpath_tokenizer
raise SyntaxError("prefix %r not found in prefix map" % prefix)
File "<string>", line unknown
SyntaxError: prefix 'sac' not found in prefix map
Here I think I need to add the namespaces, but I can't work out from the documentation why they are needed or what exactly to pass:
https://docs.python.org/2.7/library/xml.etree.elementtree.html#parsing-xml-with-namespaces
The XML file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:ext="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2" xmlns:qdt="urn:oasis:names:specification:ubl:schema:xsd:QualifiedDatatypes-2" xmlns:sac="urn:sunat:names:specification:ubl:peru:schema:xsd:SunatAggregateComponents-1" xmlns:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 ..\xsd\maindoc\UBLPE-Invoice-2.0.xsd" xmlns:udt="urn:un:unece:uncefact:data:specification:UnqualifiedDataTypesSchemaModule:2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ext:UBLExtensions>
<ext:UBLExtension>
<ext:ExtensionContent>
<sac:AdditionalInformation>
<sac:AdditionalMonetaryTotal>
<cbc:ID>1001</cbc:ID>
<cbc:PayableAmount currencyID="PEN">388.3</cbc:PayableAmount>
</sac:AdditionalMonetaryTotal>
<sac:AdditionalProperty>
<cbc:ID>1000</cbc:ID>
<cbc:Value><![CDATA[CUATROCIENTOS SESENTA Y UN CON 56 /100 NUEVOS SOLES]]></cbc:Value>
</sac:AdditionalProperty>
</sac:AdditionalInformation>
</ext:ExtensionContent>
</ext:UBLExtension>
</ext:UBLExtensions>
</Invoice>
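For reference, this is my reading of what the namespace section of the docs seems to describe: a dict mapping each prefix to its URI, passed as the second argument to `find`/`findall`. It is only a sketch under my own assumptions. The name `NS`, the inlined and trimmed copy of the XML, and the keys of the output dict are mine, not something taken from the docs or from my R script.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the invoice above, inlined so the sketch runs on its own.
XML = """
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
         xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
         xmlns:ext="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2"
         xmlns:sac="urn:sunat:names:specification:ubl:peru:schema:xsd:SunatAggregateComponents-1">
  <ext:UBLExtensions><ext:UBLExtension><ext:ExtensionContent>
    <sac:AdditionalInformation>
      <sac:AdditionalMonetaryTotal>
        <cbc:ID>1001</cbc:ID>
        <cbc:PayableAmount currencyID="PEN">388.3</cbc:PayableAmount>
      </sac:AdditionalMonetaryTotal>
    </sac:AdditionalInformation>
  </ext:ExtensionContent></ext:UBLExtension></ext:UBLExtensions>
</Invoice>
"""

# Prefix -> URI map (hypothetical name NS); the URIs are copied from the
# xmlns:* attributes on the root element of the file.
NS = {
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
    "sac": "urn:sunat:names:specification:ubl:peru:schema:xsd:SunatAggregateComponents-1",
}

root = ET.fromstring(XML)

totals = []
# ".//" searches at any depth; a bare "cbc:ID" would only look at direct
# children of the root, which may be why Option 1 finds nothing.
for total in root.findall(".//sac:AdditionalMonetaryTotal", NS):
    amount = total.find("cbc:PayableAmount", NS)
    totals.append({
        "id": total.findtext("cbc:ID", namespaces=NS),
        "amount": amount.text,
        "currency": amount.get("currencyID"),
    })

# Serialize the extracted values to a JSON string.
print(json.dumps(totals))
```

Note that `findall` returns a list, so calling `.text` on its result directly (as in Option 2) would presumably fail even once the prefix map is supplied.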
Also: I'm using Jupyter Notebook. Would you recommend it? Or is there something in the Python world that is more similar to RStudio?
Thanks!