Parsing an ARXML file with Python

Date: 2019-08-13 15:03:38

Tags: python xml xpath xml-parsing

I have a file that I need to parse to extract some information. The order in which the results come out matters.

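  1. I can parse the file and get the information, but I cannot display it in the order it appears in the file.
  2. How do I parse the information in <MSR-QUERY-ARG SI="HtmlAnchor">?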

By the way, where can I upload an ARXML file?

File download: ARXML-FILE

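from xml.etree import ElementTree as ET
import csv

fpath = "test.arxml"

tree = ET.parse(fpath)
root = tree.getroot()

# Map a prefix to the AUTOSAR namespace so the paths below stay readable
ns = {'ns':'http://autosar.org/schema/r4.0'}

# First pass: print the SHORT-NAME of every TRACE directly under a CHAPTER
for arpackage in tree.findall('.//ns:CHAPTER/ns:TRACE',namespaces=ns):
    print(arpackage.findall('.//ns:SHORT-NAME', namespaces=ns)[0].text)

# Second pass: print the first MSR-QUERY-ARG of every MSR-QUERY-P-1.
# Because this is a separate loop, its output is not interleaved with the names above.
for arpackage in tree.findall('.//ns:CHAPTER/ns:MSR-QUERY-P-1', namespaces=ns):
    print(arpackage.findall('.//ns:MSR-QUERY-ARG', namespaces=ns)[0].text)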

1 answer:

Answer 0 (score: 0)

Another approach.

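from simplified_scrapy import SimplifiedDoc, utils
# Read the whole file as text and wrap it in a SimplifiedDoc
html = utils.getFileContent('test.arxml')
doc = SimplifiedDoc(html)
# Text of the SHORT-NAME elements inside each TRACE
names = doc.selects('TRACE').selects('SHORT-NAME>text()')
# Text of the MSR-QUERY-ARG with SI="HtmlAnchor" inside each MSR-QUERY-P-1
msrs = doc.selects('MSR-QUERY-P-1').select('MSR-QUERY-ARG@SI="HtmlAnchor">text()')
print(names)
print(msrs)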

Result:

[['S_001'], ['S_002'], ['S_003'], ['S_004'], ['S_005'], ['S_006'], ['S_007'], ['S_008'], ['S_009'], ['S_010'], ['S_011'], ['S_012'], ['S_013'], ['S_014'], ['S_015'], ['S_016'], ['S_017'], ['S_018'], ['S_019'], ['S_020'], ['S_021'], ['S_022'], ['S_023'], ['S_024'], ['S_025'], ['S_026'], ['S_027'], ['S_028'], ['S_029'], ['S_030'], ['S_031'], ['S_032'], ['S_033'], ['S_034'], ['S_035'], ['S_036'], ['S_037'], ['S_038'], ['S_039']]
['AAA_001', 'AAA_002', 'AAA_003']

There are more examples, including parsing and updating: https://github.com/yiyedata/simplified-scrapy-demo/tree/master/doc_examples
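
For comparison, both points in the question (output in file order, and selecting on SI="HtmlAnchor") can probably be handled with the standard library alone. What follows is only a minimal sketch under stated assumptions: the inline SAMPLE document is hypothetical, since the real test.arxml is not shown, and only the element names, the SI attribute and the namespace URI are taken from the question. The helper name items_in_order is made up for illustration. The idea is to walk the tree once with iter(), which visits elements in document order, rather than running two separate findall() loops.

```python
from xml.etree import ElementTree as ET

# Hypothetical stand-in for test.arxml; the real file layout is an assumption.
SAMPLE = """<AUTOSAR xmlns="http://autosar.org/schema/r4.0">
  <CHAPTER>
    <TRACE><SHORT-NAME>S_001</SHORT-NAME></TRACE>
    <MSR-QUERY-P-1>
      <MSR-QUERY-ARG SI="HtmlAnchor">AAA_001</MSR-QUERY-ARG>
      <MSR-QUERY-ARG SI="Other">ignored</MSR-QUERY-ARG>
    </MSR-QUERY-P-1>
    <TRACE><SHORT-NAME>S_002</SHORT-NAME></TRACE>
  </CHAPTER>
</AUTOSAR>"""

NS = {"ns": "http://autosar.org/schema/r4.0"}
TRACE_TAG = "{%s}TRACE" % NS["ns"]
ARG_TAG = "{%s}MSR-QUERY-ARG" % NS["ns"]


def items_in_order(root):
    """Return (kind, text) pairs in the order they occur in the document."""
    items = []
    for elem in root.iter():  # depth-first traversal, i.e. document order
        if elem.tag == TRACE_TAG:
            name = elem.find(".//ns:SHORT-NAME", NS)
            if name is not None:
                items.append(("trace", (name.text or "").strip()))
        elif elem.tag == ARG_TAG and elem.get("SI") == "HtmlAnchor":
            items.append(("anchor", (elem.text or "").strip()))
    return items


root = ET.fromstring(SAMPLE)  # for a file on disk: ET.parse("test.arxml").getroot()

for kind, text in items_in_order(root):
    print(kind, text)

# If order does not matter, an attribute predicate selects the same elements.
anchors = [a.text for a in root.findall(".//ns:MSR-QUERY-ARG[@SI='HtmlAnchor']", NS)]
print(anchors)
```

Note that this sketch matches TRACE anywhere in the tree, whereas the question's path only matches TRACE elements that are direct children of a CHAPTER; if that distinction matters for the real file, the loop would need an extra check on the parent.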