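I am trying to parse an AUTOSAR-specific arxml file (an XML-like format) with Python, but I cannot read the contents of the file. I want to get the DEFINITION-REF values defined inside multiple ECUC-CONTAINER-VALUE tags, for example:
/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression/BswMArgumentRef
I have tried several approaches, but nothing gets printed.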
from bs4 import BeautifulSoup as Soup

def parseArxml():
    handler = open('input.arxml').read()
    soup = Soup(handler,"html.parser")
    for ecuc_container in soup.findAll('ECUC-CONTAINER-VALUE'):
        print(ecuc_container)

if __name__ == "__main__":
    parseArxml()
Here is part of the arxml file:
<?xml version="1.0" encoding="UTF-8"?>
<AUTOSAR xmlns="http://autosar.org/schema/r4.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://autosar.org/schema/r4.0 autosar_4-2-1.xsd">
<ECUC-CONTAINER-VALUE UUID="c112c504-e546-41c3-abf9-0aaf06b18284">
<SHORT-NAME>BswMLogicalExpression_2</SHORT-NAME>
<DEFINITION-REF DEST="ECUC-PARAM-CONF-CONTAINER-DEF">/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression</DEFINITION-REF>
<REFERENCE-VALUES>
<ECUC-REFERENCE-VALUE>
<DEFINITION-REF DEST="ECUC-CHOICE-REFERENCE-DEF">/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression/BswMArgumentRef</DEFINITION-REF>
<VALUE-REF DEST="ECUC-CONTAINER-VALUE">/ARRoot/BswM_0/BswMConfig_0/BswMArbitration_0/BswMModeCondition_2</VALUE-REF>
</ECUC-REFERENCE-VALUE>
</REFERENCE-VALUES>
</ECUC-CONTAINER-VALUE>
<ECUC-CONTAINER-VALUE UUID="c112c504-e546-41c3-abf9-0aaf06b18284">
<SHORT-NAME>BswMLogicalExpression_3</SHORT-NAME>
<DEFINITION-REF DEST="ECUC-PARAM-CONF-CONTAINER-DEF">/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression</DEFINITION-REF>
<REFERENCE-VALUES>
<ECUC-REFERENCE-VALUE>
<DEFINITION-REF DEST="ECUC-CHOICE-REFERENCE-DEF">/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression/BswMArgumentRef</DEFINITION-REF>
<VALUE-REF DEST="ECUC-CONTAINER-VALUE">/ARRoot/BswM_2/BswMConfig_2/BswMArbitration_2/BswMModeCondition_3</VALUE-REF>
</ECUC-REFERENCE-VALUE>
</REFERENCE-VALUES>
</ECUC-CONTAINER-VALUE>
</AUTOSAR>
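Answer 0: (score: 0)
It looks like the parser you are using with BeautifulSoup (html.parser) converts the tag names to lowercase, so search with lowercase names.
You should do this: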
from bs4 import BeautifulSoup as Soup

def parseArxml():
    # read the whole file and close it again when done
    with open('input.arxml') as f:
        handler = f.read()
    soup = Soup(handler, "html.parser")
    # html.parser lowercases tag names, so search with lowercase names
    for ecuc_container in soup.find_all('ecuc-container-value'):
        for def_ref in ecuc_container.find_all('definition-ref'):
            print(def_ref.get_text())

if __name__ == "__main__":
    parseArxml()
Output:
/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression
/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression/BswMArgumentRef
/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression
/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression/BswMArgumentRef
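The lowercasing described in this answer can be reproduced without BeautifulSoup. The sketch below assumes that BeautifulSoup's "html.parser" option is backed by Python's standard html.parser module, which reports tag names in lowercase; TagCollector and collect_tags are hypothetical names used only for this illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records start-tag names exactly as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # The parser has already lowercased the tag name at this point.
        self.tags.append(tag)


def collect_tags(markup):
    """Return the start-tag names found in markup, in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


sample = "<ECUC-CONTAINER-VALUE><SHORT-NAME>x</SHORT-NAME></ECUC-CONTAINER-VALUE>"
print(collect_tags(sample))  # ['ecuc-container-value', 'short-name']
```

This is why a search for 'ECUC-CONTAINER-VALUE' finds nothing in a tree built with an HTML parser, while the lowercase name matches.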
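Answer 1: (score: 0)
If you run print(soup), you will see that the parser has converted the tag names to lowercase. So use lowercase when searching for tag names: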
for ecuc_container in soup.findAll('ECUC-CONTAINER-VALUE'.lower()):
Or simply:
for ecuc_container in soup.findAll('ecuc-container-value'):
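Even better: parse the document explicitly as XML so that the case of the tags is not changed (the 'xml' parser requires lxml to be installed):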
soup = Soup(handler,'xml')
Here is how to get a list of the text inside the <DEFINITION-REF DEST="ECUC-CHOICE-REFERENCE-DEF"> elements:
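from bs4 import BeautifulSoup as Soup

def parseArxml():
    with open('input.arxml') as f:
        soup = Soup(f.read(), 'xml')  # the 'xml' parser keeps the tag case
    # keep only the DEFINITION-REF elements whose DEST attribute matches
    dest = [d.text for d in soup.findAll('DEFINITION-REF') if d.get('DEST') == 'ECUC-CHOICE-REFERENCE-DEF']
    print(dest)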
Output:
['/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression/BswMArgumentRef',
'/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression/BswMArgumentRef']
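Or, if you want all DEFINITION-REF tags regardless of their attributes, use the following (with the 'xml' parser the tag name keeps its original uppercase form):
dest = [d.text for d in soup.findAll('DEFINITION-REF')]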
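If installing lxml is not an option, the standard library's xml.etree.ElementTree is a possible alternative to the BeautifulSoup approach in the answers above. It also keeps the tag case, but the default namespace declared on the AUTOSAR root element has to be included in every tag lookup. The following is only a sketch under that assumption: definition_refs_by_container and the "ar" prefix are hypothetical names, and the sample is a shortened copy of the file from the question.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns attribute of the <AUTOSAR> root element.
# The "ar" prefix is an arbitrary local alias, not something defined by the file.
NS = {"ar": "http://autosar.org/schema/r4.0"}


def definition_refs_by_container(xml_text):
    """Map each container's SHORT-NAME to the texts of its DEFINITION-REF elements."""
    root = ET.fromstring(xml_text)
    result = {}
    # iter() needs the fully qualified "{namespace}tag" form.
    for container in root.iter(f"{{{NS['ar']}}}ECUC-CONTAINER-VALUE"):
        name = container.findtext("ar:SHORT-NAME", default="", namespaces=NS)
        # ".//" also reaches DEFINITION-REF elements nested deeper in the container.
        refs = [ref.text for ref in container.iterfind(".//ar:DEFINITION-REF", NS)]
        result[name] = refs
    return result


sample = """<AUTOSAR xmlns="http://autosar.org/schema/r4.0">
  <ECUC-CONTAINER-VALUE>
    <SHORT-NAME>BswMLogicalExpression_2</SHORT-NAME>
    <DEFINITION-REF DEST="ECUC-PARAM-CONF-CONTAINER-DEF">/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression</DEFINITION-REF>
    <REFERENCE-VALUES>
      <ECUC-REFERENCE-VALUE>
        <DEFINITION-REF DEST="ECUC-CHOICE-REFERENCE-DEF">/AUTOSAR/ecucdef/BswM/BswMConfig/BswMArbitration/BswMLogicalExpression/BswMArgumentRef</DEFINITION-REF>
      </ECUC-REFERENCE-VALUE>
    </REFERENCE-VALUES>
  </ECUC-CONTAINER-VALUE>
</AUTOSAR>"""

for short_name, refs in definition_refs_by_container(sample).items():
    print(short_name)
    for ref in refs:
        print("   ", ref)
```

For a file on disk, ET.parse('input.arxml').getroot() would take the place of ET.fromstring. Grouping by SHORT-NAME is a design choice for this sketch; containers that share a SHORT-NAME would overwrite each other in the dictionary.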