I can't parse this kind of XML file:
<items>
  <item>
    <name>Car</name>
    <description>
      <specification>
        <color>blue</color>
      </specification>
      <specification>
        <color>yellow</color>
      </specification>
    </description>
  </item>
</items>
I'd like to retrieve all the colors, separated by commas.
I'm a Python beginner.
items = doc.getElementsByTagName("items")
for item in items:
    name = item.getAttribute("name")
    color = item.getElementByTagName("color")[0]
    print(name,color.firstChild.data)
Thanks.
Answer 0 (score: 0)
I would recommend BeautifulSoup:
from bs4 import BeautifulSoup

a = '''<items>
  <item>
    <name>Car</name>
    <description>
      <specification>
        <color>blue</color>
      </specification>
      <specification>
        <color>yellow</color>
      </specification>
    </description>
  </item>
</items>'''

color_list = []
soup = BeautifulSoup(a, "html.parser")
# Collect the text of every <color> element, at any depth.
for i in soup.find_all('color'):
    color_list.append(i.get_text(strip=True))
print(','.join(color_list))  # blue,yellow
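For reference, the xml.dom.minidom attempt from the question can also be made to work. This is only a minimal sketch, assuming the XML is well-formed (with closing </item> and </items> tags) and held in a string; the helper name colors_by_item is made up for illustration. The main changes: name is a child element rather than an attribute, the method is getElementsByTagName (plural), and the loop runs over the item elements instead of the single items root.

```python
from xml.dom import minidom

# Sample data copied from the question, with the closing tags assumed.
XML = """<items>
  <item>
    <name>Car</name>
    <description>
      <specification><color>blue</color></specification>
      <specification><color>yellow</color></specification>
    </description>
  </item>
</items>"""


def colors_by_item(xml_text):
    """Return a list of (name, "color1,color2,...") tuples, one per <item>."""
    doc = minidom.parseString(xml_text)
    result = []
    for item in doc.getElementsByTagName("item"):
        # <name> is a child element, so read its text node, not an attribute.
        name = item.getElementsByTagName("name")[0].firstChild.data
        colors = [c.firstChild.data for c in item.getElementsByTagName("color")]
        result.append((name, ",".join(colors)))
    return result


for name, colors in colors_by_item(XML):
    print(name, colors)  # Car blue,yellow
```

This needs no third-party package, at the cost of being more verbose than the BeautifulSoup version.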
Answer 1 (score: 0)
Thanks! It works for this case, but I can't get it to work on a larger example:
<TradeMark>
  <MarkImageDetails>
    <MarkImage>
      <MarkImageFilename>FMARK0000000004393852</MarkImageFilename>
      <MarkImageFileFormat>TIFF</MarkImageFileFormat>
    </MarkImage>
  </MarkImageDetails>
  <GoodsServicesDetails>
    <GoodsServices>
      <ClassificationKindCode>Nice</ClassificationKindCode>
      <ClassDescriptionDetails>
        <ClassDescription>
          <ClassNumber>35</ClassNumber>
        </ClassDescription>
        <ClassDescription>
          <ClassNumber>41</ClassNumber>
        </ClassDescription>
        <ClassDescription>
          <ClassNumber>42</ClassNumber>
        </ClassDescription>
      </ClassDescriptionDetails>
    </GoodsServices>
  </GoodsServicesDetails>
</TradeMark>
I'd like to get the ClassNumber values.
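One likely reason the BeautifulSoup approach stops working here: the "html.parser" backend lowercases tag names, so a search for 'ClassNumber' finds nothing, while 'classnumber' would match. A case-preserving XML parser avoids the problem. Below is a minimal sketch using the standard library's xml.etree.ElementTree, assuming the real document has no XML namespaces; the trimmed sample and the variable name class_numbers are for illustration only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample above; assumes no XML namespaces are in use.
XML = """<TradeMark>
  <GoodsServicesDetails>
    <GoodsServices>
      <ClassificationKindCode>Nice</ClassificationKindCode>
      <ClassDescriptionDetails>
        <ClassDescription><ClassNumber>35</ClassNumber></ClassDescription>
        <ClassDescription><ClassNumber>41</ClassNumber></ClassDescription>
        <ClassDescription><ClassNumber>42</ClassNumber></ClassDescription>
      </ClassDescriptionDetails>
    </GoodsServices>
  </GoodsServicesDetails>
</TradeMark>"""

root = ET.fromstring(XML)
# iter() walks the whole tree and matches the tag name case-sensitively.
class_numbers = [el.text for el in root.iter("ClassNumber")]
print(",".join(class_numbers))  # 35,41,42
```

If the full file does declare a namespace, the tag passed to iter() would need the namespace prefix in the "{uri}ClassNumber" form.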