Question

I have this XML:

<dc:type>image fixe</dc:type>
<dc:type>image</dc:type>
<dc:type>still image</dc:type>
<dc:type>dessin</dc:type>
<dc:type>drawing</dc:type>

I want the text of every "dc:type" tag. I can get the first one with soup.find("dc:type").get_text(), but when I try, for example:

for i in soup.find_all("dc:type"):
     type = "|".join(i.get_text())

it gives me nothing. Likewise, just printing soup.find_all("dc:type") returns nothing, while printing with find seems to work. What am I doing wrong?

Answer 1

I'm not sure why it doesn't work for you. I get all the values.

from bs4 import BeautifulSoup

data='''<dc:type>image fixe</dc:type>
<dc:type>image</dc:type>
<dc:type>still image</dc:type>
<dc:type>dessin</dc:type>
<dc:type>drawing</dc:type>'''

soup = BeautifulSoup(data, 'html.parser')
for item in soup.find_all('dc:type'):
    print(item.text)

Output:

image fixe
image
still image
dessin
drawing

You can also use a lambda to match on the tag name.

from bs4 import BeautifulSoup

data='''<dc:type>image fixe</dc:type>
<dc:type>image</dc:type>
<dc:type>still image</dc:type>
<dc:type>dessin</dc:type>
<dc:type>drawing</dc:type>'''

soup = BeautifulSoup(data, 'html.parser')
for item in soup.find_all(lambda tag: tag.name == 'dc:type'):
    print(item.text)
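
Separately from the empty find_all result, the loop in the question would not build the string you seem to want even if find_all returned tags: "|".join(i.get_text()) joins the characters of a single string, and type is overwritten on every iteration (it also shadows the built-in type). A minimal sketch of one way to collect all the values into one pipe-separated string, assuming the same sample data and html.parser as above (the names types and joined are only illustrative):

```python
from bs4 import BeautifulSoup

data = '''<dc:type>image fixe</dc:type>
<dc:type>image</dc:type>
<dc:type>still image</dc:type>
<dc:type>dessin</dc:type>
<dc:type>drawing</dc:type>'''

soup = BeautifulSoup(data, 'html.parser')

# Collect the text of every dc:type tag first...
types = [tag.get_text() for tag in soup.find_all('dc:type')]

# ...then join the list once, outside any loop.
joined = "|".join(types)
print(joined)  # image fixe|image|still image|dessin|drawing
```

If find_all still comes back empty on your real document, the parser you pass to BeautifulSoup may be treating the dc: prefix differently, so that is worth checking first.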
