Question

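I am using this script to scrape board games.

It works fine and pulls the information from the xml data.

I want to extract one more element into the output csv. This one:

The id is random. What I want is the value of the link tags whose type attribute is "boardgamepublisher", added to a csv field (preferably all boardgamepublishers in one cell). Sometimes there is one board game publisher, sometimes more. There are many link elements, so I need to filter them by type.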

Answer 1

    soup = BeautifulSoup(req.content, 'xml')
    items = soup.find_all('item')
    for item in items:

After these lines, add the following code inside the loop to collect the publishers:

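        # Only <link> tags whose type attribute is "boardgamepublisher"
        publishers = item.find_all("link", type="boardgamepublisher")
        # Join the values with commas (no trailing comma) for a single csv cell
        gpublishers = ",".join(publisher["value"] for publisher in publishers)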

The first line returns a list containing elements such as

   <link type="boardgamepublisher" id="1001" value="(Web published)"/>

and

   <link type="boardgamepublisher" id="1341" value="something else"/>

publisher["value"] extracts the content of the value attribute.
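To check the filtering end to end, here is a minimal self-contained sketch. The sample XML (including the `items` wrapper, the `name` tag, and the `boardgamecategory` link), the helper name `publishers_cell`, and the csv columns are made up for illustration; only the two publisher links come from the answer above. It assumes `bs4` and `lxml` are installed, and it writes to an in-memory buffer rather than a real file.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical sample data; only the two publisher links are from the answer.
SAMPLE_XML = """<items>
  <item type="boardgame" id="1">
    <name type="primary" value="Example Game"/>
    <link type="boardgamecategory" id="1002" value="Card Game"/>
    <link type="boardgamepublisher" id="1001" value="(Web published)"/>
    <link type="boardgamepublisher" id="1341" value="something else"/>
  </item>
</items>"""


def publishers_cell(item):
    """Return all publisher names of one item as a comma-separated string."""
    links = item.find_all("link", type="boardgamepublisher")
    return ",".join(link["value"] for link in links)


soup = BeautifulSoup(SAMPLE_XML, "xml")  # the "xml" parser needs lxml
rows = [[item["id"], publishers_cell(item)] for item in soup.find_all("item")]

# csv.writer quotes the cell automatically because it contains a comma.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["id", "publishers"])
writer.writerows(rows)
print(buffer.getvalue())
```

Because the publishers are joined into one string before the row is written, they end up in a single quoted cell no matter how many publishers an item has.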

I am still looking for better suggestions, because I am concerned that this solution will be slow.
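One possible alternative, offered only as a sketch, is the standard library's `xml.etree.ElementTree`, which can filter by attribute with a small XPath expression. Whether it is actually faster for this data has not been measured here. The sample XML and the function name `publishers_by_item` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data shaped like the elements quoted in the answer.
SAMPLE_XML = """<items>
  <item type="boardgame" id="1">
    <link type="boardgamecategory" id="1002" value="Card Game"/>
    <link type="boardgamepublisher" id="1001" value="(Web published)"/>
    <link type="boardgamepublisher" id="1341" value="something else"/>
  </item>
</items>"""


def publishers_by_item(xml_text):
    """Map each item's id to its comma-separated publisher names."""
    root = ET.fromstring(xml_text)
    result = {}
    for item in root.iter("item"):
        # The attribute predicate keeps only the publisher links.
        links = item.findall(".//link[@type='boardgamepublisher']")
        result[item.get("id")] = ",".join(link.get("value", "") for link in links)
    return result


print(publishers_by_item(SAMPLE_XML))
```

Timing both versions on a real response would show whether the change is worth making.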

Beautiful Soup: extracting XML values by a certain attribute
