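I scraped the following HTML with BS4, but I can't seem to look up the artist cell.
I assigned this block to a variable named container and then tried
print container.tr.td["artist"]
with no luck.
Any suggestions would be appreciated.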
<tr class="item">
<!-- <td class="image"><a href="https://www.stargreen.com/kool-as-the-gang-44415.html" title="KOOL AS THE GANG " class="product-image"><img src="https://www.stargreen.com/media/catalog/product/cache/1/small_image/135x/9df78eab33525d08d6e5fb8d27136e95/K/o/KoolAsTheGang.jpg" width="135" height="135" alt="KOOL AS THE GANG " /></a></td> -->
<td class="date">Sat, 30 Dec 2017</td>
<td class="artist">kool as the gang</td>
<td class="venue">100 club</td>
<td class="link">
<p class="availability out-of-stock">
<span>Off Sale</span></p>
</td>
</tr>
Answer 0 (score: 5)
Your syntax is wrong: "artist" is the value of the "class" attribute, not an attribute name. Try this:
from bs4 import BeautifulSoup
html = """
<tr class="item">
<!-- <td class="image"><a href="https://www.stargreen.com/kool-as-the-gang-44415.html" title="KOOL AS THE GANG " class="product-image"><img src="https://www.stargreen.com/media/catalog/product/cache/1/small_image/135x/9df78eab33525d08d6e5fb8d27136e95/K/o/KoolAsTheGang.jpg" width="135" height="135" alt="KOOL AS THE GANG " /></a></td> -->
<td class="date">Sat, 30 Dec 2017</td>
<td class="artist">
kool as the gang </td>
<td class="venue">100 club</td>
<td class="link">
<p class="availability out-of-stock">
<span>Off Sale</span></p>
</td>
</tr>
"""
soup = BeautifulSoup(html, 'html.parser')
td = soup.find('td',{'class': 'artist'})
print(td.text.strip())
Output:
kool as the gang
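To make the failure in the question concrete, here is a minimal sketch of what container.tr.td["artist"] does. It assumes container is a BeautifulSoup object built from the row (the question does not show how it was created), and it uses a trimmed copy of the HTML. The names first_td, first_td_class and lookup_failed are only for illustration.

```python
from bs4 import BeautifulSoup

# The row from the question, trimmed to the cells that matter here.
html = """
<tr class="item">
<td class="date">Sat, 30 Dec 2017</td>
<td class="artist">kool as the gang</td>
<td class="venue">100 club</td>
</tr>
"""

# Assumption: container was created roughly like this.
container = BeautifulSoup(html, "html.parser")

# .tr.td walks to the FIRST <td> in the row, which is the date cell.
first_td = container.tr.td
first_td_class = first_td["class"]  # class is multi-valued, so this is a list

# Subscripting a tag reads an HTML attribute; no attribute is named "artist".
try:
    first_td["artist"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Searching by class value finds the intended cell instead.
artist = container.find("td", class_="artist").get_text(strip=True)
print(first_td_class, lookup_failed, artist)
```

So the original expression fails for two reasons: it lands on the date cell, and it asks that cell for an attribute that does not exist.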
Answer 1 (score: 2)
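Another way.
Use the select method on container to find the elements whose class is 'artist'. Since there could be several, but you know there is only one, take the only element in the list and ask for its text attribute.
>>> HTML = open('sven.htm').read()
>>> import bs4
>>> container = bs4.BeautifulSoup(HTML, 'lxml')
>>> container.select('.artist')[0].text
'\n kool as the gang '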
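If the page has many such rows, indexing with [0] raises IndexError on any row that lacks an artist cell. A possible variation, sketched below, uses select_one, which returns None when nothing matches. The second row in this HTML is invented for illustration and is not from the question, and artist_of is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# First row follows the question; the second row is a made-up example
# with no artist cell, to show the missing-match case.
html = """
<table>
<tr class="item"><td class="date">Sat, 30 Dec 2017</td><td class="artist">
 kool as the gang </td><td class="venue">100 club</td></tr>
<tr class="item"><td class="date">Sun, 31 Dec 2017</td><td class="venue">example venue</td></tr>
</table>
"""


def artist_of(row):
    """Return the stripped artist text for one row, or None if the cell is absent."""
    cell = row.select_one("td.artist")
    return cell.get_text(strip=True) if cell is not None else None


container = BeautifulSoup(html, "html.parser")
artists = [artist_of(row) for row in container.select("tr.item")]
print(artists)
```

Passing strip=True to get_text also removes the leading newline and trailing space seen in the raw text value above.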