Question

I am very new to Python and BeautifulSoup. I am working through the web scraping examples from Ryan Mitchell's book. The site I am scraping is http://www.pythonscraping.com/pages/page3.html

from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
html = urlopen("http://www.pythonscraping.com/pages/page3.html")
bs0bj = BeautifulSoup(html, "html.parser")
for i in bs0bj.find_all(id="gift1"):
    print(i.get_text())

#for i in bs0bj.find_all("tr", {"class":"gift"}):
#    print(i)
#    for c in bs0bj.find_all("img", {"src":re.compile(r"\.\.\/img\/gifts/img.*\.jpg")}):
#        print(c.image["src"])

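My problem is that I want to scrape the first row of the gift table, the header row ("Item, Description, Cost, Image"), as well as the image names, such as ../img/gift.jpg, but so far I have not been able to do it. Can someone please help me write the correct code?

Please also explain the code so that I can understand it. I want the output without the tags.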

Answer 1

Is this what you are looking for?


Answer 2

Here is the code:

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html, "html.parser")

# The gift table is the <table> element with id="giftList".
my_table = soup.find("table", id="giftList")

# Walk every row: the header row holds <th> cells, the gift rows hold <td> cells.
for row in my_table.find_all("tr"):
    for cell in row.find_all(["th", "td"]):
        # get_text() returns the text without tags, even when the cell
        # contains nested elements (cell.string would be None there).
        value = cell.get_text(strip=True)
        print("The value in this cell is %s" % value)

There is plenty of help available online that you can look at.
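
Since the question asks specifically for the header row and the image names, here is a minimal sketch of how those two pieces could be pulled out separately. To keep it runnable without a network connection it parses a small inline sample; SAMPLE_HTML and the function name extract_headers_and_images are made up for illustration, and the sample only imitates the structure of the real page, which may differ in detail.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample that imitates the structure of the gift table;
# the markup of the real page may differ.
SAMPLE_HTML = """
<table id="giftList">
  <tr><th>Item Title</th><th>Description</th><th>Cost</th><th>Image</th></tr>
  <tr id="gift1" class="gift">
    <td>Vegetable Basket</td>
    <td>A basket of <span class="excitingNote">fresh</span> vegetables</td>
    <td>$15.00</td>
    <td><img src="../img/gifts/img1.jpg"></td>
  </tr>
  <tr id="gift2" class="gift">
    <td>Russian Nesting Dolls</td>
    <td>Hand-painted dolls</td>
    <td>$10,000.52</td>
    <td><img src="../img/gifts/img2.jpg"></td>
  </tr>
</table>
"""


def extract_headers_and_images(html):
    """Return (header texts, image src values) from the gift table."""
    soup = BeautifulSoup(html, "html.parser")

    # find() returns the first matching element, so no [0] indexing is needed.
    table = soup.find("table", id="giftList")

    # Header cells are <th> elements; get_text() drops the surrounding tags.
    headers = [th.get_text(strip=True) for th in table.find_all("th")]

    # Keep only <img> tags whose src matches the gift-image path pattern.
    # The dots are escaped so they match literal "." characters.
    image_pattern = re.compile(r"\.\./img/gifts/img.*\.jpg")
    images = [img["src"] for img in table.find_all("img", src=image_pattern)]

    return headers, images


headers, images = extract_headers_and_images(SAMPLE_HTML)
print(headers)
print(images)
```

To try it against the live page, pass the result of urlopen("http://www.pythonscraping.com/pages/page3.html") instead of SAMPLE_HTML. Note that the src attribute is read directly from each img tag with img["src"]; the c.image["src"] form from the question looks for a nested <image> element that does not exist.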

BeautifulSoup query in Python 3.6
