BeautifulSoup query in Python 3.6

Time: 2017-08-09 19:10:09

Tags: beautifulsoup python-3.6

I am quite new to both Python and BeautifulSoup. I am working through the web scraping examples in Ryan Mitchell's book. The site I am scraping is http://www.pythonscraping.com/pages/page3.html

from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
bs0bj = BeautifulSoup(html, "html.parser")
# Print the text of every element whose id is "gift1".
for i in bs0bj.find_all(id="gift1"):
    print(i.get_text())

# Attempt at printing each gift row and the src of the image inside it:
#for i in bs0bj.find_all("tr", {"class":"gift"}):
#    print(i)
#    for c in i.find_all("img", {"src":re.compile(r"\.\./img/gifts/img.*\.jpg")}):
#        print(c["src"])

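My problem is that I want to scrape the first row of the gift table, which holds the column titles ("item, description, cost, image"), and also the image names such as ...img/gift.jpg, but so far I have not been able to do so. Could someone please help me write the correct code?

Please also explain the code so that I can understand it too. I want the output without the tags.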

2 Answers:

Answer 0: (score: 1)

Is this what you are looking for?


Answer 1: (score: 0)

Here is the code:

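from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html, "html.parser")
# The gift table has id="giftList"; find() returns the first match.
my_table = soup.find("table", id="giftList")
# Each <tr> is one table row, including the row of column titles.
rows = my_table.find_all("tr")
for row in rows:
    # Collect header cells (<th>) as well as data cells (<td>).
    cells = row.find_all(["th", "td"])
    for cell in cells:
        # get_text() returns the text without tags, even for cells with nested tags.
        value = cell.get_text(strip=True)
        print("The value in this cell is %s" % value)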
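
One detail that may help when reading cell text: a tag's `.string` attribute is only set when the tag has a single child, while `get_text()` joins all the text inside it. The sketch below shows the difference on two made-up cells; the markup is a hypothetical sample, not copied from the page.

```python
from bs4 import BeautifulSoup

# Hypothetical cells for illustration; not taken from the real page.
plain_cell = BeautifulSoup("<td>$15.00</td>", "html.parser").td
nested_cell = BeautifulSoup(
    "<td>A basket of <span>fresh</span> vegetables</td>", "html.parser"
).td

# A cell with a single text child: .string gives that text.
print(plain_cell.string)

# A cell with mixed text and tags: .string is None,
# but get_text() concatenates every piece of text.
print(nested_cell.string)
print(nested_cell.get_text())
```

So if some cells print as None, switching to `get_text()` is likely the simplest fix.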

There is plenty of help available online that you can consult.
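
For the two things the question asks for (the row of column titles and the image names), a minimal sketch could look like the following. It runs against an inline sample so it works offline; `SAMPLE_HTML`, `header_titles`, and `image_names` are hypothetical names, and the markup only assumes a table with id "giftList", a first row of `<th>` cells, and `<img>` tags whose `src` matches the pattern from the question. To try it on the real page, pass `urlopen(...)` to `BeautifulSoup` instead of the sample string.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, assumed to resemble the page's gift table.
SAMPLE_HTML = """
<table id="giftList">
  <tr><th>Item Title</th><th>Description</th><th>Cost</th><th>Image</th></tr>
  <tr id="gift1" class="gift">
    <td>Vegetable Basket</td>
    <td>A basket of <span>fresh</span> vegetables</td>
    <td>$15.00</td>
    <td><img src="../img/gifts/img1.jpg"></td>
  </tr>
  <tr id="gift2" class="gift">
    <td>Russian Nesting Dolls</td>
    <td>Hand-painted dolls</td>
    <td>$10,000.52</td>
    <td><img src="../img/gifts/img2.jpg"></td>
  </tr>
</table>
"""


def header_titles(soup):
    """Return the text of each <th> in the first row of the gift table."""
    table = soup.find("table", id="giftList")
    first_row = table.find("tr")
    return [th.get_text(strip=True) for th in first_row.find_all("th")]


def image_names(soup):
    """Return the src of every <img> whose src matches the gift image pattern."""
    pattern = re.compile(r"\.\./img/gifts/img.*\.jpg")
    return [img["src"] for img in soup.find_all("img", src=pattern)]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(header_titles(soup))  # ['Item Title', 'Description', 'Cost', 'Image']
print(image_names(soup))    # ['../img/gifts/img1.jpg', '../img/gifts/img2.jpg']
```

Passing a compiled regular expression as an attribute value makes `find_all` keep only the tags whose attribute matches it, and indexing a tag with `["src"]` reads the attribute itself, so the result contains no tags.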