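BeautifulSoup: get all product links in a specific category

Time: 2016-04-13 11:23:54

Tags: python-2.7 web-scraping beautifulsoup

I want to get all the product links in a specific category using BeautifulSoup in Python.

I tried the following, but it returns no results: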

import lxml
import urllib2
from bs4 import BeautifulSoup
html=urllib2.urlopen("http://www.bedbathandbeyond.com/store/category/bedding/bedding/quilts-coverlets/12018/1-96?pagSortOpt=DEFAULT-0&view=grid")

br= BeautifulSoup(html.read(),'lxml')

for links in br.findAll('a', class_='prodImg'):
    print links['href']

2 Answers:

Answer 0 (score: 1)

You are using urllib2 incorrectly. Build the request first, then open it and read the response:

import lxml
import urllib2
from bs4 import BeautifulSoup

# create an HTTP request
req=urllib2.Request("http://www.bedbathandbeyond.com/store/category/bedding/bedding/quilts-coverlets/12018/1-96?pagSortOpt=DEFAULT-0&view=grid")
# send the request
response = urllib2.urlopen(req)
# read the content of the response
html = response.read()
br= BeautifulSoup(html,'lxml')

for links in br.findAll('a', class_='prodImg'):
    print links['href']
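
If the loop still prints nothing, it may help to test the selector separately from the download. Note that urllib2.urlopen also accepts a plain URL string, so the cause may lie elsewhere, for example the HTML served to a script may not contain the prodImg class at all. Below is a minimal sketch that runs the same class filter on an inline sample. The sample markup, the helper name extract_product_links, and the product paths are assumptions made for illustration, not the real page's structure; the built-in "html.parser" is used only so the sketch does not depend on lxml.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2.7

from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page's structure may differ.
SAMPLE_HTML = """
<div class="productShadow">
  <a class="prodImg" href="/store/product/quilt-a/1001">Quilt A</a>
  <a class="prodName" href="/store/product/quilt-a/1001">Quilt A</a>
</div>
<div class="productShadow">
  <a class="prodImg" href="/store/product/quilt-b/1002">Quilt B</a>
</div>
<a href="/store/help">Help</a>
"""


def extract_product_links(html, base_url, css_class="prodImg"):
    """Return absolute hrefs of <a> tags that carry the given class."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips anchors that have no href attribute at all
    for a in soup.find_all("a", class_=css_class, href=True):
        # urljoin turns a relative path into an absolute URL
        links.append(urljoin(base_url, a["href"]))
    return links


links = extract_product_links(SAMPLE_HTML, "http://www.bedbathandbeyond.com")
for link in links:
    print(link)
```

If this works on the sample but not on the downloaded page, printing part of the downloaded HTML and searching it for the class name is a quick way to see whether the expected markup is present.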

Answer 1 (score: 0)

from bs4 import BeautifulSoup
import requests

html=requests.get("http://www.bedbathandbeyond.com/store/category/bedding/bedding/quilts-coverlets/12018/1-96?pagSortOpt=DEFAULT-0&view=grid")

br= BeautifulSoup(html.content,"lxml")

# restrict the search to the product containers, then print each link inside them
data=br.findAll('div',attrs={'class':'productShadow'})
for div in data:
    for a in div.find_all('a'):
        print a.get('href')

Try this code.
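
To check the container-based approach without hitting the site, here is a small sketch that collects the links found inside the productShadow divs and drops duplicates. The sample markup and the helper name links_inside are hypothetical, and it assumes each product sits in a div with that class, which may not match the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, for illustration only.
SAMPLE_HTML = """
<div class="productShadow">
  <a href="/p/1"><img src="a.jpg"></a>
  <a href="/p/1">Quilt A</a>
</div>
<div class="productShadow">
  <a href="/p/2">Quilt B</a>
</div>
<a href="/help">Help</a>
"""


def links_inside(html, container_class="productShadow"):
    """Return unique hrefs of anchors nested in divs with the given class."""
    soup = BeautifulSoup(html, "html.parser")
    seen = set()
    result = []
    # CSS selector: any <a> with an href that is a descendant of the container div
    for a in soup.select("div.%s a[href]" % container_class):
        href = a["href"]
        if href not in seen:  # keep first occurrence, preserve page order
            seen.add(href)
            result.append(href)
    return result


product_links = links_inside(SAMPLE_HTML)
for href in product_links:
    print(href)
```

Anchors outside the containers, such as the help link in the sample, are skipped, which is the main difference from looping over every a tag in the page.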