How to jump to a specific page using BeautifulSoup

Time: 2016-08-09 11:20:16

Tags: python beautifulsoup web-crawler

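I want to fetch data in Python for a product that a user searches for. I can already fetch data from any given URL, but I need to jump to the page that matches the search and scrape its data with BeautifulSoup.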

Here is what I tried for fetching the data:

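from bs4 import BeautifulSoup
import urllib2  # Python 2 only; in Python 3 use urllib.request

url = "http://amazon.in"
con = urllib2.urlopen(url).read()  # download the raw HTML of the home page
soup = BeautifulSoup(con, "html.parser")  # name the parser explicitly
print soup.prettify()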

But if the user wants the price of an iPhone 5s, the script should jump to that product's page and fetch the data from there.

How can I do this?

1 Answer:

Answer 0: (score: 0)

You just need to send a GET request with the correct parameters:

import requests
from bs4 import BeautifulSoup

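# Query-string parameters for the search request
params = {"url": "search-alias=aps", "field-keywords": "iphone 5"}
url = "http://www.amazon.in/s/ref=nb_sb_noss_2"

# Send the GET request and parse the returned HTML
soup = BeautifulSoup(requests.get(url, params=params).content, "html.parser")
# The element with this id wraps the list of search results
ul = soup.select_one("#s-results-list-atf")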
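
As a side note on what "passing the parameters" means here: the params dict simply becomes the query string of the URL. The sketch below builds that URL with only the Python 3 standard library, so it can be inspected without sending a request. build_search_url is a hypothetical helper name, not part of requests or BeautifulSoup, and the URL that requests itself builds should look much the same but may differ in small encoding details.

```python
from urllib.parse import urlencode

# Search endpoint used in the answer above
BASE_URL = "http://www.amazon.in/s/ref=nb_sb_noss_2"


def build_search_url(keywords, base_url=BASE_URL):
    """Return the URL that a GET request with these search parameters targets.

    Hypothetical helper for illustration only.
    """
    params = {"url": "search-alias=aps", "field-keywords": keywords}
    # urlencode percent-encodes the values and joins the pairs with "&"
    return base_url + "?" + urlencode(params)


search_url = build_search_url("iphone 5s")
print(search_url)
```

Changing the keywords argument is all it takes to point the request at a different search, which is the "jump" the question asks about.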

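ul will contain all of the search results you see on the page. If we run the code and look for the h2 tag inside each anchor, we get the item names/descriptions shown on the page.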

In [6]: ul = soup.select_one("#s-results-list-atf")

In [7]: for h2 in ul.select("li a h2"):
   ...:         print(h2.text)
   ...:     
Apple iPhone 5s (Space Grey, 16GB)
Apple iPhone 5s (Silver, 16GB)
Supra Lightning 8 Pin To Micro Usb Charge Sync Data Connector Adapter Iphone 5 Ipad 4
OnePlus 3 (Graphite, 64GB)
Apple iPhone 5 (Black-Slate, 16GB)
ROCK 695029068729 Royce Series Shockproof Dual Layer Back Case Cover for Apple iPhone 5 5S,(Grey)
Apple iPhone 5c (White, 8GB)
iSAVE Soft Silicone Grid Design Back Case Cover For iPhone 5/5s (BLACK)
iPaky AT15312 360 Protective Body Case with Tempered Glass for Apple iPhone SE 5 5S,(Black)
Aeoss 9Pcs Open Pry Screwdriver Repair Tool Kit Set For iPhone 6 Plus 5 5s 5c 4 iPod.
2 IN 1 Tempered Glass for Iphone 5 5s 5c Explosion Proof Tempered Glass (FRONT AND BACK)
Itab iphone5sclearsoftgelly Imported Transparent Clear Silicone Jelly Soft Case Back Cover For Apple Iphone 5 5S
Shivam Earphones EarPods Handsfree Headphones for Apple iPhone 4/4s/5/5s/6/6+ (White)
USB Power Adapter Wall Charger&Data Cable for iPhone 5/5S/5C/6
Generic Ios 7 Compatible Data Sync Charging Cable For Apple Iphone 5 5S 6 - White
Tempered Glass Screen Protector Scratch Guard for Apple Iphone 5 5G 5s
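
The title extraction above can be wrapped in a small function that also copes with the results container being absent, since select_one returns None when nothing matches (for example if the page layout changes or the request is blocked). This is only a sketch: extract_titles is a hypothetical helper, and SAMPLE_HTML is a made-up fragment that imitates the structure the selectors above assume, not real Amazon markup.

```python
from bs4 import BeautifulSoup

# Made-up fragment imitating the assumed structure of the results list
SAMPLE_HTML = """
<ul id="s-results-list-atf">
  <li><a href="/p/1"><h2>Apple iPhone 5s (Space Grey, 16GB)</h2></a></li>
  <li><a href="/p/2"><h2>Apple iPhone 5s (Silver, 16GB)</h2></a></li>
</ul>
"""


def extract_titles(html):
    """Return the text of every h2 inside an anchor in the results list."""
    soup = BeautifulSoup(html, "html.parser")
    results = soup.select_one("#s-results-list-atf")
    if results is None:
        # Container not found: return an empty list instead of raising
        return []
    return [h2.get_text(strip=True) for h2 in results.select("li a h2")]


titles = extract_titles(SAMPLE_HTML)
missing = extract_titles("<html><body>No results</body></html>")
print(titles)
print(missing)
```

With a live page, the same function could be given requests.get(url, params=params).content in place of SAMPLE_HTML.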