How can I retrieve multiple pages that share the same URL using Python?

Asked: 2018-12-21 12:01:21

Tags: python beautifulsoup web-crawler

I am trying to get all the reviews for a product on iHerb.com.

https://www.iherb.com/r/California-Gold-Nutrition-Omega-3-Premium-Fish-Oil-100-Fish-Gelatin-Softgels/62118

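The problem is that the reviews are spread over multiple pages, and all of those pages have the same URL.

How can I solve this? Here is my code for a single page (which does not actually work). Thanks.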

import requests
from bs4 import BeautifulSoup

url = 'https://www.iherb.com/r/California-Gold-Nutrition-Omega-3-Premium-Fish-Oil-100-Fish-Gelatin-Softgels/62118'
response = requests.get(url)
page = response.text
soup = BeautifulSoup(page, 'html.parser')
links = soup.find_all("div", {"class": "review-text"})

for each in links:
    print(each.text)

1 answer:

Answer 0 (score: 0)

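For web scraping with Python, the Beautiful Soup module is usually all you need. However, iHerb uses JavaScript links, so Beautiful Soup alone is not enough for your code to work. You can use Selenium to automate the interaction with a web browser: with Selenium you write a Python script that drives a real browser, and those pesky JavaScript links stop being a problem. Selenium starts a browser session, and for that to work it must be able to find the browser driver. Here the driver is assumed to sit in the same directory as the Python script; in general it needs to be somewhere on your PATH. The following example uses Chrome: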

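import time

from bs4 import BeautifulSoup
from selenium import webdriver

url = "https://www.iherb.com/r/California-Gold-Nutrition-Omega-3-Premium-Fish-Oil-100-Fish-Gelatin-Softgels/62118"

browser = webdriver.Chrome()  # starts a Chrome session; needs the Chrome driver
browser.get(url)
time.sleep(3)  # crude pause so the JavaScript can render the reviews (assumed delay)

# Parse the HTML as rendered by the browser, not the raw HTTP response
page_soup = BeautifulSoup(browser.page_source, "html.parser")
for review in page_soup.find_all("div", {"class": "review-text"}):
    print(review.text)

browser.quit()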

Selenium is available here: Selenium packages. Hope this helps :)

Result

I'm so glad i found this.  Most fish gel are made of gelatin and since i'm looking 
for only halal source, i'm glad i found this brand with an excellent price to match!

We have consumed 3 bottles n i love it.  It makes my breastmilk thicker too.  

Highly recommended!
If only Pharma companies realize that they are missing huge Muslim consumers by using 
Pork geltin. Havent tried it yet, but have mostly everything i was looking for. 
except if it was once a day capsule with added vitamin D3. i have been ordering from 
iHerb for years now from UAE and this has to be one of the best online shopping with 
Fast DHL shipping (Always). I hope they consider stocking more Halal or Kosher 
Gelatin Medicine.  

6th Jan 2016 Update: I started with a 5 Star with above comments, but now after 
consuming for some time i noticed my LDL level have increased. So 2 stars for now.
I've never bought FISH OIL / supplement in capsule/softgel except it is made from 
fish or vege. This is the best Omega 3 in the market. it is good for cardiovascular 
and using selected small fishes like Sardines, Mackerel and Anchovies which are less 
toxin compared to big fishes with high toxin over their longer lives. Furthermore I 
did test this fish oil and it didn't dissolve Styrofoam cup indicating there is no 
ethanol used in the process of getting oil as we know that ethanol is not good for 
our organs in long term (search "ethanol fish oil" in Youtube)... Click on my name 
HAFIZ (the above green text) to get more info.
I love this omega 3 fish oil very much because it's very cheap yet effective and 
doesn't contain any harmful chemicals. I've already tried a lot of brands of omega 3 
and this one is one of the best and effective. Also, when using this I noticed that 
I'm more energized and my body is stronger and healthier. >>Press "click meeee"button 
to view and use my code for Discount at Checkout
Best with great price
So good

good product like website say 
very good product. 
I bought the product two months ago, and still didn't get it. what is the 
problem? 
Could you please check it?
good!
Exelent product 
Love it use it more than 4 months
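
The output above covers only the first page of reviews, while the question asks about several pages behind one URL. A minimal sketch of one way to handle that, assuming the site swaps in the next batch of reviews when a "next page" control is clicked, is to stay in the same Selenium session, click that control repeatedly, and parse the page after each click. The names `collect_reviews`, `scrape_reviews`, and the `next_button_css` parameter are hypothetical, the real CSS selector has to be looked up in the browser's developer tools, and the 3 second pause is a guess rather than a measured value.

```python
def collect_reviews(get_page_source, go_to_next_page, extract_reviews, max_pages=50):
    """Gather reviews from paginated views that all live under one URL.

    get_page_source()  -> the content currently shown in the browser
    go_to_next_page()  -> advances one page; returns False if it cannot
    extract_reviews(x) -> list of review strings found in that content
    """
    reviews = []
    previous = None
    for _ in range(max_pages):
        current = extract_reviews(get_page_source())
        if current == previous:
            # The page did not change after "next": treat this as the end.
            break
        reviews.extend(current)
        previous = current
        if not go_to_next_page():
            break
    return reviews


def scrape_reviews(url, next_button_css, max_pages=50):
    """Wire collect_reviews to Selenium and Beautiful Soup (hypothetical helper)."""
    # Imported here so collect_reviews can be used without a browser installed.
    import time

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    browser = webdriver.Chrome()
    try:
        browser.get(url)
        time.sleep(3)  # assumed delay for the JavaScript to render

        def extract(html):
            soup = BeautifulSoup(html, "html.parser")
            divs = soup.find_all("div", {"class": "review-text"})
            return [div.get_text(strip=True) for div in divs]

        def next_page():
            try:
                # next_button_css is an assumption: supply the site's real selector.
                button = browser.find_element(By.CSS_SELECTOR, next_button_css)
            except NoSuchElementException:
                return False
            button.click()
            time.sleep(3)  # assumed delay for the next batch to load
            return True

        return collect_reviews(lambda: browser.page_source, next_page, extract, max_pages)
    finally:
        browser.quit()
```

Calling `scrape_reviews(url, next_button_css)` with the product URL and the selector you found returns one flat list of review strings. The loop is kept separate from the browser code so that the stopping rules (no next button, an unchanged page, or `max_pages` reached) can be checked without launching Chrome.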