How to add a delay in Python web scraping

Asked: 2017-09-22 08:51:07

Tags: python web-scraping beautifulsoup web-crawler

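I am new to web scraping and Python. I came across a website that loads its prices a short while after the page itself has loaded, and I don't know how to extract that data from the site (this is the sample link). Is there a module that pulls the data after the site has finished loading? I am currently using Beautiful Soup. My script looks like this: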

from bs4 import BeautifulSoup
import urllib.request
import pandas as pd

url = 'https://www.northcoastelectric.com/157299/Category/PowerFlex-520-Family-Options-&-Accessories'
html = urllib.request.urlopen(url)
# Name the parser explicitly so BeautifulSoup does not have to guess one.
soup = BeautifulSoup(html, "html.parser")
products = soup.find_all(id="getchangemode")
# print(product)

for product in products:
    product_details = dict()
    product_details["Main link"] = url
    image = product.find(class_="prodImage").img["src"]
    product_details["Image Link"] = image
    cate = soup.find(class_="cimm_pageTitle").get_text().strip()
    product_details["Category"] = cate
    path_tag = soup.find(class_="cimm_breadcrumbs").stripped_strings
    path = (" ").join([j.strip() for j in path_tag])
    product_details["Path"] = path
    # NOTE: `x` is never defined in the original post. The <li> lookup below is
    # only an assumed placeholder for whatever elements hold the extra details.
    x = product.find_all("li")
    additional_info = []
    for i in x:
        additional_info.append("".join(j.strip() for j in i.stripped_strings))
    product_details["Additional_info"] = additional_info
    vend = product.h4.get_text().strip()
    product_details["Vendor Description"] = vend
    brand = product.b.get_text().strip()
    product_details["Brand"] = brand
    addinfo = product.p.get_text()
    product_details["Additional Info"] = addinfo
    # The price is filled in after the page loads, so this lookup on the
    # static HTML may return None or an element with no price in it.
    price = soup.find(class_="priceLeft")
    print(price)
    print(product_details)

Thanks in advance.
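For reference, the "wait until the value shows up" idea in the question can be sketched as a small polling helper. This is only a minimal sketch under stated assumptions: `wait_until` and `make_delayed_source` are hypothetical names, and the delayed source just simulates a price that appears after a few checks. Note that if the price is filled in by JavaScript, re-downloading the page with `urllib` will probably never show it, however long the delay. In that case the probe would need to read from a real browser session (for example Selenium's `driver.page_source`, which also ships its own `WebDriverWait`); that wiring is not shown here.

```python
import time


def wait_until(probe, timeout=10.0, interval=0.5):
    """Call probe() repeatedly until it returns a truthy value or time runs out.

    Returns the truthy value, or None if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)  # the actual delay between attempts


def make_delayed_source(ready_after, value):
    """Hypothetical stand-in for a page whose price appears after a few polls."""
    calls = {"n": 0}

    def probe():
        calls["n"] += 1
        return value if calls["n"] >= ready_after else None

    return probe


if __name__ == "__main__":
    # Simulated page: the price becomes available on the third check.
    probe = make_delayed_source(ready_after=3, value="$12.34")
    print(wait_until(probe, timeout=5.0, interval=0.1))
```

In a real scraper, `probe` would be a function that parses the current page content and returns the text of the `priceLeft` element once it is non-empty. Polling with a timeout is usually preferable to one fixed `time.sleep`, because it returns as soon as the data is there and gives up cleanly when it never arrives.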

0 answers:

No answers