Scraping a website with dynamically loaded data across multiple pages when the URL does not contain a page number

Time: 2019-12-17 07:38:32

Tags: python selenium web-scraping beautifulsoup webdriverwait

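I am trying to scrape data from a website that loads its data via JavaScript, and the main URL does not contain a page number. The URL is site link. To scrape the data I am using Python and Selenium. However, I only get the data from the first page, not from the other pages (second, third, and so on).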

Below is the code I use to get the data for a single page.

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

url = 'https://www.ikh.se/sv/kemikalier--smorjmedlar/fordon-kemikalier'

# Start Chrome with a driver binary downloaded by webdriver_manager
browser = webdriver.Chrome(ChromeDriverManager().install())

browser.get(url)
# The products of the current page are the <a> elements inside #dnsList
data = browser.find_element_by_id('dnsList')
dataList = data.find_elements_by_tag_name('a')
for i, item in enumerate(dataList, start=1):
    pName = item.find_element_by_class_name('ProdName').text
    pPrice = item.find_element_by_class_name('ProdPrice').text
    pCode = item.find_element_by_class_name('ProdCode').text

    print('#SL : ' + str(i) + '  Product Code : ' + pCode + '-->' + 'ProductName : ' + pName + ' Product Price : ' + pPrice)

2 answers:

Answer 0: (score: 1)

There is a "PRODUKTLIST-PDF" form.

Each menu page contains <form id="pdf-form" class="pdf-form" action="/getProductListPDF.asp?sua=2&amp;lang=2&amp;navid=14456615" method="POST" target="_blank">. The navid value from its action URL can be passed as the navid of the API to get that menu's product details as JSON.

API (only the navid changes for each menu)

navid
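
The navid value can be read out of the form's action attribute with the standard library alone. This is a minimal sketch, assuming the page HTML is already available as a string (for example from browser.page_source). The names PdfFormParser and extract_navid are hypothetical helpers, and no request is made because the API URL is not shown in this answer.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class PdfFormParser(HTMLParser):
    """Collects the action attribute of the form with id="pdf-form"."""

    def __init__(self):
        super().__init__()
        self.action = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "form" and attr_map.get("id") == "pdf-form":
            # HTMLParser has already decoded &amp; to & in attribute values.
            self.action = attr_map.get("action")


def extract_navid(page_html):
    """Return the navid query parameter of the pdf-form action, or None."""
    parser = PdfFormParser()
    parser.feed(page_html)
    if parser.action is None:
        return None
    query = parse_qs(urlparse(parser.action).query)
    values = query.get("navid")
    return values[0] if values else None


# The form tag quoted in the answer, used here as sample input.
sample = ('<form id="pdf-form" class="pdf-form" '
          'action="/getProductListPDF.asp?sua=2&amp;lang=2&amp;navid=14456615" '
          'method="POST" target="_blank">')
print(extract_navid(sample))
```

The returned string can then be substituted into the API URL for each menu.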

Answer 1: (score: 1)

To extract all the product names, product prices, and product codes, you have to induce WebDriverWait for visibility_of_all_elements_located(). You can use the following Locator Strategies:

  • Code block:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # Assumes a Chrome driver is available to Selenium
    driver = webdriver.Chrome()
    driver.get("https://www.ikh.se/sv/kemikalier--smorjmedlar/fordon-kemikalier")
    # Collect the products of the first page
    product_names = [my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.ProdName")))]
    product_prices = [my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.ProdPrice>span.currency-price")))]
    product_codes = [my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.ProdCode")))]
    # Pager buttons for every page except the active one
    pages = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.dnsListPager.dnsCell.tablet-one-fifth button.dnsListPage:not(.active)")))
    for i in range(len(pages)):
        # Click through to the next page and append its products
        pages[i].click()
        product_names.extend([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.ProdName")))])
        product_prices.extend([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.ProdPrice>span.currency-price")))])
        product_codes.extend([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.ProdCode")))])
    for i,j,k in zip(product_names, product_codes, product_prices):
        print("{} code is {} and price is {}".format(i,j,k))
    
  • Console output:

    TVÄTTMEDEL FÖR LÅNGTRADARE KONSENTRAT 1L code is WKPR010 and price is 462.00
    BILTVÄTT,KONSENTRAT 5L VIIMA code is WKPA005 and price is 218.90
    BÅTTVÄTT,KONSENTRAT code is WKPV005 and price is 218.90
    SILICONSTIFT 65ML code is WKSI001 and price is 39.60
    VAXSHAMPOO PS-POLYGEN 10L code is HK70070 and price is 363.00
    VAXSHAMPOO PS-POLYGEN 200L code is HK70071 and price is 5555.00
    GLAS RENGÖRARE 500ML code is TIP731 and price is 86.90
    POLISH &amp; VAX 500ML code is TIP740 and price is 126.50
    COLOUR RENOVATOR code is TIP741 and price is 126.50
    SCRATCH REMOWER 250ML code is TIP745 and price is 86.90
    KROMI POLISH code is TIP742 and price is 126.50
    SUPER SHAMPOO &amp; VAX 500ML code is TIP743 and price is 86.90
    SUPERFAST VAX 500ML code is TIP744 and price is 126.50
    BUMPER BLACK 500 ML code is TIP753 and price is 126.50
    SHAMPOO WASH &amp; POLISH 2X500ML INK. SVAMP code is TIP756 and price is 71.50
    POLERING SET BIL code is MIR100AUTO and price is 405.90
    GLASTVÄTT SUPER 1L code is WKLA001 and price is 53.90
    AUTOTVÄTT 2,5L code is WKS002 and price is 82.50
    EFFEKTTVÄTT GEL 1L code is WKTP001 and price is 86.90
    FÄLGTVÄTT GEL 1L code is WKVA001 and price is 97.90
    POWERTVÄTT 4L PINELINE code is PL100004 and price is 207.90
    POWERTVÄTT 10L PINELINE code is PL100010 and price is 471.90
    POWERTVÄTT 200L PINELINE code is PL100200 and price is 7645.00
    HEAVY DUTY 25L PINELINE code is PL133020 and price is 2035.00
    RENGÖRINGSMEDELL 500ML GLAS MONTERING code is SIK037 and price is 93.50
    CAR 25L PINELINE code is PL160025 and price is 2035.00
    GO 25L PINELINE code is PL163025 and price is 2035.00
    POWER CLEANER 0,5L PINELINE code is PL220005 and price is 108.90
    FÄLGTVÄTT 0,5L PINELINE code is PL350005 and price is 93.50
    INSEKT ELIMINERING 0,5L PINELINE code is PL373005 and price is 82.50
    GLASS CLEANER 0,5L PINELINE code is PL444005 and price is 75.90
    SKUM TVÄTT 25L PINELINE AKTIV SKUM code is PL500025 and price is 1210.00
    POWERTVÄTT 25L PINELINE code is PL100025 and price is 1089.00
    POWERTVÄTT SUPER 25L PINELINE code is PL130025 and price is 1419.00
    CAR+GO 25L PINELINE code is PL200025 and price is 748.00
    FÄLGTVÄTT 4L PINELINE code is PL350004 and price is 438.90
    SKUM TVÄTT 4L PINELINE AKTIV SKUM code is PL500004 and price is 438.90
    KRAFTAVFETTNING 1L MED PUMP code is HAG001 and price is 82.50
    KRAFTAVFETTNING 4L code is HAG002 and price is 328.90
    FÖRTVÄTT PLUS 1L PUMP code is HAG003 and price is 82.50
    FÖRTVÄTT PLUS 5L code is HAG004 and price is 328.90
    SUPER TVÄTT 500ML code is HAG005 and price is 64.90
    VAX TVÄTT 500ML code is HAG006 and price is 64.90
    SKUM TVÄTT 500ML code is HAG007 and price is 64.90
    SNABB TVÄTT 1L PUMP code is HAG008 and price is 109.45
    INSEKTSBORTTAGARE 0,5 L PUMP code is HAG010 and price is 75.90
    FÄLGRENGÖRING 500ML PUMP code is HAG011 and price is 64.90
    FÖNSTERPUTS 500ML PUMP code is HAG012 and price is 53.90
    INTRRIÖRRENGÖRING 500ML PUMP code is HAG013 and price is 64.90
    TEXTIL RENGÖRING 500ML PUMP code is HAG014 and price is 64.90
    LÄDER RENGÖRING 300ML code is HAG015 and price is 75.90
    ROSTÄTARE 300ML code is HAG023 and price is 108.90
    ROSTOMVANDLARE 150ML code is HAG024 and price is 148.50
    UNDERREDSMASSA 1KG SVART code is HAG026 and price is 126.50
    BROMSVÄTSKA 0,5L DOT 4 code is WKJ0005 and price is 43.45
    BROMSVÄTSKA DOT 4 4L code is WKJ004 and price is 284.90
    LHM PLUS SPECIALOLJA 1L code is WKLHM1 and price is 71.50
    BASF HYDRAULAN 404 BROMSVÄTSKA DOT 4 1L code is BASF404-1 and price is 108.90
    KYLARVÄTSKA 1L 100% code is WK001100 and price is 44.00
    KYLARVÄTSKA 3L 100% code is WK003100 and price is 124.30
    KYLARVÄTSKA 5L 100% code is WK005100 and price is 192.50
    KYLARVÄTSKA 100% 10L code is WK010100 and price is 328.90
    KYLARVÄTSKA 200L 100% code is WK200100 and price is 6380.00
    KYLARVÄTSKA 100% SUPER HD 5L code is WK005100P and price is 229.90
    KYLARVÄTSKA 100% SUPER HD 10L code is WK010100P and price is 423.50
    REPARATIONS MEDEL FÖR KYLSYSTEM code is TSL001 and price is 68.20
    KENT RADIATOR LEAK STOP 250ML code is KENT86182 and price is 438.90
    KYLARVÄTSKA 100% LL OAT 10L code is WK010100K and price is 438.90
    BASF KYLARVÄTSKA G30 1,5L KONCENTRERAD code is GLYSG30-1 and price is 143.00
    BASF KYLARVÄTSKA G40 1,5L KONCENTRERAD code is GLYSG40-1 and price is 148.50
    BASF GLYSANTIN G48 1,5L KONCENTRERAD code is GLYSG48-1 and price is 148.50
    BASF KYLÄRVÄTSKA G64 5L FÄRDIGBLANDAD 50% code is GLYSG64-5R and price is 407.00
    POLERINGSPASTA SATS 2ST EFTERBEHANDLING code is TPG35937 and price is 76.45
    POLERINGSPASTA ASSORTMENT 2ST code is TPG35938 and price is 76.45
    POLARSHINE LIQUID NANO WAX 1L code is MIR1491 and price is 363.00
    POLARSHINE NANO ANTISTAT WAX UF3 1L code is MIR1492 and price is 330.00
    POLARSHINE 35 1L code is MIR1494 and price is 352.00
    POLARSHINE 10 1L code is MIR1561 and price is 330.00
    POLERING OCH SLIPNING SERIE FÖR STRÅLKASTARE code is OB039 and price is 181.50
    POLARSHINE 35 250ML code is MIR1580 and price is 154.00
    POLARSHINE 10 250ML code is MIR1581 and price is 162.80
    POLERING SET BÅT code is MIR100VENE and price is 319.00
    POLARSHINE WAX LIQUID NANO 500ML code is MIR1622 and price is 231.00
    POLARSHINE 20 250ML code is MIR1699 and price is 242.00
    POLARSHINE 20 1L code is MIR1700 and price is 495.00
    POLISH GROV RUBBING 300ML code is HAG016 and price is 75.90
    POLISH FIN LACKRENGÖRING 300ML code is HAG017 and price is 75.90
    VAXPOLISH 300ML code is HAG018 and price is 75.90
    GLANSVAX 300ML code is HAG019 and price is 75.90
    LACKFÖRSEGLING 300ML code is HAG020 and price is 97.90
    DÄCK OCH FÄLGGLANS 500ML PUMP code is HAG021 and price is 97.90
    POLARSHINE   5 250ML code is MIR1709 and price is 187.00
    POLARSHINE   5 1L code is MIR1710 and price is 495.00
    SOMMARSPOLARVÄSTKA 4L code is WKKE004 and price is 30.25
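
Because the values are read from innerHTML, HTML entities such as &amp; appear literally in the output above. The following is a minimal sketch of one way to post-process the three lists, assuming they have equal length and that every price is a plain decimal string like the ones printed above. The helper name build_records is hypothetical.

```python
import html


def build_records(names, codes, prices):
    """Zip the three scraped lists into dicts, decoding HTML entities."""
    records = []
    for name, code, price in zip(names, codes, prices):
        records.append({
            "name": html.unescape(name).strip(),
            "code": html.unescape(code).strip(),
            # Assumes prices look like "126.50"; other formats would need parsing.
            "price": float(price.strip()),
        })
    return records


# Sample values taken from the console output above.
records = build_records(
    ["POLISH &amp; VAX 500ML", "GLANSVAX 300ML"],
    ["TIP740", "HAG019"],
    ["126.50", "75.90"],
)
for record in records:
    print(record)
```

Keeping the records as dictionaries makes it straightforward to write them to CSV or JSON later.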