Question

I am new to Selenium and want to scrape the price and the offer end time from a Udemy course link. How should I go about it?

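The price and the offer end time are loaded dynamically on the website. I know how to extract simple static content from a page, but not dynamic content.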

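I tried using the Parsel library together with the Selenium library, but it returns an empty string. When I view the page source on my phone, the price does not appear in the source. However, when I use the Inspect Element option in Chrome or Firefox, the price is there inside a span tag. That means the price is loaded dynamically when the browser renders the page. How can I get it with Selenium?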

Here is an example of a Udemy course link:

https://www.udemy.com/course/data-science-deep-learning-in-python/

Answer 1

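Provided all the dependencies (selenium, beautifulsoup4, webdriver-manager) are installed in your environment, this code should work: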

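    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from bs4 import BeautifulSoup
    from webdriver_manager.chrome import ChromeDriverManager

    # Selenium 4 expects the driver path to be wrapped in a Service object
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    driver.get("https://www.udemy.com/course/appium-selenium-for-mobile-automation-testing/")

    # Grab the HTML after the browser has loaded the page, then close the browser
    content = driver.page_source
    driver.quit()

    soup = BeautifulSoup(content, 'html.parser')

    # These class names appear to be auto-generated and may change over time
    price = soup.find('div', {'class':'price-text--price-part--Tu6MH udlite-clp-discount-price udlite-heading-xl'})
    if price is not None:
        # price is a Tag, so take its text before stripping and replacing
        price_text = price.text.strip().replace('Current price', '').strip()
        print('Price: ' + price_text)

        offer = soup.find('span', {'data-purpose':'safely-set-inner-html:discount-expiration:expiration-text'})
        if offer is not None:
            print('Offer end time: ' + offer.text.strip())
    else:
        print('This is a free course')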
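
If the price element has not been rendered yet at the moment `page_source` is read, the lookup can still come back empty. One possible way around that is an explicit wait, which polls until the element is visible. The following is only a sketch under assumptions that are not part of the answer above: the helper names `clean_price` and `fetch_price_text` are hypothetical, the CSS selector is a guess based on the class name used above, and `webdriver.Chrome()` with no arguments assumes a recent Selenium 4 release that can locate the driver on its own.

```python
def clean_price(raw_text):
    """Drop the 'Current price' label and surrounding whitespace."""
    return raw_text.replace("Current price", "").strip()


def fetch_price_text(url, timeout=15):
    """Open the page in Chrome and wait until the price element is visible."""
    # Imported here so clean_price() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: a partial class match is used because the hashed suffix
    # (e.g. "--Tu6MH") is likely to change between site releases.
    price_selector = "div[class*='price-text--price-part']"

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Polls until the element is visible, or raises TimeoutException.
        element = WebDriverWait(driver, timeout).until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, price_selector))
        )
        return clean_price(element.text)
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


# Example call (opens a real browser window):
# print(fetch_price_text("https://www.udemy.com/course/data-science-deep-learning-in-python/"))
```

For a free course the selector will presumably never match, so the wait would raise `TimeoutException`; catching it is one way to keep the "free course" branch.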

How to scrape data from a dynamic website using Selenium
