Question

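I have read all of these SO posts and the Selenium documentation, and I tried "expected_conditions", but nothing worked...

Here is what I am trying to do: I am building a scraper and decided to test it on an Amazon product detail page. That page has a div tag with the ID books-entity-teaser, which is populated by JS code when the tag becomes visible on the page...

However, when I execute my code, the tag is completely empty.

Can someone point out what I am missing?

I tried waiting for the code to load and then getting the page source: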

WebDriverWait(self.browser, 10).until(expected_conditions.invisibility_of_element((By.ID, 'books-entity-teaser')))
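
For context, `invisibility_of_element` waits until the element is hidden or absent, so it does not wait for the div to be filled in. A minimal sketch of a custom wait condition follows, assuming the JS writes into the div's `innerHTML` (this is not verified against the live page). The names `teaser_has_content` and `wait_for_teaser` are hypothetical, and `"id"` is used as the string value of Selenium's `By.ID`.

```python
TEASER_ID = "books-entity-teaser"  # ID taken from the question


def teaser_has_content(driver, element_id=TEASER_ID):
    """Return the element once its innerHTML is non-empty, else False.

    Hypothetical helper; assumes the page's JS fills the div's innerHTML.
    """
    # find_elements returns an empty list instead of raising when nothing matches.
    for element in driver.find_elements("id", element_id):
        html = element.get_attribute("innerHTML") or ""
        if html.strip():
            return element
    return False


def wait_for_teaser(driver, timeout=10):
    """Block until the teaser div has content, or raise TimeoutException."""
    # Imported lazily so the condition above can be exercised without Selenium.
    from selenium.webdriver.support.ui import WebDriverWait

    # until() accepts any callable that takes the driver and returns a truthy value.
    return WebDriverWait(driver, timeout).until(teaser_has_content)
```

With a real driver, `wait_for_teaser(self.browser)` could stand in for the wait call. Whether the div is only filled after it has been scrolled into view is something to check on the page itself.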

Here is my Python code:

def scrape(self, url: str = 'https://www.amazon.com/dp/1408865270'):
        self.browser.get(url)

        page_state = self.browser.execute_script('return document.readyState;')

        scroll_pause_time = 0.5

        # Get scroll height
        last_height = self.browser.execute_script('return document.body.scrollHeight')

        while True:
            # Scroll down to bottom
            self.browser.execute_script('window.scrollTo(0, document.body.scrollHeight);var lenOfPage=document.body.scrollHeight;return lenOfPage;')

            # Wait to load page
            time.sleep(scroll_pause_time)

            # Calculate new scroll height and compare with last scroll height
            new_height = self.browser.execute_script('return document.body.scrollHeight')
            if new_height == last_height:
                break
            last_height = new_height

        page_source = self.browser.execute_script('return document.body.innerHTML')
        WebDriverWait(self.browser, 10).until(expected_conditions.presence_of_element_located((By.ID, 'books-entity-teaser')))

        return page_source
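
One detail worth noting in the method above: `page_source` is read before the `WebDriverWait` line runs, so the returned HTML is a snapshot taken before the wait. Below is a hedged sketch of the same flow with the capture moved after the wait. `scroll_to_bottom`, `capture_after_wait`, `max_rounds`, and the `wait_for` callback are hypothetical names, and the driver is only assumed to expose `get` and `execute_script` as used in the question.

```python
import time


def scroll_to_bottom(driver, pause=0.5, max_rounds=20):
    """Scroll until the page height stops growing; return the number of scrolls.

    max_rounds is a hypothetical safety cap so an endless feed cannot loop forever.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give lazy-loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break
        last_height = new_height
    return rounds


def capture_after_wait(driver, url, wait_for, pause=0.5):
    """Load url, scroll, run wait_for(driver), and only then read the HTML."""
    driver.get(url)
    scroll_to_bottom(driver, pause=pause)
    wait_for(driver)  # e.g. a function that wraps WebDriverWait(...).until(...)
    # Read the DOM last so the snapshot includes whatever the wait was for.
    return driver.execute_script("return document.body.innerHTML")
```

The wait is passed in as a callback so the ordering (load, scroll, wait, capture) stays fixed while the condition itself can be swapped.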

Expected result:

Actual result:

python selenium scraping rendering javascript

0 Answers: