Question

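I am using Scrapy to collect all the links and Selenium to scrape each page. Selenium scrapes most of the pages, but it skips a few because those pages take a while to load.

I tried timeout(), but it did not seem to help. Then I tried execute_script:

driver.execute_script("return document.readyState == 'complete';")

That did not seem to work either, so I tried expected_conditions:

WebDriverWait.until(expected_conditions.execute_script("return document.readyState == 'complete';"))

But that does not seem to work either.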

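I am using the Firefox browser, and PhantomJS for headless runs. I also tried the Chrome driver, installed with brew cask install chromedriver, but I got this error:

raise WebDriverException("Can not connect to the Service %s" % self.path) selenium.common.exceptions.WebDriverException: Message: Can not connect to the Service chromedriver

So I went back to PhantomJS.

Thanks!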

Answer 1

Make use of the sleep function. It delays your code while the web page is still loading.
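
One way to apply the sleep idea without guessing a single fixed delay is to sleep in short steps and re-check document.readyState (the same check used in the question) between steps. The following is only a sketch under assumptions: the helper name wait_for_ready_state and its timeout and poll_interval parameters are made up for illustration, and driver is assumed to be any object with an execute_script method, such as a Selenium WebDriver.

```python
import time


def wait_for_ready_state(driver, timeout=30.0, poll_interval=0.5):
    """Sleep in short steps until the page reports readyState 'complete'.

    Hypothetical helper: `driver` only needs an execute_script() method.
    Returns True if the page became ready before `timeout` seconds passed,
    otherwise False.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Ask the browser for the current loading state of the document.
        if driver.execute_script("return document.readyState") == "complete":
            return True
        # Give up once the time budget is used up.
        if time.monotonic() >= deadline:
            return False
        # Pause before checking again, as the answer suggests.
        time.sleep(poll_interval)
```

Calling wait_for_ready_state(driver) right after driver.get(url) and before reading the page source is the intended use. Note that readyState only reports that the document itself has loaded; content that a page fetches later with JavaScript may still need a longer wait.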

Answer 2

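I have had this problem before. I used a while loop with a try/except inside it. The loop keeps retrying the work you want done. If the page has not loaded yet, execution falls into the except block, which simply passes. Once the try block runs successfully, a break at the end of the try block exits the loop. This has worked for me 100% of the time.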
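
The loop described above might look roughly like this. It is a sketch, not the answerer's actual code: the function name retry_until_success is hypothetical, and the max_attempts and delay parameters are additions so that a page that never loads cannot keep the loop running forever.

```python
import time


def retry_until_success(action, max_attempts=20, delay=1.0):
    """Call `action` in a loop until it stops raising, then return its result.

    Hypothetical helper: `max_attempts` and `delay` are assumptions added
    here; the original description retries without any limit.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            result = action()
            break  # success: leave the loop
        except Exception:
            # The page was probably not ready yet. Re-raise after the last
            # allowed attempt, otherwise wait a moment and try again.
            if attempts >= max_attempts:
                raise
            time.sleep(delay)
    return result
```

With Selenium, action would be a small function that looks up the element you need and returns its content, for example a lambda wrapping a find_element call.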

Answer 3

raise WebDriverException("Can not connect to the Service %s" % self.path) selenium.common.exceptions.WebDriverException: Message: Can not connect to the Service chromedriver

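This error occurs because your program cannot connect to the service through the given chromedriver.exe, which may be caused by a version mismatch or by the executable not being available.

You can resolve it with the following steps:

Check the version of the Chrome browser on your system; you can find it under Chrome Settings > About Chrome. Then download the matching chromedriver here: https://chromedriver.chromium.org/downloads
You can store it anywhere, but it is best to keep it in the same directory as your code. Unzip it and copy it into that directory, and chromedriver is ready to use.
Uncomment driver = webdriver.Chrome(), or use driver = webdriver.Chrome(executable_path=r'your path here') if it is not in the same directory as your program.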
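
The two checks in these steps (is the executable where the program expects it, and do the versions match) can be sketched as small helpers. Both function names below are hypothetical, and comparing only the major version number is an assumption about what counts as a match.

```python
import shutil
from pathlib import Path


def find_chromedriver(code_dir, name="chromedriver"):
    """Return the path to chromedriver next to the code, else on PATH, else None.

    Hypothetical helper: checks `code_dir` first (with and without .exe),
    then falls back to whatever shutil.which() finds on PATH.
    """
    for candidate in (Path(code_dir) / name, Path(code_dir) / (name + ".exe")):
        if candidate.is_file():
            return str(candidate)
    return shutil.which(name)


def same_major_version(browser_version, driver_version):
    """Compare only the leading number of two dotted version strings.

    Assumption: a Chrome and a chromedriver build are treated as matching
    when their major versions agree.
    """
    return browser_version.split(".")[0] == driver_version.split(".")[0]
```

The path returned by find_chromedriver can then be passed as executable_path, as in the last step. In newer Selenium 4 releases the path is instead wrapped in a Service object from selenium.webdriver.chrome.service and passed as webdriver.Chrome(service=...), so which form applies depends on the installed Selenium version.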

How to make Selenium scrape a page after it has loaded

3 answers: