Scraping web page content loaded by AJAX

Time: 2019-04-09 10:39:28

Tags: ruby-on-rails-4 selenium-webdriver web-scraping phantomjs watir

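I have a piece of code that scrapes the content of a web page. The content on the page is loaded via AJAX. I scrape the data in a loop, and each run ends with one of the following errors: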

1. Address already in use - bind(2) for 127.0.0.1:35216  
2. could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)  
3. Net::ReadTimeout 

Code:

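require 'uri'
require 'nokogiri'
require 'watir'

client = Selenium::WebDriver::Remote::Http::Default.new
browser = Watir::Browser.new :phantomjs, :http_client => client
browser.window.maximize
browser.goto "some URL"
final_url = URI.parse(browser.url)
# Sleep for 35 seconds, expecting the data to be rendered by AJAX
sleep(35)
# pagecheck was not defined in the original snippet; it is assumed here to be
# the rendered page parsed with Nokogiri
pagecheck = Nokogiri::HTML(browser.html)
unless pagecheck.css('li.some-class').empty?
  sleep(25)
end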
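
For comparison, here is a minimal sketch of two ideas that might be worth trying. The first is polling for the AJAX-rendered element instead of sleeping for a fixed time. The second is always closing the browser session at the end of each loop iteration, so that sessions do not pile up if an iteration fails. Both `wait_until` and `with_session` are hypothetical helpers written for this illustration; they are not part of Watir or Selenium, and whether they help with the errors listed above is an assumption, not something confirmed here.

```ruby
require 'timeout'

# Hypothetical helper (not a Watir/Selenium API): call the block repeatedly
# until it returns a truthy value, or raise Timeout::Error once `timeout`
# seconds have elapsed.
def wait_until(timeout: 60, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise Timeout::Error, "condition not met within #{timeout} seconds"
    end
    sleep(interval)
  end
end

# Hypothetical wrapper: yield the session to the block and always call
# `close` on it afterwards, even if the block raises.
def with_session(session)
  yield session
ensure
  session.close
end

# Usage sketch with the objects from the question (assumes Watir and
# Nokogiri are loaded; 'li.some-class' is the placeholder selector above):
#
#   with_session(Watir::Browser.new(:phantomjs)) do |browser|
#     browser.goto "some URL"
#     wait_until(timeout: 60) do
#       !Nokogiri::HTML(browser.html).css('li.some-class').empty?
#     end
#   end
```

The polling loop returns as soon as the element shows up, so a fast page does not pay for the full 35 or 60 seconds, and a slow page fails with an explicit timeout rather than silently continuing.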

0 answers:

No answers yet.