My rvest scraper is too slow

Time: 2017-01-20 04:50:51

Tags: r web-scraping rvest

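I have built an rvest scraper that scrapes a job-listing site. Unfortunately, it is very slow, even though it only has to loop through 100 pages. Is there a quick fix to make it faster? Here is the basic structure I'm using: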

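library(rvest)    # read_html(), html_nodes(), html_text(), html_attr(), %>%
library(stringi)  # stri_replace_all_fixed()

# Assumes links, selector_name, selector_rating, beginning, end and the
# result lists (address, employer, rating, title, dd) are already defined.
for (i in beginning:end) {
  # Download and parse one job page (one HTTP request per iteration)
  url <- read_html(paste0("https://www.jobsite.com", links[[1]][i]))
  address[[i]] <- html_nodes(x = url, css = selector_name) %>%
    html_text()
  # Keep only the third element of the existing employer entry
  employer[[i]] <- employer[[i]][3]
  # Read the raw rating, rescale it by 10/6 and round it
  rating[[i]] <- html_nodes(x = url, css = selector_rating) %>%
    html_attr("data-jobsite") %>% as.numeric()
  rating[[i]] <- rating[[i]] * (10 / 6)
  rating[[i]] <- round(rating[[i]])
  # Fall back to 1 when the page has no rating node
  rating[[i]] <- ifelse(length(rating[[i]]) == 0, 1, rating[[i]])
  # Extract the title and strip padding spaces and line breaks
  title[[i]] <- html_nodes(x = url, css = ".xsmall-10") %>%
    html_text()
  title[[i]] <- stri_replace_all_fixed(title[[i]], "            ", "")
  title[[i]] <- stri_replace_all_fixed(title[[i]], "        ", "")
  title[[i]] <- stri_replace_all_fixed(title[[i]], "\r\n", "")
  dd[[i]] <- html_nodes(x = url, css = ".item-price") %>%
    html_text()
}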
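
Before changing the loop, it may help to check where the time goes: the download in read_html() or the extraction afterwards. Below is a minimal sketch of that measurement, not a definitive fix. The helper scrape_page(), the ".rating" selector and the inline demo_html page are hypothetical stand-ins (the real selector_rating and page markup are not shown above), and trimws() is a simplification of the three stri_replace_all_fixed() calls that only strips leading and trailing whitespace.

```r
library(rvest)  # read_html(), html_nodes(), html_text(), html_attr(), %>%

# Hypothetical helper: pull every field out of one already-parsed page.
# ".rating" is a placeholder for the real selector_rating.
scrape_page <- function(page, selector_rating = ".rating") {
  rating <- page %>%
    html_nodes(css = selector_rating) %>%
    html_attr("data-jobsite") %>%
    as.numeric()
  # Rescale as in the loop; fall back to 1 when no rating node exists.
  rating <- if (length(rating) == 0) 1 else round(rating[1] * (10 / 6))
  title <- page %>%
    html_nodes(css = ".xsmall-10") %>%
    html_text() %>%
    trimws()  # simplification: trims only leading/trailing whitespace
  price <- page %>%
    html_nodes(css = ".item-price") %>%
    html_text()
  list(rating = rating, title = title, price = price)
}

# Inline stand-in for a downloaded page, so the sketch runs offline.
demo_html <- '<html><body>
  <div class="xsmall-10">   Data Analyst   </div>
  <span class="rating" data-jobsite="3"></span>
  <span class="item-price">50000</span>
</body></html>'

# Time the read/parse step and the extraction step separately.
t_fetch <- system.time(page <- read_html(demo_html))
t_parse <- system.time(result <- scrape_page(page))
cat("read_html:", t_fetch[["elapsed"]], "s; extraction:",
    t_parse[["elapsed"]], "s\n")
```

With a real URL in place of demo_html, a read_html() time that dominates would suggest the loop is limited by the network rather than by the selectors, in which case tidying the extraction code alone is unlikely to help much.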

0 answers:

No answers yet