I am trying to scrape data from a website using RSelenium. The script below runs perfectly in RStudio:
library(RSelenium)
library(rvest)
library(xlsx)
library(XML)
library(RODBC)
library(taskscheduleR)
library(DBI)
con <- dbConnect(odbc::odbc(), .connection_string = "Driver={ODBC Driver 11 for SQL Server};server=HG-SOS-MI;database=Data_Testing;trusted_connection=yes")
driver<- rsDriver()
browser <- driver[["client"]]
browser$navigate("www.goal.com/en-us")
browser$maxWindowSize()
But when I schedule it with the built-in taskscheduleR, I get the following in the error log:
Loading required package: methods
Warning message:
package 'DBI' was built under R version 3.4.4
Loading required package: xml2
Loading required package: rJava
Loading required package: xlsxjars
Warning messages:
1: package 'xlsx' was built under R version 3.4.3
2: package 'rJava' was built under R version 3.4.3
Attaching package: 'XML'
The following object is masked from 'package:rvest':
xml
Warning message:
package 'taskscheduleR' was built under R version 3.4.3
checking Selenium Server versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking chromedriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking geckodriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking phantomjs versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
Error in subprocess::spawn_process(tfile, ...) :
could not create process: Access is denied
Calls: rsDriver ... spawn_tofile -> windows_spawn_tofile -> <Anonymous> -> .Call
Execution halted
R version 3.5.0 (2018-04-23)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] subprocess_0.8.2 odbc_1.1.6 shiny_1.1.0 taskscheduleR_1.1 RODBC_1.3-15
[6] stringr_1.3.1 rlist_0.4.6.1 plyr_1.8.4 rvest_0.3.2 xml2_1.2.0
[11] RSelenium_1.7.1 XML_4.0-0 RPostgreSQL_0.6-2 DBI_1.0.0
loaded via a namespace (and not attached):
[1] promises_1.0.1 bitops_1.0-6 bit_1.1-14 pkgconfig_2.0.1 blob_1.1.1
[6] compiler_3.5.0 xtable_1.8-2 wdman_0.2.2 Rcpp_0.12.17 httr_1.3.1
[11] tools_3.5.0 openssl_1.0.1 R6_2.2.2 semver_0.2.0 assertthat_0.2.0
[16] curl_3.2 digest_0.6.15 mime_0.5 miniUI_0.1.1.1 stringi_1.1.7
[21] caTools_1.17.1 htmltools_0.3.6 hms_0.4.2 bit64_0.9-7 data.table_1.11.4
[26] httpuv_1.4.4.1 binman_0.1.0 rlang_0.2.1 magrittr_1.5 rappdirs_0.3.1
[31] yaml_2.1.19 later_0.7.3 jsonlite_1.5
> rD <- rsDriver()
checking Selenium Server versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking chromedriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking geckodriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking phantomjs versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
[1] "Connecting to remote server"
Selenium message:unknown error: DevToolsActivePort file doesn't exist
Answer 0 (score: 3)
I ran into this problem too and solved it using the Windows Task Scheduler.
1. First, create the schedule with the taskscheduler_create function.
# rscript expects the full path to the .R file itself, not only its folder.
Jobplanet <- "Full path to your R script"
taskscheduler_create(taskname = "R03_Jobpnanet_Review_Crawling", rscript = Jobplanet,
schedule = "MONTHLY", starttime = "17:20", startdate = format(Sys.Date(), "%Y/%m/%d"),
days = 31)
3. Check "Run whether user is logged on or not", then change the "Configure for" option to "Windows 7, Windows Server 2008 R2".
Give it a try. It worked well for me.
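As a minimal sketch of step 1, the scheduling call can be wrapped so that the script path is checked before the task is created. The helper names `resolve_script` and `schedule_crawler` and the example path are hypothetical and not part of taskscheduleR; only `taskscheduler_create` and its arguments come from the answer above.

```r
# Hypothetical helper: make sure the path points to an existing file
# (not a directory) and return it in normalized, absolute form.
resolve_script <- function(path) {
  if (!file.exists(path) || dir.exists(path)) {
    stop("rscript must point to an existing .R file: ", path)
  }
  normalizePath(path, mustWork = TRUE)
}

# Hypothetical wrapper around the taskscheduler_create call from step 1.
# Nothing is scheduled until this function is actually called.
schedule_crawler <- function(script,
                             taskname = "R03_Jobpnanet_Review_Crawling") {
  taskscheduleR::taskscheduler_create(
    taskname  = taskname,
    rscript   = resolve_script(script),
    schedule  = "MONTHLY",
    starttime = "17:20",
    startdate = format(Sys.Date(), "%Y/%m/%d"),
    days      = 31
  )
}

# Usage (Windows only; the path below is a placeholder):
# schedule_crawler("C:/path/to/your_script.R")
```

The task still has to be adjusted in the Windows Task Scheduler as described in step 3. This sketch only guards against passing a folder or a mistyped path as `rscript`.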