I am trying to scrape data from a dynamic web page with the following code:
> URL<- "http://www.cbooo.cn/realtime"
> library(bitops)
> library(RCurl)
> library(XML)
> library(RSelenium)
> library(magrittr)
> checkForServer()
Warning message:
checkForServer is deprecated.
Users in future can find the function in file.path(find.package("RSelenium"), "example/serverUtils").
The sourcing/starting of a Selenium Server is a users responsiblity.
Options include manually starting a server see vignette("RSelenium-basics", package = "RSelenium")
and running a docker container see vignette("RSelenium-docker", package = "RSelenium")
> startServer()
$stop
function ()
{
tools::pskill(selPID)
}
<environment: 0x10991af0>
$getPID
function ()
{
return(selPID)
}
<environment: 0x10991af0>
Warning message:
startServer is deprecated.
Users in future can find the function in file.path(find.package("RSelenium"), "example/serverUtils").
The sourcing/starting of a Selenium Server is a users responsiblity.
Options include manually starting a server see vignette("RSelenium-basics", package = "RSelenium")
and running a docker container see vignette("RSelenium-docker", package = "RSelenium")
> remDrv <- remoteDriver()
> remDrv$browserName="Internet Explorer"
> remDrv$open()
[1] "Connecting to remote server"
Selenium message: The best matching driver provider org.openqa.selenium.ie.InternetExplorerDriver can't create a new driver instance for Capabilities [{nativeEvents=true, browserName=Internet Explorer, javascriptEnabled=true, version=, platform=ANY}]
Build info: version: '2.53.1', revision: 'a36b8b1', time: '2016-06-30 17:37:03'
System info: host: 'DESKTOP-J0D980N', ip: '10.36.17.76', os.name: 'Windows 10', os.arch: 'x86', os.version: '10.0', java.version: '1.8.0_77'
Driver info: driver.version: unknown
Error: Summary: UnknownError
Detail: An unknown server-side error occurred while processing the command.
class: org.openqa.selenium.WebDriverException
Further Details: run errorDetails method
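I cannot solve the following problems:
1. checkForServer and startServer are deprecated.
2. Connecting to the remote server always fails, and I don't know which arguments to set for this function or what else I should do.
I hope to get a reply soon. Thanks.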
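Answer 0 (score: 1)
The author of RSelenium provided the following solution (https://github.com/ropensci/RSelenium/issues/81):
Starting with Firefox 48, the gecko driver (Marionette) is required to run Firefox with Selenium.
If you have Firefox 48, you can run the gecko driver as follows:
See the guide at
https://developer.mozilla.org/en-US/docs/Mozilla/QA/Marionette/WebDriver
Download the relevant gecko driver from https://github.com/mozilla/geckodriver/releases
Add it to your PATH, or refer to its location when starting the binary (see below)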
# get beta selenium standalone
RSelenium::checkForServer(beta = TRUE)
# assume gecko driver is not in our path (assume windows and we downloaded to docs folder)
# if the driver is in your PATH the javaargs call is not needed
selServ <- RSelenium::startServer(javaargs = c("-Dwebdriver.gecko.driver=\"C:/Users/john/Documents/geckodriver.exe\""))
remDr <- remoteDriver(extraCapabilities = list(marionette = TRUE))
remDr$open()
....
....
remDr$close()
selServ$stop()
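Since the deprecation warning says that starting the Selenium Server is now the user's responsibility, the same steps can be sketched without checkForServer/startServer. This is only a rough sketch under assumptions: `selenium_java_args` is a hypothetical helper (not part of RSelenium), the jar file name is a placeholder for whatever standalone server you downloaded, and the gecko driver path is the one from the example above. It checks whether geckodriver is on the PATH and builds the arguments for launching the server with `java`:

```r
# Hypothetical helper (not an RSelenium function): builds the argument vector
# passed to `java` when starting a Selenium standalone server by hand.
selenium_java_args <- function(jar_path, gecko_path = NULL) {
  driver_arg <- character(0)
  if (!is.null(gecko_path)) {
    # Only needed when geckodriver is NOT on the PATH
    driver_arg <- paste0("-Dwebdriver.gecko.driver=", gecko_path)
  }
  c(driver_arg, "-jar", jar_path)
}

# Sys.which() returns "" when the executable cannot be found on the PATH
gecko_on_path <- nzchar(unname(Sys.which("geckodriver")))

args <- selenium_java_args(
  jar_path   = "selenium-server-standalone.jar",  # assumed file name
  gecko_path = if (gecko_on_path) NULL else "C:/Users/john/Documents/geckodriver.exe"
)
print(args)

# To launch the server and connect (needs Java, the jar and Firefox installed):
# system2("java", args, wait = FALSE)
# remDr <- RSelenium::remoteDriver(remoteServerAddr = "localhost", port = 4444L,
#                                  browserName = "firefox",
#                                  extraCapabilities = list(marionette = TRUE))
# remDr$open()
```

The launch and connect lines are left commented out because they depend on your local setup. Port 4444 is the default that remoteDriver() uses, so adjust it if your server listens elsewhere.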
Answer 1 (score: 0)
To produce a workable solution, I would use an older version of RSelenium and all of the code.
#!/bin/bash
json_lines=$(jq -r '.modules[] | select(.path == ["root"]) | .outputs | tojson' terraform.tfstate)
for json_line in $json_lines; do
ini_values=$(echo $json_line | jq -r '. | to_entries | map("\(.key)=\(.value.value|tostring)") | .[]')
if [[ -n $ini_values ]]; then
cat <<EOF >> terraform_outputs.ini
[terraform]
$ini_values
EOF
fi
done
This is not the best solution, but it is one that works.