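I am trying to download Twitter data using the twitteR package.
I keep getting the error message
"Error in function (type, msg, asError = TRUE) : couldn't connect to host"
I believe this is because I am doing this on my work computer and need to pass in the proxy server details.
To test this, I tried an example from one of the answers to a similar question, Proxy Setting for R.
If I enter: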
library("RCurl")
getURL("http://stackoverflow.com")
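then I get the same error message as when I try to use twitteR:
"Error in function (type, msg, asError = TRUE) : couldn't connect to host"
However, if I pass in the proxy server details, it works without a problem: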
library("RCurl")
opts <- list(
proxy = "123.456.7.89",
proxyusername = "tumbledown",
proxypassword = "mypassword",
proxyport = 8080
)
getURL("http://stackoverflow.com", .opts = opts)
However, I am having trouble passing the proxy server details to twitteR. I tried setting them in R's Rprofile.site file using:
http_proxy="http://tumbledown:mypassword@123.456.7.89:8080/"
But it doesn't seem to do anything to fix the problem. Where am I going wrong?
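One detail that may matter here: Rprofile.site is evaluated as R code, so a bare assignment like the line above creates an ordinary R variable rather than an environment variable. The following is a minimal sketch of that difference using base R's Sys.setenv() and Sys.getenv(); the name proxy_url is only illustrative and the proxy details are the placeholders from this question. Whether RCurl or twitteR would then get through an authenticating proxy is a separate matter that this sketch does not settle.

```r
# Sketch only: the proxy URL is the placeholder used in this question.
proxy_url <- "http://tumbledown:mypassword@123.456.7.89:8080/"

# A bare assignment creates an ordinary R variable, not an environment variable.
http_proxy <- proxy_url
Sys.getenv("http_proxy")   # not affected by the assignment above

# Sys.setenv() sets the environment variable for the current R process.
Sys.setenv(http_proxy = proxy_url)
Sys.getenv("http_proxy")   # now returns the value that was just set
```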
Edit 1: Below is the code I am trying to run. Looking at it now, I realise this may be more of an ROAuth problem:
library("twitteR")
library("ROAuth")
library("RCurl")
Credentials <- OAuthFactory$new(
consumerKey = "MY_CONSUMER_KEY",
consumerSecret = "MY_CONSUMER_SECRET",
requestURL = "https://api.twitter.com/oauth/request_token",
authURL = "https://api.twitter.com/oauth/authorize",
accessURL = "https://api.twitter.com/oauth/access_token")
# I have then tried both of the below handshake methods:
# 1
Credentials$handshake()
# 2
download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")
Credentials$handshake(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))
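Edit 2:
The following code seems to get me part of the way there. If I set these options, I can begin the handshake process with Twitter (intermittently; it still fails sometimes).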
options(RCurlOptions = list(
verbose = TRUE,
proxy ="http://123.456.7.89:8080",
proxyuserpwd="tumbledown:mypassword",
proxyauth="ntlm"))
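As a quick check that these defaults are actually registered, the stored options can be inspected directly. This is a minimal sketch, assuming (as I understand RCurl's documentation) that new curl handles take their defaults from getOption("RCurlOptions"); the proxy details are the same placeholders as above, and the getURL() call is left commented out because it needs the real proxy.

```r
# Sketch only: same placeholder proxy details as in the question.
options(RCurlOptions = list(
  verbose = TRUE,
  proxy = "http://123.456.7.89:8080",
  proxyuserpwd = "tumbledown:mypassword",
  proxyauth = "ntlm"))

# These are plain R options, so inspecting them needs no network access.
str(getOption("RCurlOptions"))

# With the defaults in place, a call without .opts should use the proxy.
# Not run here because it requires the real proxy:
# library("RCurl")
# getURL("http://stackoverflow.com")
```

If that plain getURL() call succeeds without .opts, the remaining failure is more likely to lie in the handshake step than in the proxy options themselves.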
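I am then asked to enter a PIN from Twitter after following a URL (which for some reason I have to type out laboriously, as it won't let me copy/paste it). I then seem to get most of the way through the handshake before it fails. Here is the verbose output (with some details removed/changed):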
* About to connect() to proxy 123.456.7.89 port 8080 (#0)
* Trying 123.456.7.89... * connected
* Connected to 123.456.7.89 (123.456.7.89) port 8080 (#0)
* Establish HTTP proxy tunnel to api.twitter.com:443
> CONNECT api.twitter.com:443 HTTP/1.1
Host: api.twitter.com:443
Proxy-Connection: Keep-Alive
< HTTP/1.1 407 Proxy Authentication Required ( Forefront TMG requires authorization to fulfill the request. Access to the Web Proxy filter is denied. )
< Via: 1.1 ORG-TMG1
< Proxy-Authenticate: Negotiate
< Proxy-Authenticate: Kerberos
< Proxy-Authenticate: NTLM
< Connection: close
< Proxy-Connection: close
< Pragma: no-cache
< Cache-Control: no-cache
< Content-Type: text/html
< Content-Length: 719
<
* Ignore 719 bytes of response-body
* Received HTTP code 407 from proxy after CONNECT
* About to connect() to proxy 123.456.7.89 port 8080 (#0)
* Trying 123.456.7.89... * connected
* Connected to 123.456.7.89 (123.456.7.89) port 8080 (#0)
* Establish HTTP proxy tunnel to api.twitter.com:443
* Proxy auth using NTLM with user 'ORG\tumbledown'
> CONNECT api.twitter.com:443 HTTP/1.1
Host: api.twitter.com:443
Proxy-Authorization: NTLM <LOTS OF RANDOM LETTERS>==
Proxy-Connection: Keep-Alive
< HTTP/1.1 407 Proxy Authentication Required ( Access is denied. )
< Via: 1.1 ORG-TMG1
< Proxy-Authenticate: NTLM <LOTS OF RANDOM LETTERS>==
< Connection: Keep-Alive
< Proxy-Connection: Keep-Alive
< Pragma: no-cache
< Cache-Control: no-cache
< Content-Type: text/html
< Content-Length: 0
<
* Establish HTTP proxy tunnel to api.twitter.com:443
* Proxy auth using NTLM with user 'ORG\tumbledown'
> CONNECT api.twitter.com:443 HTTP/1.1
Host: api.twitter.com:443
Proxy-Authorization: NTLM <LOTS OF RANDOM LETTERS>=
Proxy-Connection: Keep-Alive
< HTTP/1.1 200 Connection established
< Via: 1.1 ORG-TMG1
< Connection: Keep-Alive
< Proxy-Connection: Keep-Alive
<
* Proxy replied OK to CONNECT request
* successfully set certificate verify locations:
* CAfile: \\ORG-nas/tumbledown/R/win-library/2.15/RCurl/CurlSSL/cacert.pem
CApath: none
* SSL connection using RC4-SHA
* Server certificate:
* subject: C=US; ST=California; L=San Francisco; O=Twitter, Inc.; OU=Twitter Security; CN=api.twitter.com
* start date: 2013-04-08 00:00:00 GMT
* expire date: 2013-12-31 23:59:59 GMT
* subjectAltName: api.twitter.com matched
* issuer: C=US; O=VeriSign, Inc.; OU=VeriSign Trust Network; OU=Terms of use at https://www.verisign.com/rpa (c)09; CN=VeriSign Class 3 Secure Server CA - G2
* SSL certificate verify ok.
> POST /oauth/access_token HTTP/1.1
Host: api.twitter.com
Accept: */*
Content-Length: 297
Content-Type: application/x-www-form-urlencoded
< HTTP/1.1 200 OK
< cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
< content-length: 160
< content-type: text/html; charset=utf-8
< date: Tue, 23 Apr 2013 11:47:21 GMT
< etag: "<LOTS OF RANDOM LETTERS>"
< expires: Tue, 31 Mar 1981 05:00:00 GMT
< last-modified: Tue, 23 Apr 2013 11:47:21 GMT
< pragma: no-cache
< server: tfe
< set-cookie: _twitter_sess=<LOTS OF RANDOM LETTERS>--<LOTS OF RANDOM LETTERS>; domain=.twitter.com; path=/; HttpOnly
< set-cookie: guest_id=<LOTS OF RANDOM LETTERS>; Domain=.twitter.com; Path=/; Expires=Thu, 23-Apr-2015 11:47:21 UTC
< status: 200 OK
< strict-transport-security: max-age=123456789
< vary: Accept-Encoding
< x-frame-options: SAMEORIGIN
< x-mid: <LOTS OF RANDOM LETTERS>
< x-runtime: 0.04538
< x-transaction: <LOTS OF RANDOM LETTERS>
< x-xss-protection: 1; mode=block
<
* Connection #0 to host 123.456.7.89 left intact
Error: Proxy Authentication Required ( Forefront TMG requires authorization to fulfill the request. Access to the Web Proxy filter is denied. )