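I'm using R and want to fetch JSON from a URL. I have about 5,000 user agents to send to this API (http://www.useragentstring.com/pages/api.php).
I use this code to build the URL with the user agent appended: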
url_1<-paste(" \"http://www.useragentstring.com/?uas=",uaelenchi[11,1],"&getJSON=all\"",sep = '');
json_data2<-fromJSON(readLines(cat(url_1)))
But I get this error:
Error in readLines(cat(url_1)) : 'con' is not a connection
Any suggestions would be much appreciated. Thanks!
Answer 0 (score: 1)
I use rjson::fromJSON(file = paste(your_url)). If you post a reproducible example, I can check whether it works in your case.
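For context on the error itself: cat() only prints its arguments and returns NULL, so readLines() never receives a URL or a connection. The pasted string in the question also contains literal quote characters. Below is a minimal sketch of building a plain, encoded URL instead. The helper name build_uas_url is hypothetical (it is not from the original post), and the commented-out calls at the end need network access.

```r
# Hypothetical helper (not from the original post): builds the request URL
# for a single user-agent string.
build_uas_url <- function(ua) {
  paste0("http://www.useragentstring.com/?uas=",
         utils::URLencode(ua, reserved = TRUE),  # escape spaces, ';', '/', etc.
         "&getJSON=all")
}

ua <- "Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0"
url_1 <- build_uas_url(ua)

# cat() prints and returns NULL, which is why readLines(cat(url_1))
# reports that 'con' is not a connection.
cat_result <- cat(url_1, "\n")
is.null(cat_result)

# With a plain URL string, either call below should work (network required):
# json_data2 <- rjson::fromJSON(file = url_1)
# json_data2 <- jsonlite::fromJSON(url_1)
```

Encoding matters here because user-agent strings contain spaces, semicolons, and slashes that are not valid in a raw query string.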
Answer 1 (score: 0)
library(httr)
library(jsonlite)
library(purrr)
uas <- c("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0",
"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0",
"Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.6 Safari/537.11",
"Mozilla/5.0 (X11; OpenBSD amd64; rv:28.0) Gecko/20100101 Firefox/28.0",
"Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.6 Safari/537.11",
"Mozilla/5.0 (X11; OpenBSD amd64; rv:28.0) Gecko/20100101 Firefox/28.0",
"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:14.0) Gecko/20120405 Firefox/14.0a1",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1944.0 Safari/537.36",
"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:14.0) Gecko/20120405 Firefox/14.0a1",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1944.0 Safari/537.36")
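# Query the API for one user-agent string and return a one-row data frame.
parse_uas <- function(uas) {
  res <- GET("http://www.useragentstring.com/", query=list(uas=uas, getJSON="all"))
  stop_for_status(res)  # abort on HTTP errors
  content(res, as="text", encoding="UTF-8") %>%
    fromJSON(flatten=TRUE) %>%  # the piped text is the only positional argument
    as.data.frame(stringsAsFactors=FALSE)
}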
map_df(uas, parse_uas)
To save on API calls, you should add a caching layer to the parse_uas() function, which is easy to do with the memoise package:
library(memoise)
# Uncached worker: one HTTP request per call.
.parse_uas <- function(uas) {
  res <- GET("http://www.useragentstring.com/", query=list(uas=uas, getJSON="all"))
  stop_for_status(res)  # abort on HTTP errors
  content(res, as="text", encoding="UTF-8") %>%
    fromJSON(flatten=TRUE) %>%  # the piped text is the only positional argument
    as.data.frame(stringsAsFactors=FALSE)
}
parse_uas <- memoise(.parse_uas)
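To illustrate what the caching layer buys you, here is a rough base-R sketch of the same idea. It does not use memoise and does not call the API: slow_lookup is a hypothetical stand-in for the HTTP request, and simple_memoise is an illustrative helper, not how memoise is implemented. It assumes every key is a non-empty string.

```r
# Hypothetical stand-in for the real API call; counts how often it runs.
call_count <- 0
slow_lookup <- function(ua) {
  call_count <<- call_count + 1
  data.frame(agent_string = ua, stringsAsFactors = FALSE)
}

# Minimal memoiser: results live in an environment keyed by the input string.
simple_memoise <- function(f) {
  cache <- new.env(hash = TRUE, parent = emptyenv())
  function(key) {
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(key), envir = cache)  # first time: do the real work
    }
    get(key, envir = cache, inherits = FALSE)  # afterwards: reuse the result
  }
}

cached_lookup <- simple_memoise(slow_lookup)

# Five inputs, but only two distinct user agents.
uas_demo <- c("UA one", "UA two", "UA one", "UA two", "UA one")
results <- do.call(rbind, lapply(uas_demo, cached_lookup))

nrow(results)  # one row per input
call_count     # number of times the underlying function actually ran
```

With a list of about 5,000 user agents that contains many repeats, as in the uas vector above, the number of real requests drops to the number of distinct strings.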
Also, if you're on Linux, you could try this package, which does all the processing locally (it doesn't compile well on macOS and, IIRC, doesn't compile at all on Windows).