I want to use R to scrape news from this URL (http://www.foxnews.com/search-results/search?q="AlphaGo"&ss=fn&start=0). Here is my code:
library(stringr)  # provides str_c()
url <- "http://api.foxnews.com/v1/content/search?q=%22AlphaGo%22&fields=date,description,title,url,image,type,taxonomy&section.path=fnc&start=0&callback=angular.callbacks._0&cb=2017719162"
html <- str_c(readLines(url, encoding = "UTF-8"), collapse = "")  # join the response lines into one string
content_fox <- RJSONIO::fromJSON(html)
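However, the JSON cannot be read, and I get this error:
Error in file(con, "r") : cannot open the connection
I noticed that the response starts with angular.callbacks._0 rather than with the JSON itself, and I think that may be the problem.
Any idea how to fix this?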
Answer 0: (score: 0)
Following the answer in "Parse JSONP with R", I added two new lines to my code and it works:
library(stringr)  # provides str_c()
url <- "http://api.foxnews.com/v1/content/search?q=%22AlphaGo%22&fields=date,description,title,url,image,type,taxonomy&section.path=fnc&start=0&callback=angular.callbacks._0&cb=2017719162"
html <- str_c(readLines(url, encoding = "UTF-8"), collapse = "")
html <- sub('[^\\{]*', '', html)  # remove the callback name and opening parenthesis (everything before the first "{")
html <- sub('\\)$', '', html)     # remove the closing parenthesis at the end of the string
content_fox <- RJSONIO::fromJSON(html)
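To see what the two sub() calls do without hitting the network, here is a minimal sketch that applies the same idea to a made-up JSONP string. The helper name strip_jsonp and the sample payload are assumptions for illustration only; they are not part of the Fox News API or of RJSONIO. The sketch also tolerates a trailing semicolon, which some JSONP responses include.

```r
# Hypothetical helper (not from any package): strip a JSONP wrapper such as
# callback({...}) or callback({...}); and return the bare JSON text.
strip_jsonp <- function(x) {
  # Drop everything before the first "{" or "[" (callback name and "(").
  x <- sub("^[^{\\[]*", "", x, perl = TRUE)
  # Drop the closing ")" plus an optional ";" and trailing whitespace.
  x <- sub("\\)\\s*;?\\s*$", "", x, perl = TRUE)
  x
}

# Made-up sample response, only shaped like a JSONP payload.
sample_jsonp <- 'angular.callbacks._0({"response":{"numFound":2,"docs":[{"title":"A"},{"title":"B"}]}})'

json_text <- strip_jsonp(sample_jsonp)
print(json_text)

# Parse the cleaned text only if RJSONIO is installed.
if (requireNamespace("RJSONIO", quietly = TRUE)) {
  parsed <- RJSONIO::fromJSON(json_text)
  print(parsed$response$numFound)
}
```

With the real response, the same helper could be applied to html before the RJSONIO::fromJSON() call, in place of the two separate sub() lines.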