Retrieving the Google search result count in R

Time: 2015-05-12 18:50:16

Tags: r

Searching Google for the keywords "health hospital" returns about 1,150,000,000 results. How can I get this count programmatically in R?

I have seen this link, where they try to solve it in Java. How can it be done in R? A sample code snippet would be appreciated.

Thanks.

1 answer:

Answer 0: (score: 5)

This modifies just one line of the code found in the BioBucket blog post: Get No. of Google Search Hits with R and XML

GoogleHits <- function(input)
{
    library(XML)    # htmlParse, xpathSApply, xmlValue
    library(RCurl)  # getURL
    url <- paste("https://www.google.com/search?q=",
                 input, sep = "") # modified line
    # Path to a CA bundle inside the RCurl package directory, passed to getURL for HTTPS
    CAINFO <- paste(system.file(package = "RCurl"), "/CurlSSL/ca-bundle.crt", sep = "")
    script <- getURL(url, followlocation = TRUE, cainfo = CAINFO)
    doc <- htmlParse(script)
    # Text of the div with id "resultStats", which holds the hit count
    res <- xpathSApply(doc, '//*/div[@id="resultStats"]', xmlValue)
    cat(paste("\nYour Search URL:\n", url, "\n", sep = ""))
    cat("\nNo. of Hits:\n") # get rid of cat text if not wanted
    # Strip every non-digit; as.integer() yields NA above 2147483647
    return(as.integer(gsub("[^0-9]", "", res)))
}

# Example:
no.hits <- GoogleHits("health%20hospital")
#Your Search URL:
#https://www.google.com/search?q=health%20hospital
#
#No. of Hits:
no.hits
#[1] 1170000000
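
The last line of the function strips every non-digit and converts with as.integer(). That works for the count shown here, but it returns NA for counts above 2147483647, and if the stats text also carried a timing such as "(0.50 seconds)" those digits would be glued onto the count. A minimal sketch of a more defensive parse is below. The helper name parse_hit_count and the sample strings are my own assumptions (not from the blog post), and it assumes an English-style text with comma-grouped digits.

```r
# Hypothetical helper (not part of the original answer): pull the first
# comma-grouped number out of a result-stats string.
# Assumes text shaped like "About 1,170,000,000 results (0.50 seconds)".
parse_hit_count <- function(stats_text) {
  # Nothing was scraped (e.g. the XPath matched no node)
  if (length(stats_text) == 0) return(NA_real_)
  stats_text <- as.character(stats_text)[1]
  # First run of digits, optionally grouped with commas
  m <- regmatches(stats_text, regexpr("[0-9][0-9,]*", stats_text))
  if (length(m) == 0) return(NA_real_)
  # Return a double so counts above the integer maximum do not become NA
  as.numeric(gsub(",", "", m))
}

parse_hit_count("About 1,170,000,000 results (0.50 seconds)")
parse_hit_count("About 25,270,000,000 results")
```

Inside GoogleHits this could replace the as.integer(gsub(...)) call, e.g. return(parse_hit_count(res)), at the cost of getting a double instead of an integer.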

I changed the url assignment from

url <- paste("https://www.google.com/search?q=\"", input, "\"", sep = "")

to

url <- paste("https://www.google.com/search?q=", input, sep = "")
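
The difference between the two lines is only whether the query is wrapped in double quotes, which asks for the exact phrase instead of the individual words. The example also has to pass a pre-encoded string ("health%20hospital"). As a rough sketch, both things could be handled by a small helper built on utils::URLencode. The name build_search_url and its exact_phrase argument are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper (not part of the original answer): build the search
# URL from plain text, optionally quoting it for an exact-phrase search.
build_search_url <- function(query, exact_phrase = FALSE) {
  # Wrapping in double quotes reproduces the original blog-post URL form
  if (exact_phrase) query <- paste0("\"", query, "\"")
  # reserved = TRUE also percent-encodes characters such as & and ?
  paste0("https://www.google.com/search?q=",
         utils::URLencode(query, reserved = TRUE))
}

build_search_url("health hospital")
# "https://www.google.com/search?q=health%20hospital"
build_search_url("health hospital", exact_phrase = TRUE)
# "https://www.google.com/search?q=%22health%20hospital%22"
```

With something like this, GoogleHits could take "health hospital" directly and build url from it, rather than requiring the caller to encode the space by hand.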