I'm having trouble reading the list of universities from the following link:
http://www.usnews.com/education/best-global-universities/rankings
I tried
readHTMLTable("http://www.usnews.com/education/best-global-universities/rankings")
...but it doesn't work.
I just need to read the university rankings in the middle of the page into R.
Answer 0: (score: 2)
As a starter:
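library(XML)
# Parse the page, then collect the child strings of every <div class="sep"> block
doc <- htmlParse("http://www.usnews.com/education/best-global-universities/rankings")
res <- xpathApply(doc, "//div[@class='sep']", getChildrenStrings)
# Child 6 holds the name and location, child 2 the score; strip whitespace and non-numeric characters
data.frame(uni = gsub("\\s\\s+", " ", gsub("[\n\t\r]", "", sapply(res, "[", 6))),
           score = as.numeric(gsub("[^0-9.]", "", sapply(res, "[", 2))))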
#                                                                             uni score
# 1                     Harvard University United States Cambridge, Massachusetts 100.0
# 2  Massachusetts Institute of Technology United States Cambridge, Massachusetts  88.9
# 3         University of California--Berkeley United States Berkeley, California  88.0
# 4                        Stanford University United States Stanford, California  85.1
# 5                                    University of Oxford United Kingdom Oxford  83.6
# 6                              University of Cambridge United Kingdom Cambridge  83.3
# 7         California Institute of Technology United States Pasadena, California  80.3
# 8   University of California--Los Angeles United States Los Angeles, California  80.1
# 9                         University of Chicago United States Chicago, Illinois  77.4
# 10                         Columbia University United States New York, New York  77.3
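
The two cleanup steps inside the data.frame call can be tried without fetching the page. The following is only a sketch: `raw_name` and `raw_score` are made-up strings standing in for what getChildrenStrings might return (the real page text may differ), and the final `trimws()` call is an addition that is not part of the answer above.

```r
# Hypothetical raw strings, invented for illustration only
raw_name  <- "\n\t  Harvard University \n\t\t United States\r\n   Cambridge, Massachusetts"
raw_score <- "Global Score: 100.0"

# Same cleanup as in the answer: drop newlines, tabs and carriage returns,
# then collapse any run of two or more whitespace characters into one space
clean_name <- gsub("\\s\\s+", " ", gsub("[\n\t\r]", "", raw_name))
# Extra step (an assumption, not in the answer): remove leading/trailing spaces
clean_name <- trimws(clean_name)

# Keep only digits and the decimal point, then convert to a number
clean_score <- as.numeric(gsub("[^0-9.]", "", raw_score))

print(clean_name)
print(clean_score)
```

Because the inner gsub removes the line breaks that separated name, country and city, the three pieces end up in a single string, which is why the uni column above holds all of them together.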