R readHTMLTable fails to load external entity

Asked: 2015-06-14 21:56:26

Tags: xml r connection

When I run this line on my laptop,

table500 <- readHTMLTable('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')[[1]]

it fetches the data. When I run it on my desktop, I get the error

Error: failed to load external entity "http://en.wikipedia.org/wiki/List_of_S%26P_500_companies".

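So I'm guessing the problem is related to the network settings on my desktop, but I don't have the slightest idea what it could be. Any suggestions?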

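1 answer:

Answer 0: (score: 4)

In the link I mentioned in the comments, you can find solutions using the RCurl and httr packages. Here, I provide a solution using the rvest package.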

    library(rvest)

    # read_html() replaces the html() call used in older rvest releases
    kk <- read_html("http://en.wikipedia.org/wiki/List_of_S%26P_500_companies") %>%
      html_table(fill = TRUE) %>%
      .[[1]]  # table 1 only

head(kk)
  Ticker symbol            Security SEC filings            GICS Sector                GICS Sub Industry Address of Headquarters
1           MMM          3M Company     reports            Industrials         Industrial Conglomerates     St. Paul, Minnesota
2           ABT Abbott Laboratories     reports            Health Care Health Care Equipment & Services North Chicago, Illinois
3          ABBV              AbbVie     reports            Health Care                  Pharmaceuticals North Chicago, Illinois
4           ACN       Accenture plc     reports Information Technology   IT Consulting & Other Services         Dublin, Ireland
5           ACE         ACE Limited     reports             Financials    Property & Casualty Insurance     Zurich, Switzerland
6           ACT         Actavis plc     reports            Health Care                  Pharmaceuticals         Dublin, Ireland
  Date first added     CIK
1                    66740
2                     1800
3       2012-12-31 1551152
4       2011-07-06 1467373
5       2010-07-15  896159
6                   884629
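
For the httr route mentioned above, a minimal sketch might look like the following. It is not the code from the linked answer (which is not reproduced here); it only illustrates the general idea of downloading the page with httr and handing the text to readHTMLTable, so that the XML package does not have to fetch the URL itself. The helper names parse_tables and fetch_tables are hypothetical and not part of either package.

```r
library(httr)
library(XML)

# Hypothetical helper: parse every <table> in an HTML string
# into a list of data frames.
parse_tables <- function(html_text) {
  doc <- htmlParse(html_text, asText = TRUE)
  readHTMLTable(doc, stringsAsFactors = FALSE)
}

# Hypothetical helper: download the page with httr, then parse
# the response body locally instead of passing the URL to XML.
fetch_tables <- function(url) {
  resp <- GET(url)
  stop_for_status(resp)  # raise an R error on HTTP failures
  parse_tables(content(resp, as = "text", encoding = "UTF-8"))
}

# Usage (needs network access):
# table500 <- fetch_tables("http://en.wikipedia.org/wiki/List_of_S%26P_500_companies")[[1]]
```

If the desktop sits behind a proxy, httr picks up proxy settings passed to GET (for example via use_proxy()), which may be why this route succeeds where the direct readHTMLTable call does not; that is an assumption about the asker's network, not something the question confirms.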