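I have been trying to scrape financial data from Yahoo Finance with R, without success. My current code is below. The main problem seems to be that the table holding the financial data on Yahoo Finance is not marked up as a table in the HTML. How can I get around this?
I have also tried copying the XPath, with no luck.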
library(XML)
symbol <- "HD"
# Build the URL from the ticker instead of hard-coding "HD" in the path
url <- paste0("https://finance.yahoo.com/quote/", symbol, "/financials?p=", symbol)
webpage <- readLines(url)
html <- htmlTreeParse(webpage, useInternalNodes = TRUE, asText = TRUE)
# Select every <table> element on the page
tableNodes <- getNodeSet(html, "//table")
data <- readHTMLTable(tableNodes)
Answer 0: (score: 0)
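I have worked with Yahoo Finance before. You are making a small mistake: tableNodes can contain more than one table, so readHTMLTable has to be called on each node individually. The following reads every table on the analysts page: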
library(XML)
symbol <- "HD"
# Build the URL from the ticker instead of hard-coding "HD" in the path
url <- paste0("https://finance.yahoo.com/quote/", symbol, "/analysts?p=", symbol)
webpage <- readLines(url)
html <- htmlTreeParse(webpage, useInternalNodes = TRUE, asText = TRUE)
# tableNodes is a list with one node per <table> element on the page
tableNodes <- getNodeSet(html, "//table")
# Convert each table node to a data frame, one at a time
earningsEstimates <- readHTMLTable(tableNodes[[1]])
revenueEstimates <- readHTMLTable(tableNodes[[2]])
earningsHistory <- readHTMLTable(tableNodes[[3]])
earningPerShareTrend <- readHTMLTable(tableNodes[[4]])
earningPerShareRevision <- readHTMLTable(tableNodes[[5]])
growthEstimates <- readHTMLTable(tableNodes[[6]])
print(earningsEstimates) # printing one table
Output
  Earnings Estimate Current Qtr. (Oct 2019) Next Qtr. (Jan 2020) Current Year (2020) Next Year (2021)
1   No. of Analysts                      28                   28                  35               35
2     Avg. Estimate                    2.52                 2.17               10.13            10.96
3      Low Estimate                    2.47                 2.07               10.03             10.7
4     High Estimate                    2.58                 2.24               10.27             11.2
5      Year Ago EPS                    2.51                 2.25                9.89            10.13
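If you would rather not index each table by hand, the same idea can probably be written as a loop over the node list. The snippet below is only a minimal sketch: the inline HTML string `page` is a made-up stand-in for the downloaded page (it is not Yahoo's real markup), so it runs without a network connection. It assumes the XML package is installed.

```r
library(XML)

# Hypothetical inline HTML standing in for a downloaded page
# (an assumption for illustration, not Yahoo's actual markup).
page <- paste0(
  "<html><body>",
  "<table><tr><th>Metric</th><th>Value</th></tr>",
  "<tr><td>Avg. Estimate</td><td>2.52</td></tr></table>",
  "<table><tr><th>Metric</th><th>Value</th></tr>",
  "<tr><td>Low Estimate</td><td>2.47</td></tr></table>",
  "</body></html>"
)

html <- htmlTreeParse(page, useInternalNodes = TRUE, asText = TRUE)
tableNodes <- getNodeSet(html, "//table")

# Apply readHTMLTable to each node; the result is a list of data frames
tables <- lapply(tableNodes, readHTMLTable)

# Check how many tables were found before indexing into the list
length(tables)
print(tables[[1]])
```

Checking `length(tables)` first avoids a subscript error when a page has fewer tables than expected, which is what would happen with a fixed index such as `tableNodes[[6]]`.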