R - Getting no data when scraping

Time: 2021-03-07 14:17:04

Tags: r web-scraping rvest

I am running into the same problem as the one described here, except that I am using R rather than Python: Getting no data when scraping a table

I am trying to scrape historical Bitcoin data from https://coinmarketcap.com/currencies/bitcoin/historical-data/

Here is the code I ran. url_tables comes back as a list of length 0.

1   -   2

Any help would be greatly appreciated. I'm a bit of a newbie.

1 Answer:

Answer 0: (score: 0)

Here I use Selenium to control a Firefox browser. Note that you must install Selenium and add it to your operating system's PATH for this to work. See here for details: https://www.selenium.dev/documentation/en/webdriver/driver_requirements/
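A likely reason a plain read_html() call finds no tables is that the page builds its table with JavaScript after loading, so the HTML a static request receives contains no <table> element yet. The sketch below illustrates that difference offline; both HTML strings are hypothetical stand-ins written for this example, not the real markup of the site.

```r
library(rvest)

# Hypothetical stand-in for what a plain HTTP request receives:
# an empty container and a script, but no <table> element yet.
static_html <- '<html><body><div id="app"></div><script>/* builds the table */</script></body></html>'

# Hypothetical stand-in for the DOM after a browser has run the script.
rendered_html <- '<html><body><div><table>
  <tr><th>Date</th><th>Close</th></tr>
  <tr><td>d1</td><td>c1</td></tr>
</table></div></body></html>'

# Same extraction on both documents.
static_tables <- read_html(static_html) %>%
  html_nodes(xpath = "//div/table") %>%
  html_table()

rendered_tables <- read_html(rendered_html) %>%
  html_nodes(xpath = "//div/table") %>%
  html_table()

length(static_tables)    # no tables found in the unrendered HTML
length(rendered_tables)  # one table found once the DOM is rendered
```

Driving a real browser, as in the code that follows, is one way to obtain the rendered DOM before handing it to rvest.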

library(tidyverse)
library(rvest)
library(RSelenium)

url <- 'https://coinmarketcap.com/currencies/bitcoin/historical-data/'

# Start a Selenium server and a Firefox client.
# If port 4545 does not work, simply change it to 4546, 4547...
rD <- rsDriver(browser = "firefox", port = 4545L, verbose = TRUE)
remDr <- rD[["client"]]

remDr$navigate(url = url) # This should open a Firefox instance in a new window.

# Capture the rendered HTML
obj_html <- remDr$getPageSource()[[1]] %>% read_html(encoding = "UTF-8")

# Table extraction
table <- obj_html %>%
  html_nodes(xpath = "//div/table") %>%
  html_table(fill = TRUE) %>%
  as.data.frame() %>%
  as_tibble()


# A tibble: 58 x 7
   Date     Open.   High    Low     Close.. Volume    Market.Cap 
   <chr>    <chr>   <chr>   <chr>   <chr>   <chr>     <chr>      
 1 Mar 06,~ $48,89~ $49,14~ $47,25~ $48,91~ $34,363,~ $912,054,1~
 2 Mar 05,~ $48,52~ $49,39~ $46,54~ $48,92~ $48,625,~ $912,285,0~
 3 Mar 04,~ $50,52~ $51,73~ $47,65~ $48,56~ $52,343,~ $905,414,1~
 4 Mar 03,~ $48,41~ $52,53~ $48,27~ $50,53~ $53,220,~ $942,236,5~
 5 Mar 02,~ $49,61~ $50,12~ $47,22~ $48,37~ $47,530,~ $901,933,6~
 6 Mar 01,~ $45,15~ $49,78~ $45,11~ $49,63~ $53,891,~ $925,235,5~
 7 Feb 28,~ $46,19~ $46,71~ $43,24~ $45,13~ $53,443,~ $841,428,9~
 8 Feb 27,~ $46,34~ $48,25~ $45,26~ $46,18~ $45,910,~ $860,978,1~
 9 Feb 26,~ $47,18~ $48,37~ $44,45~ $46,33~ $350,967~ $863,752,2~
10 Feb 25,~ $49,70~ $51,94~ $47,09~ $47,09~ $54,506,~ $877,766,1~
# ... with 48 more rows
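
Every column in the result above is character (<chr>), so the prices cannot be used in calculations yet. The following is a minimal sketch of one way to convert them, assuming the cells look like "$1,234.50" and the dates look like "Jan 02, 2021". The sample rows and the helper to_number are invented for illustration and are not taken from the real page.

```r
library(dplyr)

# Hypothetical sample rows in the same shape as the scraped table.
# The values are made up for this example.
raw <- tibble(
  Date   = c("Jan 02, 2021", "Jan 01, 2021"),
  Open.  = c("$1,234.50", "$1,200.00"),
  Volume = c("$10,000,000", "$9,500,000")
)

# Hypothetical helper: strip "$" and thousands separators, then convert.
to_number <- function(x) as.numeric(gsub("[$,]", "", x))

clean <- raw %>%
  mutate(
    across(-Date, to_number),
    # %b matches abbreviated month names in the current locale
    # (an English locale is assumed here).
    Date = as.Date(Date, format = "%b %d, %Y")
  )

clean
```

The same mutate() call could be applied to the scraped tibble in place of raw, provided its cells follow the assumed format.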