Web scraping: extract only the table values from a web page

Time: 2019-11-21 09:58:13

Tags: r rvest

I only want to extract the table values from the following link: url <- "https://www.ds-norden.com/drycargo/fleetlist/"

I am trying the following code, but I am not getting the output I want:

library(rvest)
url <- "https://www.scorpiobulkers.com/our-fleet/"
webpage<-read_html(url)
rank_data_html<- html_node(webpage,".col-main")
rank_data<-html_text(rank_data_html)
head(rank_data)

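This code gives me the full text of the web page. I only want the fleet list, which sits in a table on the page, and I want to store it as a data frame (df) in R.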

1 Answer:

Answer 0 (score: 1)

library(rvest)

url <- "https://www.scorpiobulkers.com/our-fleet/"
webpage <- read_html(url)

# Select the first <table> element on the page and parse it into a data frame
rank_data <- 
  webpage %>% 
  html_node("table") %>% 
  html_table()

head(rank_data)
#>      Vessel Name Year Built (1) Yard (2) Vessel Type
#> 1 NA   SBI Bravo           2015    Nacks    Ultramax
#> 2 NA  SBI Athena           2015  Chengxi    Ultramax
#> 3 NA SBI Antares           2015    Nacks    Ultramax
#> 4 NA  SBI Cronos           2015  Imabari    Ultramax
#> 5 NA     SBI Leo           2015    Dacks    Ultramax
#> 6 NA    SBI Echo           2015  Imabari    Ultramax
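
To see why selecting the table node helps, here is a minimal offline sketch. It assumes a small inline HTML snippet standing in for the real page (the markup, the two sample rows, and the names `page`, `container_text`, and `fleet` are hypothetical, not taken from the site). Calling html_text() on the ".col-main" container collapses everything inside it into one string, while html_table() on the "table" node returns rows and columns. The last step drops columns that contain only NA, such as the unnamed first column in the output above.

```r
library(rvest)

# Hypothetical stand-in for the fleet page: a container div holding a heading and one table.
page <- read_html('
<div class="col-main">
  <h1>Our Fleet</h1>
  <table>
    <tr><th></th><th>Vessel Name</th><th>Year Built</th></tr>
    <tr><td></td><td>SBI Bravo</td><td>2015</td></tr>
    <tr><td></td><td>SBI Athena</td><td>2015</td></tr>
  </table>
</div>')

# The question's approach: the text of the whole container, returned as one string.
container_text <- html_text(html_node(page, ".col-main"))

# The answer's approach: pick the first <table> node and parse it into rows and columns.
fleet <- as.data.frame(html_table(html_node(page, "table")))

# Drop columns whose values are all NA (here, the empty first column).
all_na <- vapply(fleet, function(col) all(is.na(col)), logical(1))
fleet <- fleet[!all_na]

print(fleet)
```

If the page holds several tables, html_nodes(page, "table") returns all of them, and html_table() then gives a list of data frames to pick from.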