I wrote the following code in R to pull some names from this particular webpage.
library(RCurl)
library(XML)
x <- getURL("http://www.encyclopedia-titanica.org/titanic-passengers-crew-lived/country-17/england.html")
x_2 <- htmlParse(x)
x_3 <- readHTMLTable(x_2)
However, whenever I look at the contents of x_3, I get the following:
x_3
named list()
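It seems the readHTMLTable function is unable to get the table. Can anyone help me get the passengers' names from this webpage without copying and pasting? Much appreciated.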
Answer 0 (score: 0)
library(rvest)
library(dplyr)
base <- "http://www.encyclopedia-titanica.org/titanic-passengers-crew-lived/country-17/england.html"
# I use an older version of rvest; in newer releases `html()` is deprecated in favor of `read_html()`.
# Link to the relevant source in the git repo:
# https://github.com/hadley/rvest/blob/7d65d84e013b1bb3827ae0a2e05ddaed4875c112/R/parse.R
data_df <- (html(base) %>% html_table)[[1]]
knitr::kable(summary(data_df))
| | Name | Age | Class/Dept | Ticket | Joined | Job |Boat [Body] | |
|:--|:----------------|:----------------|:----------------|:----------------|:----------------|:----------------|:----------------|:------------|
| |Length:1190 |Length:1190 |Length:1190 |Length:1190 |Length:1190 |Length:1190 |Length:1190 |Mode:logical |
| |Class :character |Class :character |Class :character |Class :character |Class :character |Class :character |Class :character |NA's:1190 |
| |Mode :character |Mode :character |Mode :character |Mode :character |Mode :character |Mode :character |Mode :character |NA |
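Since the question asks for just the passenger names, here is a minimal sketch of that last step using the newer `read_html()` call. To keep it runnable without hitting the site, it parses a small inline table instead of the real page; the HTML string and its two rows are made-up placeholders (not real passenger records), and `passenger_names` is just an illustrative variable name. It also assumes the live page still exposes a column called `Name`, as in the summary above.

```r
library(rvest)

# Hypothetical stand-in for the page: a tiny table reusing some of the
# column names from the summary above. The rows are placeholders only.
page <- read_html("<html><body><table>
  <tr><th>Name</th><th>Age</th><th>Class/Dept</th></tr>
  <tr><td>Passenger A</td><td>30</td><td>3rd Class</td></tr>
  <tr><td>Passenger B</td><td>41</td><td>Crew</td></tr>
</table></body></html>")

# For the live site this would instead be: page <- read_html(base)

# html_table() on a whole document returns a list with one entry per <table>
tables <- html_table(page)
data_df <- tables[[1]]

# Pull out only the Name column as a character vector
passenger_names <- data_df$Name
print(passenger_names)
```

With the real page you would swap the inline string for `read_html(base)`. If the site layout has changed since this answer was written, the table index `[[1]]` or the column name may need adjusting.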