我试图获取列表元素的名称,并使用do()将函数应用于它们,然后将它们绑定在一个数据框中。
require(XML)
require(magrittr)
url <- "http://gd2.mlb.com/components/game/mlb/year_2016/month_05/day_21/gid_2016_05_21_milmlb_nynmlb_1/boxscore.xml"
box <- xmlParse(url)
xml_data <- xmlToList(box)
end <- length(xml_data[[2]]) - 1
x <- seq(1:end)
away_pitchers_names <- paste0("xml_data[[2]][", x, "]")
away_pitchers_names <- as.data.frame(away_pitchers_names)
names(away_pitchers_names) <- "elements"
away_pitchers_names$elements %<>% as.character()
listTodf <- function(x) {
df <- as.data.frame(x)
tdf <- as.data.frame(t(df))
row.names(tdf) <- NULL
tdf
}
test <- away_pitchers_names %>% group_by(elements) %>% do(listTodf(.$elements))
当我在列表元素上运行listTodf
函数时,它可以正常工作:
listTodf(xml_data[[2]][1]
id name name_display_first_last pos out bf er r h so hr bb np s w l sv bs hld s_ip s_h s_r s_er s_bb
1 605200 Davies Zach Davies P 16 22 4 4 5 5 2 2 86 51 1 3 0 0 0 36.0 41 24 23 15
s_so game_score era
1 25 45 5.75
但是当我尝试使用do()函数循环遍历元素的名称时,我得到以下内容:
Warning message:
In rbind_all(out[[1]]) : Unequal factor levels: coercing to character
这是输出:
> test
Source: local data frame [5 x 2]
Groups: elements [5]
elements V1
(chr) (chr)
1 xml_data[[2]][1] xml_data[[2]][1]
2 xml_data[[2]][2] xml_data[[2]][2]
3 xml_data[[2]][3] xml_data[[2]][3]
4 xml_data[[2]][4] xml_data[[2]][4]
5 xml_data[[2]][5] xml_data[[2]][5]
我确信这是非常简单的事情,但我无法弄清楚事情会被绊倒。
答案 0 :(得分:1)
为了评估字符串,可以使用eval(parse
library(dplyr)
lapply(away_pitchers_names$elements,
function(x) as.data.frame.list(eval(parse(text=x))[[1]], stringsAsFactors=FALSE)) %>%
bind_rows()
# id name name_display_first_last pos out bf er r h so hr bb np s w l
#1 605200 Davies Zach Davies P 16 22 4 4 5 5 2 2 86 51 1 3
#2 430641 Boyer Blaine Boyer P 2 4 0 0 2 0 0 0 8 7 1 0
#3 448614 Torres, C Carlos Torres P 3 4 0 0 0 1 0 2 21 11 0 1
#4 592804 Thornburg Tyler Thornburg P 3 3 0 0 0 1 0 0 14 8 2 1
#5 518468 Blazek Michael Blazek P 1 5 1 1 2 0 0 2 23 10 1 1
# sv bs hld s_ip s_h s_r s_er s_bb s_so game_score era loss note
#1 0 0 0 36.0 41 24 23 15 25 45 5.75 <NA> <NA>
#2 0 1 0 21.1 22 4 4 5 7 48 1.69 <NA> <NA>
#3 0 0 2 22.1 22 9 9 14 21 52 3.63 <NA> <NA>
#4 1 2 8 18.2 13 8 8 7 29 54 3.86 <NA> <NA>
#5 0 1 8 21.1 23 6 6 14 18 41 2.53 true (L, 1-1)
然而,只做
更容易,更快捷lapply(xml_data[[2]][1:5], function(x)
as.data.frame.list(x, stringsAsFactors=FALSE)) %>%
bind_rows()