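Each row of team.df contains one NBA team. Each data frame in list.of.all.stars contains a varying number of rows, depending on how many all-star players are associated with that NBA team.
Using the apply() family of functions, how can I expand the number of rows in team.df to match the number of all-star players on each team, and merge in the data from list.of.all.stars to produce the final output?
I am completely open to non-apply() methods; I only wanted to give an example of the kind of loop I hope to avoid writing.
Here is my desired output: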
# Team_Name Team_Location Player Captain
# 1 Cavaliers Cleveland, OH LeBron James TRUE
# 2 Cavaliers Cleveland, OH Kevin Love FALSE
# 3 Warriors Oakland, CA Stephen Curry TRUE
# 4 Warriors Oakland, CA Kevin Durant FALSE
# 5 Warriors Oakland, CA Klay Thompson FALSE
# 6 Warriors Oakland, CA Draymond Green FALSE
# create data frame
# about team information
team.df <-
data.frame(
Team_Name = c( "Cavaliers", "Warriors" )
, Team_Location = c( "Cleveland, OH", "Oakland, CA")
, stringsAsFactors = FALSE
)
# create list about
# all stars on each team
list.of.all.stars <-
list(
data.frame(
Player = c( "LeBron James", "Kevin Love" )
, Captain = c( TRUE, FALSE )
, stringsAsFactors = FALSE
)
, data.frame(
Player = c( "Stephen Curry", "Kevin Durant"
, "Klay Thompson", "Draymond Green"
)
, Captain = c( TRUE, FALSE, FALSE, FALSE )
, stringsAsFactors = FALSE
)
)
# cbind each data frame within the list.of.all.stars
# to its corresponding row in team.df
team.and.all.stars.list.of.df <-
list(
cbind(
team.df[ 1, ]
, list.of.all.stars[[1]]
)
, cbind(
team.df[ 2, ]
, list.of.all.stars[[2]]
)
)
# Warning messages:
# 1: In data.frame(..., check.names = FALSE) :
# row names were found from a short variable and have been discarded
# 2: In data.frame(..., check.names = FALSE) :
# row names were found from a short variable and have been discarded
# collapse each list
# into data frame
final.df <-
data.frame(
do.call(
what = "rbind"
, args = team.and.all.stars.list.of.df
)
, stringsAsFactors = FALSE
)
# view final output
final.df
# Team_Name Team_Location Player Captain
# 1 Cavaliers Cleveland, OH LeBron James TRUE
# 2 Cavaliers Cleveland, OH Kevin Love FALSE
# 3 Warriors Oakland, CA Stephen Curry TRUE
# 4 Warriors Oakland, CA Kevin Durant FALSE
# 5 Warriors Oakland, CA Klay Thompson FALSE
# 6 Warriors Oakland, CA Draymond Green FALSE
# end of script #
# Hoping to Apply A Function
# using a data frame and
# a list of data frames
mapply.method <-
mapply(
FUN = function( x, y )
cbind.data.frame(
x
, y
, stringsAsFactors = FALSE
)
, team.df
, list.of.all.stars
)
# view results
mapply.method
# Team_Name Team_Location
# x Character,2 Character,4
# Player Character,2 Character,4
# Captain Logical,2 Logical,4
# end of script #
Answer 0 (score: 3)
Given the edit to the question and the desired output, I would use data.table exclusively:
library(data.table)
## combine the list of all stars into one data.table
## creating an 'id' column
dt_players <- rbindlist(list.of.all.stars, idcol = TRUE)
## we can keep/use the row names as the order of the data
## is consistent with the list elements
dt_teams <- as.data.table(team.df, keep.rownames = TRUE)
dt_teams[, rn := as.integer(rn)]
## use a join to combine the data to get the desired result.
dt_teams[
dt_players
, on = c(rn = ".id")
]
# rn Team_Name Team_Location Player Captain
# 1: 1 Cavaliers Cleveland, OH LeBron James TRUE
# 2: 1 Cavaliers Cleveland, OH Kevin Love FALSE
# 3: 2 Warriors Oakland, CA Stephen Curry TRUE
# 4: 2 Warriors Oakland, CA Kevin Durant FALSE
# 5: 2 Warriors Oakland, CA Klay Thompson FALSE
# 6: 2 Warriors Oakland, CA Draymond Green FALSE
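The approach below also uses data.table to do the actual work, with sapply supplying the number of times each row of team.df should be repeated.
It also assumes that the order of teams in team.df matches the order of the player data frames in list.of.all.stars (i.e. the rows of the data.frame correspond to the list elements).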
library(data.table)
## grab the rows of each data.frame
reps <- sapply(list.of.all.stars, nrow)
## repeat each row of team.df once per player
## (note: setDT() converts team.df to a data.table by reference)
setDT(team.df)[rep(1:.N, reps), ]
# Team_Name Team_Location
# 1: Cavaliers Cleveland, OH
# 2: Cavaliers Cleveland, OH
# 3: Warriors Oakland, CA
# 4: Warriors Oakland, CA
# 5: Warriors Oakland, CA
# 6: Warriors Oakland, CA
If you don't want to use data.table, the same approach works on a plain data.frame:
team.df[rep(row.names(team.df), reps), ]
# Team_Name Team_Location
# 1 Cavaliers Cleveland, OH
# 1.1 Cavaliers Cleveland, OH
# 2 Warriors Oakland, CA
# 2.1 Warriors Oakland, CA
# 2.2 Warriors Oakland, CA
# 2.3 Warriors Oakland, CA
Or use a similar concept, but do it all inside lapply:
lst <- lapply(seq_along(list.of.all.stars), function(x) {
df <- team.df[x, ]
df[rep(row.names(df), nrow(list.of.all.stars[[x]])), ]
})
do.call(rbind, lst)
# Team_Name Team_Location
# 1 Cavaliers Cleveland, OH
# 1.1 Cavaliers Cleveland, OH
# 2 Warriors Oakland, CA
# 2.1 Warriors Oakland, CA
# 2.2 Warriors Oakland, CA
# 2.3 Warriors Oakland, CA
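The row-replication snippets above only expand team.df; they stop short of attaching the player columns. A minimal base R sketch of the remaining step follows, assuming (as stated above) that the rows of team.df are in the same order as the elements of list.of.all.stars. The names expanded.teams, all.players, and combined.df are illustrative and do not appear in the original posts.

```r
# Self-contained copy of the question's data
team.df <- data.frame(
  Team_Name     = c("Cavaliers", "Warriors"),
  Team_Location = c("Cleveland, OH", "Oakland, CA"),
  stringsAsFactors = FALSE
)
list.of.all.stars <- list(
  data.frame(
    Player  = c("LeBron James", "Kevin Love"),
    Captain = c(TRUE, FALSE),
    stringsAsFactors = FALSE
  ),
  data.frame(
    Player  = c("Stephen Curry", "Kevin Durant",
                "Klay Thompson", "Draymond Green"),
    Captain = c(TRUE, FALSE, FALSE, FALSE),
    stringsAsFactors = FALSE
  )
)

# Number of players per team (one integer per list element)
reps <- vapply(list.of.all.stars, nrow, integer(1))

# Repeat each team row once per player on that team
expanded.teams <- team.df[rep(seq_len(nrow(team.df)), reps), ]

# Stack the per-team player data frames in list order
all.players <- do.call(rbind, list.of.all.stars)

# Both pieces now have one row per player, in the same order
combined.df <- cbind(expanded.teams, all.players)
row.names(combined.df) <- NULL
combined.df
```

Because both pieces are built in list order, a plain cbind lines them up without a join key. If the two inputs might be ordered differently, an explicit key and a join (as in the data.table version above) is the safer choice.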
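Answer 1 (score: 3)
Regarding the OP's approach with Map/mapply: the input 'team.df' is a data.frame, which is a list of columns. The basic unit of iteration is therefore the column, i.e. a vector. Map loops over those vectors (columns) rather than over the whole dataset or over the rows, which is what the desired output requires. To prevent that, we can wrap 'team.df' in list(); it then becomes a single unit that gets recycled for each list element of 'list.of.all.stars':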
do.call(rbind, Map(cbind, list(team.df), list.of.all.stars))
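The claim that a data.frame is a list of columns can be checked directly. The short sketch below uses an identity function inside Map, purely for illustration, to show what Map iterates over with and without the list() wrapper; column.pieces and wrapped.pieces are hypothetical names.

```r
# Self-contained copy of the question's team data
team.df <- data.frame(
  Team_Name     = c("Cavaliers", "Warriors"),
  Team_Location = c("Cleveland, OH", "Oakland, CA"),
  stringsAsFactors = FALSE
)

# A data.frame is a list whose elements are its columns
is.list(team.df)
length(team.df)  # counts columns, not rows

# Map over the bare data.frame: one call per column
column.pieces <- Map(function(x) x, team.df)
names(column.pieces)       # the column names
class(column.pieces[[1]])  # a character vector, not a row

# Map over list(team.df): one call, receiving the whole data.frame
wrapped.pieces <- Map(function(x) x, list(team.df))
length(wrapped.pieces)
class(wrapped.pieces[[1]])
```

This is why the original mapply call paired the first column with the first player data frame and the second column with the second, instead of pairing rows with list elements.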
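Based on the expected output, each row of 'team.df' should be paired with the corresponding list element of 'list.of.all.stars'. In that case, split 'team.df' by row and then do the cbind: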
res <- do.call(rbind, Map(cbind, split(team.df, seq_len(nrow(team.df))), list.of.all.stars))
row.names(res) <- NULL
res
# Team_Name Team_Location Player Captain
#1 Cavaliers Cleveland, OH LeBron James TRUE
#2 Cavaliers Cleveland, OH Kevin Love FALSE
#3 Warriors Oakland, CA Stephen Curry TRUE
#4 Warriors Oakland, CA Kevin Durant FALSE
#5 Warriors Oakland, CA Klay Thompson FALSE
#6 Warriors Oakland, CA Draymond Green FALSE
We can also do this in the tidyverse. After grouping by all the columns of 'team.df', nest to create a list column 'data' (of length 2), assign 'list.of.all.stars' to 'data' with mutate, and then unnest the list column:
library(tidyverse)
team.df %>%
group_by_all() %>%
nest %>%
mutate(data = list.of.all.stars) %>%
unnest
# A tibble: 6 x 4
# Team_Name Team_Location Player Captain
# <chr> <chr> <chr> <lgl>
# 1 Cavaliers Cleveland, OH LeBron James T
# 2 Cavaliers Cleveland, OH Kevin Love F
# 3 Warriors Oakland, CA Stephen Curry T
# 4 Warriors Oakland, CA Kevin Durant F
# 5 Warriors Oakland, CA Klay Thompson F
# 6 Warriors Oakland, CA Draymond Green F