How to create multiple data frames using nested for loops

Asked: 2018-05-23 07:09:53

Tags: r

The data set is named marks.

X <- c("vijay","raj","joy")

Y <- c("maths","eng","science","social","hindi","physical","sanskrit")    

df <- list()

for (i in X){
  for (j in Y)
  {

    df <- data.frame(subset(marks, name == i & subject == j))
  }
}

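Here I want to create one subset per student and subject, holding that student's marks in that subject, so I expect 3 x 7 = 21 subsets. But the code I wrote gives me only one subset. How can I fix this?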

2 answers:

Answer 0 (score: 3)

We can use expand.grid to create all the combinations, then loop over the rows of that data set and subset 'marks' to get a list of data.frames.

dat <- expand.grid(X, Y, stringsAsFactors = FALSE)
lst <- apply(dat, 1, function(x) subset(marks, name == x[1] & subject == x[2]))
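
If it helps to look up a subset by student and subject later, the same idea can be written with Map() and named list elements. This is only a sketch: the column names name and subject passed to expand.grid and the "student.subject" naming scheme are choices made here for illustration, not part of the answer above.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

# All 3 x 7 student/subject combinations (column names are illustrative)
dat <- expand.grid(name = X, subject = Y, stringsAsFactors = FALSE)

# Walk over both columns in parallel and subset marks for each pair
lst <- Map(function(n, s) marks[marks$name == n & marks$subject == s, ],
           dat$name, dat$subject)

# Label each element, e.g. "vijay.maths"
names(lst) <- paste(dat$name, dat$subject, sep = ".")

length(lst)          # 21
lst[["raj.maths"]]   # marks of raj in maths (may have zero rows)
```

Base indexing with marks[...] is used here instead of subset() only to keep the anonymous function free of non-standard evaluation; either form should select the same rows.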

Or using tidyverse:

library(tidyverse)
crossing(X, Y) %>%
   pmap(~ marks %>%
             filter(name == ..1, subject == ..2))

Data

set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE), 
  subject = sample(Y, 100, replace = TRUE), stringsAsFactors = FALSE)

Answer 1 (score: 3)

You can use outer(), but the inner function has to be vectorized:

X <- c("vijay","raj","joy")
Y <- c("maths","eng","science","social","hindi","physical","sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE), 
                    subject = sample(Y, 100, replace = TRUE), stringsAsFactors = FALSE)

sset <- function(x,y) subset(marks, name == x & subject == y)    
L <- outer(X, Y, FUN=Vectorize(sset, SIMPLIFY=FALSE))
L[1,1]
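
Because outer() gives its result the dimensions length(X) by length(Y), single and double brackets behave differently on it. The sketch below illustrates the difference; the dimnames assignment is an addition made here for illustration and is not part of the answer's code.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

sset <- function(x, y) subset(marks, name == x & subject == y)
L <- outer(X, Y, FUN = Vectorize(sset, SIMPLIFY = FALSE))

dim(L)                      # 3 7
dimnames(L) <- list(X, Y)   # rows by student, columns by subject (illustrative)

L[1, 1]                # a list of length 1 that wraps the data frame
L[[1, 1]]              # the data frame itself
L[["raj", "maths"]]    # the same kind of lookup, by name
```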

The object L is a matrix of data frames.

Here is another solution using a double lapply():

L2 <- lapply(X, function(x) lapply(Y, function(y) subset(marks, name == x & subject == y)))

The object L2 is a list of lists.

Here is a variant of the for loop:

df <- vector("list", length(X)*length(Y))
l <- 1

for (i in X)  for (j in Y) {
  df[[l]] <- subset(marks, name == i & subject == j)
  l <- l+1
}
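
The loop in the question returns a single subset because df <- ... replaces the whole object on every pass, so only the last combination (joy, sanskrit) survives. As a hedged variation on the loop above, the elements can also be stored under a name instead of a running counter; the "student.subject" key format is an assumption made here for illustration.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

df <- list()
for (i in X) {
  for (j in Y) {
    # Assign into one list element per combination instead of overwriting df
    key <- paste(i, j, sep = ".")
    df[[key]] <- subset(marks, name == i & subject == j)
  }
}

length(df)   # 21
```

Growing a list by name is convenient for 21 elements; for much larger grids, preallocating as in the loop above is likely the better habit.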

To subset only by the levels that actually occur in the data, you can simply use split():

L3 <- split(marks, list(marks$name, marks$subject))

The object L3 is a list of data frames.
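
Note that split() on two grouping variables builds one element for every combination of their levels, so a combination with no rows shows up as an empty data frame. If those should be left out, the drop argument of split() can be used, as in this sketch (the name L4 is chosen here for illustration):

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

# One element per combination of the names and subjects found in marks
L3 <- split(marks, list(marks$name, marks$subject))

# Same split, but combinations without any rows are dropped
L4 <- split(marks, list(marks$name, marks$subject), drop = TRUE)

all(vapply(L4, nrow, integer(1)) > 0)   # TRUE: no empty data frames in L4
names(L4)                               # keys of the form "name.subject"
```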