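Creating data frames with an R for loop

Time: 2020-03-05 02:45:51

Tags: r dataframe for-loop

I'm just getting started with R, so any help is much appreciated.

Goal: I'm trying to create hundreds of data frames in one short script. They follow a pattern, so I thought a for loop would be enough, but the data.frame call seems to ignore that the loop variable is a variable and takes its name literally. Here is an example: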

# Defining some dummy variables for the sake of this example
dfTitles <- c("C2000.AMY", "C2000.ACC", "C2001.AMY", "C2001.ACC") 
Copes <- c("Cope1", "Cope2", "Cope3", "Cope4")
Voxels <- c(1:338)

# (Theoretically) creating a separate dataframe for each of the terms in 'dfTitles'   
for (i in dfTitles){
 i <- data.frame(matrix(0, nrow = 4, ncol = 338, dimnames = list(Copes, Voxels)))
}

# Trying an alternative method
for (i in 1:length(dfTitles))
 {dfTitles[i] <- data.frame(matrix(0, nrow = 4, ncol = 338, dimnames = list(Copes, Voxels)))}

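The first attempt creates a single data frame named "i"; the second creates a list of 4. Any ideas? Thanks!


Possibly unnecessary background: we are analyzing fMRI data, correlating across stimuli, brain voxels, brain regions, and participants. We are correlating whole matrices, so separating the values (also known as COPEs) into individual data frames by participant ID and brain region would make the next step much easier. I have already attempted the next step after loading and sorting the data into one big data frame, and it was a major headache.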

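2 answers:

Answer 0: (score: 0)

When you create objects in a for loop, you need to save them somewhere before the next iteration of the loop, or they will be overwritten.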
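To make the overwriting concrete, here is a minimal sketch of what happens in the question's first loop. The names in `titles` are hypothetical and chosen only for illustration.

```r
# Hypothetical names, used only for this illustration
titles <- c("first_df", "second_df")

for (i in titles) {
  # This rebinds the loop variable `i` itself on every pass;
  # it does not create an object called "first_df" or "second_df"
  i <- data.frame(x = 1:2)
}

is.data.frame(i)                      # only the last assignment to `i` survives
exists("first_df", inherits = FALSE)  # no object with that name was created
```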

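One way to handle this is to create an empty list (with list()) or an empty vector (with c()) before the loop starts, and append the output of each iteration to it.

Another way is to assign the object to your environment (for example with assign()) before moving on to the next iteration of the loop.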

# Defining some dummy variables for the sake of this example
dfTitles <- c("C2000.AMY", "C2000.ACC", "C2001.AMY", "C2001.ACC") 
Copes <- c("Cope1", "Cope2", "Cope3", "Cope4")
Voxels <- c(1:338)

# initialize a list to store the data.frame output
df_list <- list()
for (d in dfTitles) {
  # create data.frame with the dfTitle, and 1 row per Copes observation
  df <- data.frame(dfTitle = d,
                   Copes = Copes)
  # append columns for Voxels
  # setting to NA, can be reassigned later as needed
  for (v in Voxels) {
    df[[paste0("Voxel", v)]] <- NA
  }
  # store df in the list as the 'd'th element
  df_list[[d]] <- df
  # or, assign the object to your environment
  # assign(d, df)
}
# data.frames can be referenced by name
names(df_list)
head(df_list$C2000.AMY)
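
If you would rather keep the 4 x 338 zero-filled layout from the question, the same two ideas (a named list, or assign()) might look like the sketch below. The helper name `make_empty_df` is made up for this example, and `check.names = FALSE` is my assumption about how you want the voxel columns named.

```r
dfTitles <- c("C2000.AMY", "C2000.ACC", "C2001.AMY", "C2001.ACC")
Copes <- c("Cope1", "Cope2", "Cope3", "Cope4")
Voxels <- 1:338

# Hypothetical helper: builds one zero-filled Copes x Voxels data frame
make_empty_df <- function() {
  data.frame(
    matrix(0, nrow = length(Copes), ncol = length(Voxels),
           dimnames = list(Copes, as.character(Voxels))),
    check.names = FALSE  # keep column names "1".."338" rather than "X1".."X338"
  )
}

# Option 1: one data frame per title, stored in a named list
df_list <- setNames(lapply(dfTitles, function(title) make_empty_df()), dfTitles)

# Individual cells can then be filled in by title, Cope and voxel
df_list[["C2001.AMY"]]["Cope2", "17"] <- 0.5

# Option 2: create a separate object per title in the current environment
for (title in dfTitles) {
  assign(title, make_empty_df())
}
```

With hundreds of data frames, the list is usually easier to work with, because you can loop over it or look entries up by name instead of building variable names as strings.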

Answer 1: (score: 0)

# Clear the workspace (note the parentheses: ls must be called)
rm(list = ls())
dfTitles <- c("C2000.AMY", "C2000.ACC", "C2001.AMY", "C2001.ACC") 
Copes <- c("Cope1", "Cope2", "Cope3", "Cope4")
Voxels <- c(1:3)

# (Theoretically) creating a separate dataframe for each of the terms in 'dfTitles'   
nr <- length(Voxels)
nc <- length(Copes)
N <- length(dfTitles) # Number of data frames, same as length of dfTitles

DF <- vector(N, mode="list")

for (i in 1:N){
  DF[[i]] <- data.frame(matrix(rnorm(nr*nc), nrow = nr))
  dimnames(DF[[i]]) <- list(Voxels, Copes)
}

names(DF) <- dfTitles
DF[1:2]

$C2000.AMY
       Cope1     Cope2      Cope3      Cope4
1 -0.8293164 -1.813807 -0.3290645 -0.7730110
2 -1.1965588  1.022871 -0.7764960 -0.3056280
3  0.2536782 -0.365232  2.3949076  0.5672671

$C2000.ACC
       Cope1    Cope2      Cope3      Cope4
1 -0.7505513 1.023325 -0.3110537 -1.4298174
2  1.2807725 1.216997  1.0644983  1.6374749
3  1.0047408 1.385460  0.1527678  0.1576037
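
Once the data frames are in a named list like this, the next step might look roughly like the sketch below. It rebuilds the same kind of list with a fixed seed, then shows lookup by name, a per-data-frame summary, and an optional way to turn the list into separate objects. The names `amy`, `cope_means` and `env` are made up for this example.

```r
set.seed(1)  # fixed seed so the random values are reproducible
dfTitles <- c("C2000.AMY", "C2000.ACC", "C2001.AMY", "C2001.ACC")
Copes <- c("Cope1", "Cope2", "Cope3", "Cope4")
Voxels <- 1:3

# Same construction as above: a pre-allocated list filled in a loop
DF <- vector("list", length(dfTitles))
for (i in seq_along(dfTitles)) {
  DF[[i]] <- data.frame(matrix(rnorm(length(Voxels) * length(Copes)),
                               nrow = length(Voxels)))
  dimnames(DF[[i]]) <- list(Voxels, Copes)
}
names(DF) <- dfTitles

# Look up one data frame by name, then one value in it
amy <- DF[["C2000.AMY"]]
amy[2, "Cope3"]

# Apply the same computation to every data frame at once;
# the result is a Copes x titles matrix of column means
cope_means <- sapply(DF, colMeans)

# Optional: unpack the list into separate objects in their own environment
env <- list2env(DF, envir = new.env())
ls(env)
```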