I have a list of 216 data frames, each with 3 variables. For example:
df1 <- data.frame(A = 1:10, B= 11:20, C = 21:30)
df2 <- data.frame(A = 31:40, B = 41:50, C = 51:60)
listDF <- list(df1, df2)
I need to rename the variables in each data frame according to its position in the list. That part I can do. For example:
#create lists of the variable names
Bnames <- c("feel1", "feel2")
Cnames <- c("cat1", "cat2")
#sequentially name each data frame's columns
k <- 0
for(i in 1:length(listDF)){
  k = k+1
  names(listDF[[i]]) <- c("ID",Bnames[k],Cnames[k])
}
#I know people prefer lapply; I tend to switch back and forth depending on what I'm doing
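The problem I'm running into is that in the list of 216 data frames (24 "cat" variables x 9 "feel" variables = 216), I need the "Bnames" and "Cnames" lists to cycle at different rates. I need the first 9 data frames to have C = cat1, B = feel1:9, the next 9 to have C = cat2, B = feel1:9, and so on. So I need to cycle repeatedly through B, while stepping slowly through C once every 9 data frames. "A" in every data frame should be "ID".
I really don't know how to do this. Thanks in advance for any suggestions!
Also - if anyone can suggest a more understandable title, I'm happy to change it.
Edit
It may help to know where I want to end up when I'm done. Each ID is present in a different number of data frames, and ultimately I want to reshape and merge the data frames into 1 data frame, formatted like this: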
ID  feel1.1  feel1.2  ...  feel2.1  feel2.2
2   NA       4        ...  NA       7
3   2        1        ...  6        3
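where feel1.1 means the "feel1" value for "cat1", with a missing value wherever an ID does not have a particular combination of "feel" and "cat" (so ID 2 has no "feel1" value for cat1 but does have one for cat2). Ultimately, there should be 217 columns and as many rows as needed.
My (not good) solution: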
library(reshape2)  # assumed source of melt() and dcast(), which are used below
X <- listDF
#create lists of the data frame numbers for each "feel" variable
feel1 <- seq(1,216,by=9)
feel2 <- seq(2,216,by=9)
feel3 <- seq(3,216,by=9)
feel4 <- seq(4,216,by=9)
feel5 <- seq(5,216,by=9)
feel6 <- seq(6,216,by=9)
feel7 <- seq(7,216,by=9)
feel8 <- seq(8,216,by=9)
feel9 <- seq(9,216,by=9)
#assign correct names for the "feel" variables in each data frame
for(i in 1:length(X)){
if(i %in% feel1){
names(X[[i]]) <- c("UniqueID", "cat", "feel1")
}
if(i %in% feel2){
names(X[[i]]) <- c("UniqueID", "cat", "feel2")
}
if(i %in% feel3){
names(X[[i]]) <- c("UniqueID", "cat", "feel3")
}
if(i %in% feel4){
names(X[[i]]) <- c("UniqueID", "cat", "feel4")
}
if(i %in% feel5){
names(X[[i]]) <- c("UniqueID", "cat", "feel5")
}
if(i %in% feel6){
names(X[[i]]) <- c("UniqueID", "cat", "feel6")
}
if(i %in% feel7){
names(X[[i]]) <- c("UniqueID", "cat", "feel7")
}
if(i %in% feel8){
names(X[[i]]) <- c("UniqueID", "cat", "feel8")
}
if(i %in% feel9){
names(X[[i]]) <- c("UniqueID", "cat", "feel9")
}
}
#'melt' each of the dataframes and then remove the rows with 'cat'
X <- lapply(X, function(x) melt(x, id.vars ="UniqueID"))
X <- lapply(X, function(x) subset(x, variable != "cat"))
#add the appropriate 'cat' number to each 'feel' name
for(j in 1:length(X)){
if(j <= 9){
X[[j]]$variable <- paste0(X[[j]]$variable, ".24")
}
if(j > 9 & j <= 18){
X[[j]]$variable <- paste0(X[[j]]$variable, ".1")
}
if(j > 18 & j <= 27){
X[[j]]$variable <- paste0(X[[j]]$variable, ".2")
}
if(j > 27 & j <= 36){
X[[j]]$variable <- paste0(X[[j]]$variable, ".3")
}
if(j > 36 & j <= 45){
X[[j]]$variable <- paste0(X[[j]]$variable, ".4")
}
if(j > 45 & j <= 54){
X[[j]]$variable <- paste0(X[[j]]$variable, ".5")
}
if(j > 54 & j <= 63){
X[[j]]$variable <- paste0(X[[j]]$variable, ".6")
}
if(j > 63 & j <= 72){
X[[j]]$variable <- paste0(X[[j]]$variable, ".7")
}
if(j > 72 & j <= 81){
X[[j]]$variable <- paste0(X[[j]]$variable, ".8")
}
if(j > 81 & j <= 90){
X[[j]]$variable <- paste0(X[[j]]$variable, ".9")
}
if(j > 90 & j <= 99){
X[[j]]$variable <- paste0(X[[j]]$variable, ".10")
}
if(j > 99 & j <= 108){
X[[j]]$variable <- paste0(X[[j]]$variable, ".11")
}
if(j > 108 & j <= 117){
X[[j]]$variable <- paste0(X[[j]]$variable, ".12")
}
if(j > 117 & j <= 126){
X[[j]]$variable <- paste0(X[[j]]$variable, ".13")
}
if(j > 126 & j <= 135){
X[[j]]$variable <- paste0(X[[j]]$variable, ".14")
}
if(j > 135 & j <= 144){
X[[j]]$variable <- paste0(X[[j]]$variable, ".15")
}
if(j > 144 & j <= 153){
X[[j]]$variable <- paste0(X[[j]]$variable, ".16")
}
if(j > 153 & j <= 162){
X[[j]]$variable <- paste0(X[[j]]$variable, ".17")
}
if(j > 162 & j <= 171){
X[[j]]$variable <- paste0(X[[j]]$variable, ".18")
}
if(j > 171 & j <= 180){
X[[j]]$variable <- paste0(X[[j]]$variable, ".19")
}
if(j > 180 & j <= 189){
X[[j]]$variable <- paste0(X[[j]]$variable, ".20")
}
if(j > 189 & j <= 198){
X[[j]]$variable <- paste0(X[[j]]$variable, ".21")
}
if(j > 198 & j <= 207){
X[[j]]$variable <- paste0(X[[j]]$variable, ".22")
}
if(j > 207 & j <= 216){
X[[j]]$variable <- paste0(X[[j]]$variable, ".23")
}
}
#reshape each data frame into 2 columns: ID and the renamed 'feel' variable
X <- lapply(X, function(x) dcast(x, UniqueID ~ variable))
#merge it back onto the original dataset
for(i in 1:length(X)){
data <- merge(data, X[[i]], by="UniqueID", all=T)
}
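Answer 0 (score: 0)
I combined them into one data frame with a variable recording the list element, then went from long to wide format. Now you only need to change a single set of variable names using a list of strings (or a vector of strings, I'm not sure), instead of many sets of names spread across the list elements.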
# sample data
df1 <- data.frame(A = 1:10, B= 11:20, C = 21:30)
df2 <- data.frame(A = 5:14, B = 41:50, C = 51:60)
listDF <- list(df1, df2)
require(reshape)
require(plyr)
# put them all in 1 long dataframe
df <- rbind.fill(listDF)
# label which list element they came from and pretty up the vars
df$listnum <- rep(seq_along(listDF), times = sapply(listDF, nrow))
names(df) <- c('id','cat','feel','listnum')
# change from long to wide
df <- reshape(df,idvar = 'id',timevar = 'listnum',direction = 'wide')
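As a quick check of what the long-to-wide step produces, here is a base-R-only sketch on the same sample data. It swaps plyr's rbind.fill for do.call(rbind, ...), which assumes every data frame in the list has identical column names; `long` and `wide` are illustrative names, not part of the code above.

```r
# Same sample data as above
df1 <- data.frame(A = 1:10, B = 11:20, C = 21:30)
df2 <- data.frame(A = 5:14, B = 41:50, C = 51:60)
listDF <- list(df1, df2)

# Stack the data frames (assumes identical column names in each one)
long <- do.call(rbind, listDF)

# Record which list element each row came from
long$listnum <- rep(seq_along(listDF), times = sapply(listDF, nrow))
names(long) <- c("id", "cat", "feel", "listnum")

# Long to wide: one row per id, one cat/feel pair of columns per list element
wide <- reshape(long, idvar = "id", timevar = "listnum", direction = "wide")

names(wide)
head(wide)
```

IDs that appear in only one of the data frames get NA in the columns belonging to the other, which is the behaviour asked for in the question.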
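I'm not entirely sure how you want the variables named, but the result above contains all the information you need. You just need to do a bit of names(df) <- sub(). And to cycle at different rates, let R recycle the shorter vector. Something like: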
paste(rep(1:24,each = 9),1:9,sep = '.')
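Applied to the renaming in the question, the same idea might look like the sketch below. It assumes 9 "feel" variables nested inside 24 "cat" groups in list order, as the question describes; the toy list and the names `n_feel`, `n_cat`, `feel_names`, `cat_names` and `wide_labels` are made up for illustration.

```r
n_feel <- 9
n_cat  <- 24

# Fast-cycling names: feel1..feel9 repeated for each cat group
feel_names <- paste0("feel", rep(seq_len(n_feel), times = n_cat))
# Slow-cycling names: each cat name held for 9 consecutive data frames
cat_names  <- paste0("cat", rep(seq_len(n_cat), each = n_feel))

# Toy stand-in for the real list of 216 data frames (hypothetical data)
listDF <- lapply(seq_len(n_feel * n_cat),
                 function(i) data.frame(A = 1:3, B = i, C = i))

# Rename every data frame by position: A -> ID, B -> feelK, C -> catJ
listDF <- Map(function(d, b, cname) {
  names(d) <- c("ID", b, cname)
  d
}, listDF, feel_names, cat_names)

names(listDF[[1]])    # "ID" "feel1" "cat1"
names(listDF[[10]])   # "ID" "feel1" "cat2"

# Labels in the feelK.J style requested for the final wide data frame
wide_labels <- paste0("feel", rep(seq_len(n_feel), times = n_cat),
                      ".", rep(seq_len(n_cat), each = n_feel))
```

Here rep(..., times = ) gives the index that changes with every data frame, and rep(..., each = ) gives the index that changes only once every 9 data frames.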