Given a data.frame() with the following columns:
var1.gender
var1.score.raw
var1.score.raw.lower
var1.score.raw.upper
[...]
var2.gender
var2.score.raw
var2.score.raw.lower
var2.score.raw.upper
[...]
How can I convert this into a multi-dimensional (nested) list, splitting the column names on "."?
Example data:
df <- data.frame('var1.gender' = c(1,1,3,3), 'var1.score.raw' = c(12.3, 12.4, 14.5, 13.2), 'var1.score.raw.lower' = c(11,11,13,12), 'var1.score.raw.upper' = c(13,13,15,14), 'var2.gender' = c(1,1,3,3), 'var2.score.raw' = c(12.3, 12.4, 14.5, 13.2), 'var2.score.raw.lower' = c(11,11,13,12), 'var2.score.raw.upper' = c(13,13,15,14))
The resulting list should look like this:
$var1
$var1$gender
[1] 1 1 3 3
$var1$score
$var1$score$raw
[1] 12.3 12.4 14.5 13.2
$var1$score$lower
[1] 11 11 13 12
$var1$score$upper
[1] 13 13 15 14
$var2
$var2$gender
[1] 1 1 3 3
$var2$score
$var2$score$raw
[1] 12.3 12.4 14.5 13.2
$var2$score$lower
[1] 11 11 13 12
$var2$score$upper
[1] 13 13 15 14
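Answer 0 (score: 1)
A straightforward way to build the wanted list is to evaluate, for each column of "df", a call like list[["X"]][["Y"]][["Z"]][...] = df$X.Y.Z... This can be done dynamically by manipulating "language" objects.
Define a function that accepts a list, a character vector of names/indices, and a value to assign at that level: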
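assign_list_element = function(x, inds, val)
{
    # start with x[["first"]], then wrap one more [[ ]] per remaining name
    cl = bquote(x[[.(inds[1])]])
    for(s in inds[-1]) cl = bquote(.(cl)[[.(s)]])
    # turn the index call into the assignment x[[...]][[...]] <- val
    cl = call("<-", cl, bquote(.(val)))
    print(cl); flush.console()  # show the call that is about to be evaluated
    eval(cl)                    # modifies the function's local copy of x
    return(x)
}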
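Some of the bquote calls could be simpler or replaced with substitute, but using bquote as above builds a better formatted call (for printing) with respect to the indices.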
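To make the call construction concrete, here is a minimal sketch that builds the nested index call for a single column name and then evaluates an assignment with it. The names inds, call_text and x are purely illustrative and not part of the answer.

```r
# Illustrative only: build x[["var1"]][["score"]][["raw"]] step by step.
inds <- c("var1", "score", "raw")
cl <- bquote(x[[.(inds[1])]])
for (s in inds[-1]) cl <- bquote(.(cl)[[.(s)]])
call_text <- deparse(cl)
print(call_text)  # the text is x[["var1"]][["score"]][["raw"]]

# Wrap the index call in an assignment and evaluate it on an empty list.
x <- list()
eval(call("<-", cl, c(12.3, 12.4, 14.5, 13.2)), envir = environment())
str(x)
```

Because the value is embedded in the call itself, evaluating the call needs nothing from the environment other than x.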
Then, for each column of "df", restructure an initially empty list:
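# assumption: the ".raw.lower"/".raw.upper" columns are meant as ".lower"/".upper", as in the desired output
names(df) = sub(".raw.", ".", names(df), fixed = TRUE)
nms = strsplit(names(df), ".", fixed = TRUE)
l = list()
for(i in seq_along(nms)) l = assign_list_element(l, nms[[i]], df[[i]])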
#x[["var1"]][["gender"]] <- c(1, 1, 3, 3)
#x[["var1"]][["score"]][["raw"]] <- c(12.3, 12.4, 14.5, 13.2)
#x[["var1"]][["score"]][["lower"]] <- c(11, 11, 13, 12)
#x[["var1"]][["score"]][["upper"]] <- c(13, 13, 15, 14)
#x[["var2"]][["gender"]] <- c(1, 1, 3, 3)
#x[["var2"]][["score"]][["raw"]] <- c(12.3, 12.4, 14.5, 13.2)
#x[["var2"]][["score"]][["lower"]] <- c(11, 11, 13, 12)
#x[["var2"]][["score"]][["upper"]] <- c(13, 13, 15, 14)
str(l)
#List of 2
# $ var1:List of 2
# ..$ gender: num [1:4] 1 1 3 3
# ..$ score :List of 3
# .. ..$ raw : num [1:4] 12.3 12.4 14.5 13.2
# .. ..$ lower: num [1:4] 11 11 13 12
# .. ..$ upper: num [1:4] 13 13 15 14
# $ var2:List of 2
# ..$ gender: num [1:4] 1 1 3 3
# ..$ score :List of 3
# .. ..$ raw : num [1:4] 12.3 12.4 14.5 13.2
# .. ..$ lower: num [1:4] 11 11 13 12
# .. ..$ upper: num [1:4] 13 13 15 14
With this approach, the list is restructured on each iteration, but its elements are not copied.
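For comparison, the same nesting can be sketched without language objects by wrapping each column in one list level per name part and merging the pieces with modifyList(). This is not the answerer's method: make_path and l2 are hypothetical names, and the example data assumes the ".lower"/".upper" naming from the desired output.

```r
# Hypothetical helper, not from the answer: wrap a value in one list level
# per name part, innermost part first.
make_path <- function(parts, value) {
  Reduce(function(acc, nm) stats::setNames(list(acc), nm), rev(parts), value)
}

# Example data, assuming the ".lower"/".upper" naming of the desired output.
df <- data.frame(var1.gender = c(1, 1, 3, 3),
                 var1.score.raw = c(12.3, 12.4, 14.5, 13.2),
                 var1.score.lower = c(11, 11, 13, 12),
                 var1.score.upper = c(13, 13, 15, 14))

parts <- strsplit(names(df), ".", fixed = TRUE)
l2 <- list()
for (i in seq_along(parts)) {
  # modifyList() merges recursively wherever both sides are lists
  l2 <- utils::modifyList(l2, make_path(parts[[i]], df[[i]]))
}
str(l2)
```

The trade-off is that each merge walks the existing list again, so this is likely a reasonable choice for a handful of columns rather than for very wide data.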
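Answer 1 (score: 0)
I will edit this later with a version that looks at the periods in the column names (more complex), but without automation you can create the nested list like this: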
df <- data.frame('var1.gender' = c(1,1,3,3), 'var1.score.raw' = c(12.3, 12.4, 14.5, 13.2), 'var1.score.raw.lower' = c(11,11,13,12), 'var1.score.raw.upper' = c(13,13,15,14), 'var2.gender' = c(1,1,3,3), 'var2.score.raw' = c(12.3, 12.4, 14.5, 13.2), 'var2.score.raw.lower' = c(11,11,13,12), 'var2.score.raw.upper' = c(13,13,15,14))
df
# changed your naming here to remove the not-needed ".raw."
colnames(df) <- c("var1.gender", "var1.score.raw", "var1.score.lower", "var1.score.upper", "var2.gender", "var2.score.raw", "var2.score.lower", "var2.score.upper")
nested <- with(df, expr = {list(var1 = list(gender = var1.gender,
                                            score = list(raw = var1.score.raw,
                                                         lower = var1.score.lower,
                                                         upper = var1.score.upper)),
                                var2 = list(gender = var2.gender,
                                            score = list(raw = var2.score.raw,
                                                         lower = var2.score.lower,
                                                         upper = var2.score.upper)))})
nested
$var1
$var1$gender
[1] 1 1 3 3
$var1$score
$var1$score$raw
[1] 12.3 12.4 14.5 13.2
$var1$score$lower
[1] 11 11 13 12
$var1$score$upper
[1] 13 13 15 14
$var2
$var2$gender
[1] 1 1 3 3
$var2$score
$var2$score$raw
[1] 12.3 12.4 14.5 13.2
$var2$score$lower
[1] 11 11 13 12
$var2$score$upper
[1] 13 13 15 14
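I tried to make a dynamic version but got lost thinking about the recursion. Anyway, this may work if you extend the number of varX in your dataset. It is not as clean as doing it by hand, and it leaves an extra list level named "empty" for columns with only two name parts, such as var1.gender.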
nester <- function(df, splitby = "."){
  separated <- strsplit(colnames(df), paste0("[", splitby, "]"))
  # in order to rbind this into a matrix, we have to make all vectors the same length
  n <- max(rapply(separated, length))
  separated <- do.call(rbind, rapply(separated, function(x) {length(x) <- n; x }, how = "replace"))
  separated <- ifelse(is.na(separated), "empty", separated)
  # unique names per level; lapply always returns a list, whereas apply may simplify to a matrix
  listnames <- lapply(seq_len(ncol(separated)), function(j) unique(separated[, j]))
  L <- list()
  # Assumes n is 3.
  for(L1 in listnames[[1]]){
    L[[L1]] <- list() # create List level 1
    for(L2 in listnames[[2]]){
      L[[L1]][[L2]] <- list() # create List level 2
      for(L3 in listnames[[3]]){
        L[[L1]][[L2]][[L3]] <- list() # create list level 3
        # If no data exists for that list combination ...
        if(length(df[,which(separated[,1] == L1 & separated[,2] == L2 & separated[,3] == L3)]) == 0){
          L[[L1]][[L2]][[L3]] <- NULL # then remove that nested list.
        } else {
          # otherwise go ahead and put that column in as a list
          L[[L1]][[L2]][[L3]] <- df[,which(separated[,1] == L1 & separated[,2] == L2 & separated[,3] == L3)]
          # if data is sitting in a list$empty ...
          if( L3 == "empty" ){
            z <- unname(unlist(L[[L1]][[L2]][[L3]]))
            L[[L1]][[L2]][[L3]] <- as.vector(z) # keep it as a plain vector under $empty
            #L[[L1]][[L2]][[L3]] <- NULL # and delete the L3
          }
        }
      }
    }
  }
  return(L)
}
df.List <- nester(df, splitby = ".")
df.List
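The recursion this answer mentions can be sketched roughly as follows: group the columns by the first part of their names, then recurse on the remaining parts within each group. This is only an illustration, not the answerer's code. The names nest_by_name, values, keys and res are hypothetical, and the sketch assumes that no column name is a prefix of another (so var1.score.raw and var1.score.raw.lower could not coexist), which is why the example data uses the ".lower"/".upper" naming.

```r
# Hypothetical recursive helper: values is a list of columns, keys is a list of
# character vectors holding the name parts that are still unused for each column.
nest_by_name <- function(values, keys) {
  # a single column with no name parts left is a leaf
  if (length(values) == 1 && length(keys[[1]]) == 0) return(values[[1]])
  heads <- vapply(keys, `[`, character(1), 1)   # first remaining part of each name
  tails <- lapply(keys, `[`, -1)                # everything after it
  # group column positions by first part, keeping the original order
  groups <- split(seq_along(values), factor(heads, levels = unique(heads)))
  lapply(groups, function(idx) nest_by_name(values[idx], tails[idx]))
}

# Example data, assuming the ".lower"/".upper" naming of the desired output.
df <- data.frame(var1.gender = c(1, 1, 3, 3),
                 var1.score.raw = c(12.3, 12.4, 14.5, 13.2),
                 var1.score.lower = c(11, 11, 13, 12),
                 var1.score.upper = c(13, 13, 15, 14),
                 var2.gender = c(1, 1, 3, 3),
                 var2.score.raw = c(12.3, 12.4, 14.5, 13.2),
                 var2.score.lower = c(11, 11, 13, 12),
                 var2.score.upper = c(13, 13, 15, 14))

res <- nest_by_name(as.list(df), strsplit(names(df), ".", fixed = TRUE))
str(res)
```

Unlike the three nested loops, this sketch does not depend on the names having exactly three parts, and columns with fewer parts need no "empty" placeholder level.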