Convert a data.table with vertical relationships into a table with horizontal relationships

Time: 2017-07-30 20:52:32

Tags: r data.table

How can I elegantly convert table1 (which has a vertical relationship) into table2 (which has a horizontal relationship)?

library(data.table)
# vertical relationship
table1 <- data.table(id=1:4,father=c(NA,"Vladimir","Boris","John"),individual=c("Vladimir","Boris","John","Will"))
table1

# horizontal relationship
table2 <- data.table(id=1:4,greatgrandfather= c(NA,NA,NA,"Vladimir"), grandfather=c(NA,NA,"Vladimir","Boris"),father=c(NA,"Vladimir","Boris","John"),individual=c("Vladimir","Boris","John","Will"))
table2

This is the ugly solution I came up with:

# ugly solution
temporary.table <- table1[,.(father,individual)]

names(temporary.table)<- c("grandfather","father")
new.table <- merge(table1,temporary.table,by="father",all.x=TRUE)

names(temporary.table)<- c("greatgrandfather","grandfather")
new.table <- merge(new.table,temporary.table,by="grandfather",all.x=TRUE)

ugly.solution <- new.table[order(id)][,.(id,greatgrandfather,grandfather,father,individual)]
ugly.solution

2 answers:

Answer 0: (score: 2)

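1) Reduce. Define a father_of function that looks up the father of each element of its argument. Also define nms, the column names of the output (except for "id").

Then use Reduce to apply father_of repeatedly, keeping every intermediate result.

Finally, put everything together in one data.table.

Note that simply by modifying nms we can have more or fewer ancestors in the result.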

father_of <- function(x, ...) table1[, father[match(x, individual)] ]

nms <- c("greatgrandfather", "grandfather", "father", "individual")

r <- Reduce(father_of, nms[-1], init = table1$individual, accumulate = TRUE)

table1[, c(.(id = id), setNames(rev(r), nms))]

giving:

   id greatgrandfather grandfather   father individual
1:  1               NA          NA       NA   Vladimir
2:  2               NA          NA Vladimir      Boris
3:  3               NA    Vladimir    Boris       John
4:  4         Vladimir       Boris     John       Will
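To make the mechanics of (1) easier to follow, here is a minimal base R sketch of the two pieces involved: the match-based lookup and the accumulating Reduce. It uses plain vectors with the same values as table1 rather than the data.table itself, and it passes 1:3 as the dummy sequence (an assumption made for illustration; only its length matters).

```r
# Same values as table1 in the question, held as plain vectors.
individual <- c("Vladimir", "Boris", "John", "Will")
father     <- c(NA, "Vladimir", "Boris", "John")

# Look up the father of every name in x. Names that are NA or not
# found in `individual` map to NA. The extra argument that Reduce
# passes on each step is ignored.
father_of <- function(x, ...) father[match(x, individual)]

# One step up the tree.
step1 <- father_of(individual)

# Three steps up, keeping every intermediate result. The result is a
# list of four vectors: individual, father, grandfather, greatgrandfather.
steps <- Reduce(father_of, 1:3, init = individual, accumulate = TRUE)
```

Reversing steps and naming the elements with nms yields the columns of the wide table.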

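2) Recursion. An alternative that uses the same father_of and nms definitions is to recurse in a function rec. The length of nms controls the number of generations, as before.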

rec <- function(x, n) if (ncol(x) == n) x else Recall(cbind(father_of(x[[1]]), x), n)

r <- rec(table1[, .(individual)], length(nms))
table1[, c(.(id = id), setNames(r, nms))]

giving:

   id greatgrandfather grandfather   father individual
1:  1               NA          NA       NA   Vladimir
2:  2               NA          NA Vladimir      Boris
3:  3               NA    Vladimir    Boris       John
4:  4         Vladimir       Boris     John       Will
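In both variants the number of generations is fixed up front by nms. If the depth is not known in advance, one possible variation is to keep climbing until a generation is entirely NA. The following is only a sketch under that assumption, in base R; the up_k column names are hypothetical, and the loop bound guards against cyclic data.

```r
# Same values as table1 in the question, held as plain vectors.
individual <- c("Vladimir", "Boris", "John", "Will")
father     <- c(NA, "Vladimir", "Boris", "John")
father_of  <- function(x) father[match(x, individual)]

generations <- list(individual)
# A chain of ancestors cannot be longer than the number of individuals
# unless the data contains a cycle, so bound the loop accordingly.
for (k in seq_along(individual)) {
  nxt <- father_of(generations[[length(generations)]])
  if (all(is.na(nxt))) break          # nobody has an ancestor this far up
  generations[[length(generations) + 1L]] <- nxt
}

# up_0 is the individual, up_1 the father, up_2 the grandfather, ...
names(generations) <- paste0("up_", seq_along(generations) - 1L)
wide <- as.data.frame(rev(generations), stringsAsFactors = FALSE)
```

For the example data this stops after the great-grandfather level, so wide has four columns ordered from oldest ancestor to individual.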

Update: Fixed. Added (2).

Answer 1: (score: 1)

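I don't think your solution is that ugly. But maybe you can make the renaming step more explicit. Here is a way to rewrite the merge calls using data.table's join syntax, which saves a few variable assignments by chaining the two joins together: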

table1[, .(grandgrandfather = father, grandfather = individual)][
    table1[, .(grandfather = father, father = individual)][
        table1, on = .(father)
    ], 
    on = .(grandfather)
]

#   grandgrandfather grandfather   father id individual
#1:               NA          NA       NA  1   Vladimir
#2:               NA          NA Vladimir  2      Boris
#3:               NA    Vladimir    Boris  3       John
#4:         Vladimir       Boris     John  4       Will
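The chained join returns the columns in join order, with id in the middle. If the layout of table2 from the question is wanted, a small follow-up step with setcolorder can rearrange them. This is a sketch, assuming the column is named greatgrandfather as in the question rather than grandgrandfather; joined is a hypothetical variable name.

```r
library(data.table)

table1 <- data.table(
    id = 1:4,
    father = c(NA, "Vladimir", "Boris", "John"),
    individual = c("Vladimir", "Boris", "John", "Will")
)

# Same two chained joins, with the top-level column named as in table2.
joined <- table1[, .(greatgrandfather = father, grandfather = individual)][
    table1[, .(grandfather = father, father = individual)][
        table1, on = .(father)
    ],
    on = .(grandfather)
]

# Reorder the columns by reference to match table2.
setcolorder(joined, c("id", "greatgrandfather", "grandfather", "father", "individual"))
joined
```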

If you need more generations than it is reasonable to write joins for by hand, you can do the recursive joins in a for loop:

find_ancestors <- function(table, n) {
    final <- copy(table)
    setnames(final, 'father', 'father_1')
    for (i in seq_len(n)) {
        name_up <- paste('father', i:(i+1), sep = "_")
        final <- table[, setNames(.(individual, father), name_up)][final, on = name_up[1]]
    }
    final
}

find_ancestors(table1, 3)
#   father_3 father_4 father_2 father_1 id individual
#1:       NA       NA       NA       NA  1   Vladimir
#2:       NA       NA       NA Vladimir  2      Boris
#3:       NA       NA Vladimir    Boris  3       John
#4: Vladimir       NA    Boris     John  4       Will

find_ancestors(table1, 5)
#   father_5 father_6 father_4 father_3 father_2 father_1 id individual
#1:       NA       NA       NA       NA       NA       NA  1   Vladimir
#2:       NA       NA       NA       NA       NA Vladimir  2      Boris
#3:       NA       NA       NA       NA Vladimir    Boris  3       John
#4:       NA       NA       NA Vladimir    Boris     John  4       Will

find_ancestors(table1, 2)
#   father_2 father_3 father_1 id individual
#1:       NA       NA       NA  1   Vladimir
#2:       NA       NA Vladimir  2      Boris
#3: Vladimir       NA    Boris  3       John
#4:    Boris Vladimir     John  4       Will
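Note that find_ancestors(table, n) returns n + 1 ancestor columns, in the order in which the joins produced them. As a possible finishing step, the father_k columns can be renamed and reordered to match table2. The sketch below restates the function so that it runs on its own, with list() written in place of .() inside setNames; the readable names and the res variable are assumptions for illustration.

```r
library(data.table)

table1 <- data.table(
    id = 1:4,
    father = c(NA, "Vladimir", "Boris", "John"),
    individual = c("Vladimir", "Boris", "John", "Will")
)

# Restated from the answer above so the example is self-contained.
find_ancestors <- function(table, n) {
    final <- copy(table)
    setnames(final, "father", "father_1")
    for (i in seq_len(n)) {
        name_up <- paste("father", i:(i + 1), sep = "_")
        final <- table[, setNames(list(individual, father), name_up)][final, on = name_up[1]]
    }
    final
}

res <- find_ancestors(table1, 2)   # columns father_1 .. father_3

# Give the generations readable names, then reorder by reference.
setnames(res,
         paste("father", 1:3, sep = "_"),
         c("father", "grandfather", "greatgrandfather"))
setcolorder(res, c("id", "greatgrandfather", "grandfather", "father", "individual"))
res
```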