How can I elegantly convert table1 (which has a vertical relationship) into table2 (which has a horizontal relationship)?
library(data.table)
# vertical relationship
table1 <- data.table(id=1:4,father=c(NA,"Vladimir","Boris","John"),individual=c("Vladimir","Boris","John","Will"))
table1
# horizontal relationship
table2 <- data.table(id=1:4,greatgrandfather= c(NA,NA,NA,"Vladimir"), grandfather=c(NA,NA,"Vladimir","Boris"),father=c(NA,"Vladimir","Boris","John"),individual=c("Vladimir","Boris","John","Will"))
table2
This is the ugly solution I came up with:
# ugly solution
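# Lookup table mapping each father to his own father (the grandfather)
temporary.table <- table1[,.(father,individual)]
names(temporary.table)<- c("grandfather","father")
new.table <- merge(table1,temporary.table,by="father",all.x=TRUE)
# Reuse the lookup one generation up: grandfather to great-grandfather
names(temporary.table)<- c("greatgrandfather","grandfather")
new.table <- merge(new.table,temporary.table,by="grandfather",all.x=TRUE)
# Restore the original row order and arrange the columns like table2
ugly.solution <- new.table[order(id)][,.(id,greatgrandfather,grandfather,father,individual)]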
ugly.solution
Answer 0 (score: 2)
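1) Reduce. Define a function father_of that looks up the father of each element of its argument. Also define nms, the column names of the output (excluding "id"). Then use Reduce to apply father_of repeatedly, keeping one generation per step. Finally, put everything together in a data.table. Note that simply by changing nms we can have more or fewer ancestors in the result.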
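# Father of each element of x (NA when unknown); extra arguments are ignored
father_of <- function(x, ...) table1[, father[match(x, individual)] ]
# Output column names, oldest generation first ("id" is added separately)
nms <- c("greatgrandfather", "grandfather", "father", "individual")
# Each step applies father_of to the previous result; accumulate = TRUE keeps every generation
r <- Reduce(father_of, nms[-1], init = table1$individual, accumulate = TRUE)
table1[, c(.(id = id), setNames(rev(r), nms))]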
giving:
   id greatgrandfather grandfather   father individual
1:  1               NA          NA       NA   Vladimir
2:  2               NA          NA Vladimir      Boris
3:  3               NA    Vladimir    Boris       John
4:  4         Vladimir       Boris     John       Will
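To illustrate the remark that nms alone controls how many ancestors appear, here is a minimal sketch. The wrapper ancestors_wide is a hypothetical name that is not part of the original answer, and it assumes nms has at least two entries ordered from the oldest ancestor down to "individual".

```r
library(data.table)

table1 <- data.table(
  id = 1:4,
  father = c(NA, "Vladimir", "Boris", "John"),
  individual = c("Vladimir", "Boris", "John", "Will")
)

# Same lookup as in the answer: father of each element of x (NA if unknown).
father_of <- function(x, ...) table1[, father[match(x, individual)]]

# Hypothetical wrapper (not in the original answer): builds the wide table
# for any vector of column names ordered from oldest ancestor to "individual".
# Assumes length(nms) >= 2.
ancestors_wide <- function(nms) {
  r <- Reduce(father_of, nms[-1], init = table1$individual, accumulate = TRUE)
  as.data.table(c(list(id = table1$id), setNames(rev(r), nms)))
}

# Fewer generations: only the father column.
two_gen <- ancestors_wide(c("father", "individual"))

# More generations: one level above the great-grandfather.
five_gen <- ancestors_wide(c("greatgreatgrandfather", "greatgrandfather",
                             "grandfather", "father", "individual"))
```

With this sample data nobody has a known great-great-grandfather, so that extra column contains only NA.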
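2) Recursion. An alternative that uses the same father_of and nms definitions is to recurse explicitly in a function rec. As before, the length of nms controls the number of generations.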
rec <- function(x, n) if (ncol(x) == n) x else Recall(cbind(father_of(x[[1]]), x), n)
r <- rec(table1[, .(individual)], length(nms))
table1[, c(.(id = id), setNames(r, nms))]
giving:
   id greatgrandfather grandfather   father individual
1:  1               NA          NA       NA   Vladimir
2:  2               NA          NA Vladimir      Boris
3:  3               NA    Vladimir    Boris       John
4:  4         Vladimir       Boris     John       Will
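The way the recursion in (2) terminates may be easier to follow on a plain list of columns. The sketch below is a variation on rec, not the answer's own code: rec_list is a hypothetical helper that prepends one generation per call and stops once the list holds length(nms) columns.

```r
library(data.table)

table1 <- data.table(
  id = 1:4,
  father = c(NA, "Vladimir", "Boris", "John"),
  individual = c("Vladimir", "Boris", "John", "Will")
)

father_of <- function(x, ...) table1[, father[match(x, individual)]]
nms <- c("greatgrandfather", "grandfather", "father", "individual")

# Hypothetical helper: same idea as rec, but on a plain list of vectors.
rec_list <- function(cols, n) {
  # Base case: enough generations have been collected.
  if (length(cols) == n) return(cols)
  # Recursive case: put the fathers of the current oldest generation in front.
  Recall(c(list(father_of(cols[[1]])), cols), n)
}

cols <- rec_list(list(table1$individual), length(nms))
wide <- as.data.table(c(list(id = table1$id), setNames(cols, nms)))
```

Because each call adds exactly one column, the recursion depth equals length(nms) - 1.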
Update: Fixed. Added (2).
Answer 1 (score: 1)
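I don't think your solution is that ugly, but you could make the renaming step more explicit. Here is a way to rewrite the merge calls using data.table join syntax; chaining the two joins together saves a few variable assignments: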
table1[, .(greatgrandfather = father, grandfather = individual)][
  table1[, .(grandfather = father, father = individual)][
    table1, on = .(father)
  ],
  on = .(grandfather)
]
#   greatgrandfather grandfather   father id individual
#1:               NA          NA       NA  1   Vladimir
#2:               NA          NA Vladimir  2      Boris
#3:               NA    Vladimir    Boris  3       John
#4:         Vladimir       Boris     John  4       Will
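For a single generation, the correspondence between merge(..., all.x = TRUE) and the X[Y, on = ] join form can be checked directly. This is a small sketch; lookup, via_merge and via_join are illustrative names, not taken from the answer.

```r
library(data.table)

table1 <- data.table(
  id = 1:4,
  father = c(NA, "Vladimir", "Boris", "John"),
  individual = c("Vladimir", "Boris", "John", "Will")
)

# Lookup table: each person (as "father") together with his own father.
lookup <- table1[, .(grandfather = father, father = individual)]

# Left join written with merge(): keeps every row of table1.
via_merge <- merge(table1, lookup, by = "father", all.x = TRUE)

# The same left join in join syntax: lookup[table1] keeps every row of table1.
via_join <- lookup[table1, on = "father"]

# Compare the looked-up column after putting both results in id order.
identical(via_merge[order(id)]$grandfather, via_join[order(id)]$grandfather)
```

The join form returns rows in the order of table1, whereas merge sorts by the key column, which is why the comparison orders both results by id first.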
If you need more generations than is practical to handle by writing the joins by hand, you can chain the joins in a for loop:
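find_ancestors <- function(table, n) {
  # Work on a copy so the caller's table is not renamed by reference
  final <- copy(table)
  setnames(final, 'father', 'father_1')
  for (i in seq_len(n)) {
    # Join column (generation i) and the new column (generation i + 1)
    name_up <- paste('father', i:(i+1), sep = "_")
    # Look up the father of every generation-i ancestor
    final <- table[, setNames(.(individual, father), name_up)][final, on = name_up[1]]
  }
  final
}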
find_ancestors(table1, 3)
#   father_3 father_4 father_2 father_1 id individual
#1:       NA       NA       NA       NA  1   Vladimir
#2:       NA       NA       NA Vladimir  2      Boris
#3:       NA       NA Vladimir    Boris  3       John
#4: Vladimir       NA    Boris     John  4       Will
find_ancestors(table1, 5)
#   father_5 father_6 father_4 father_3 father_2 father_1 id individual
#1:       NA       NA       NA       NA       NA       NA  1   Vladimir
#2:       NA       NA       NA       NA       NA Vladimir  2      Boris
#3:       NA       NA       NA       NA Vladimir    Boris  3       John
#4:       NA       NA       NA Vladimir    Boris     John  4       Will
find_ancestors(table1, 2)
#   father_2 father_3 father_1 id individual
#1:       NA       NA       NA  1   Vladimir
#2:       NA       NA Vladimir  2      Boris
#3: Vladimir       NA    Boris  3       John
#4:    Boris Vladimir     John  4       Will
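The columns above come out in join order rather than generation order. If a tidier layout is wanted, one option is to reorder them afterwards with setcolorder. The following is a sketch under the assumption that the result contains the columns father_1 through father_(n + 1), as the outputs above suggest; it repeats find_ancestors so that it runs on its own.

```r
library(data.table)

table1 <- data.table(
  id = 1:4,
  father = c(NA, "Vladimir", "Boris", "John"),
  individual = c("Vladimir", "Boris", "John", "Will")
)

# find_ancestors as in the answer, with list() in place of the .() alias.
find_ancestors <- function(table, n) {
  final <- copy(table)
  setnames(final, "father", "father_1")
  for (i in seq_len(n)) {
    name_up <- paste("father", i:(i + 1), sep = "_")
    final <- table[, setNames(list(individual, father), name_up)][final, on = name_up[1]]
  }
  final
}

n <- 2
res <- find_ancestors(table1, n)

# Oldest generation first, then down to the individual (reorders by reference).
setcolorder(res, c("id", paste("father", (n + 1):1, sep = "_"), "individual"))
names(res)
```

setcolorder modifies res in place, so no reassignment is needed.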