I am working on family trees.
I adapted Bob Horton's example, which is based on sqldf: https://www.r-bloggers.com/exploring-recursive-ctes-with-sqldf/
My data:
person           father
Guillou Arthur   NA
Cleach Marc      NA
Guillou Eric     Guillou Arthur
Guillou Jacques  Guillou Arthur
Cleach Franck    Cleach Marc
Cleach Leo       Cleach Marc
Cleach Herbet    Cleach Leo
Cleach Adele     Cleach Herbet
Guillou Jean     Guillou Eric
Guillou Alan     Guillou Eric
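My result: the descendants of "Guillou Arthur" (a top-level person with no father), ordered by level. A recursive query with sqldf can build this table: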
name             parent_name     level
Guillou Arthur   NA              1
Guillou Eric     Guillou Arthur  2
Guillou Jacques  Guillou Arthur  2
Guillou Alan     Guillou Eric    3
Guillou Jean     Guillou Eric    3
Data:
person <- c("Guillou Arthur",
"Cleach Marc",
"Guillou Eric",
"Guillou Jacques",
"Cleach Franck",
"Cleach Leo",
"Cleach Herbet",
"Cleach Adele",
"Guillou Jean",
"Guillou Alan" )
father <- c(NA, NA, "Guillou Arthur" , "Guillou Arthur", "Cleach Marc", "Cleach Marc", "Cleach Leo", "Cleach Herbet", "Guillou Eric", "Guillou Eric")
family <- data.frame(person, father)
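Wide-to-long format conversion, done before the recursive query that finds the descendants of "Guillou Arthur" (a top-level person with no father):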
library(tidyr)
long_family <- gather(family, parent, parent_name, -person)
long_family
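For reference, a recursive query of this kind can be sketched with sqldf as follows. This is a hedged sketch, not the query from the linked post: it assumes the sqldf package with its default SQLite backend (which supports WITH RECURSIVE), it queries the wide `family` table directly rather than `long_family`, and the CTE name `tree` and the aliases `f` and `d` are illustrative choices.

```r
library(sqldf)

# Same data as in the question, rebuilt here so the sketch is self-contained.
family <- data.frame(
  person = c("Guillou Arthur", "Cleach Marc", "Guillou Eric", "Guillou Jacques",
             "Cleach Franck", "Cleach Leo", "Cleach Herbet", "Cleach Adele",
             "Guillou Jean", "Guillou Alan"),
  father = c(NA, NA, "Guillou Arthur", "Guillou Arthur", "Cleach Marc",
             "Cleach Marc", "Cleach Leo", "Cleach Herbet", "Guillou Eric",
             "Guillou Eric"),
  stringsAsFactors = FALSE
)

# Anchor row: the chosen patriarch at level 1.
# Recursive step: every person whose father is already in the tree, one level deeper.
descendants <- sqldf("
  WITH RECURSIVE tree (name, parent_name, level) AS (
    SELECT person, father, 1
    FROM family
    WHERE person = 'Guillou Arthur'
    UNION ALL
    SELECT f.person, f.father, d.level + 1
    FROM family f
    JOIN tree d ON f.father = d.name
  )
  SELECT name, parent_name, level
  FROM tree
  ORDER BY level, name
")

descendants
```

Changing the name in the WHERE clause to another patriarch, such as 'Cleach Marc', would return that family's tree instead.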
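My question:
How can I build, directly in R (rather than in SQL), a data.frame object that contains all the family trees?
Each tree starts from a patriarch (a person with no father), such as "Cleach Marc". (Either an R approach or an sqldf approach is fine.)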
Answer 0 (score: 2)
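We build a recursive function that returns the father line of a person; from there everything is easy.
First, we define the data with stringsAsFactors = FALSE so that reformatting goes more smoothly.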
family <- data.frame(person, father,stringsAsFactors = FALSE)
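The function:
father_line <- function(x){
  # father of x (NA for a patriarch)
  dad <- subset(family,person==x)$father
  # stop at the top of the line
  if(is.na(dad)) return(x)
  # otherwise prepend x to the father's own line
  c(x,father_line(dad))
}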
father_line ("Guillou Alan")
# [1] "Guillou Alan" "Guillou Eric" "Guillou Arthur"
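The recursive function above walks one person at a time. As an alternative, here is a minimal non-recursive sketch that climbs every father line at once with `match()`. The variable names `up`, `climb`, `level` and `patriarch` are illustrative, and the loop assumes the data contains no cycles (nobody is their own ancestor), otherwise it would not terminate.

```r
# Same data as in the question, rebuilt here so the sketch is self-contained.
family <- data.frame(
  person = c("Guillou Arthur", "Cleach Marc", "Guillou Eric", "Guillou Jacques",
             "Cleach Franck", "Cleach Leo", "Cleach Herbet", "Cleach Adele",
             "Guillou Jean", "Guillou Alan"),
  father = c(NA, NA, "Guillou Arthur", "Guillou Arthur", "Cleach Marc",
             "Cleach Marc", "Cleach Leo", "Cleach Herbet", "Guillou Eric",
             "Guillou Eric"),
  stringsAsFactors = FALSE
)

patriarch <- family$person            # oldest known ancestor found so far
level     <- rep(1L, nrow(family))    # everyone starts at level 1
up        <- family$father            # next ancestor to climb to, NA at the top

while (any(!is.na(up))) {
  climb <- !is.na(up)                 # rows that still have an ancestor above
  patriarch[climb] <- up[climb]
  level[climb]     <- level[climb] + 1L
  # move one generation up: look up the father of the current ancestor
  up[climb] <- family$father[match(up[climb], family$person)]
}

data.frame(person = family$person, level = level, patriarch = patriarch)
```

The loop runs once per generation rather than once per person, which may matter on larger tables.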
Use it to get the level and the other columns:
family$father_line <- lapply(family$person,father_line)
family$level <- lengths(family$father_line)
family$patriarch <- sapply(family$father_line,tail,1)
#             person         father                                          father_line level      patriarch
#  1  Guillou Arthur           <NA>                                       Guillou Arthur     1 Guillou Arthur
#  2     Cleach Marc           <NA>                                          Cleach Marc     1    Cleach Marc
#  3    Guillou Eric Guillou Arthur                         Guillou Eric, Guillou Arthur     2 Guillou Arthur
#  4 Guillou Jacques Guillou Arthur                      Guillou Jacques, Guillou Arthur     2 Guillou Arthur
#  5   Cleach Franck    Cleach Marc                           Cleach Franck, Cleach Marc     2    Cleach Marc
#  6      Cleach Leo    Cleach Marc                              Cleach Leo, Cleach Marc     2    Cleach Marc
#  7   Cleach Herbet     Cleach Leo               Cleach Herbet, Cleach Leo, Cleach Marc     3    Cleach Marc
#  8    Cleach Adele  Cleach Herbet Cleach Adele, Cleach Herbet, Cleach Leo, Cleach Marc     4    Cleach Marc
#  9    Guillou Jean   Guillou Eric           Guillou Jean, Guillou Eric, Guillou Arthur     3 Guillou Arthur
# 10    Guillou Alan   Guillou Eric           Guillou Alan, Guillou Eric, Guillou Arthur     3 Guillou Arthur
For example, to get the expected output given in the question:
subset(family,patriarch == "Guillou Arthur",select=c(person,father,level))
#             person         father level
#  1  Guillou Arthur           <NA>     1
#  3    Guillou Eric Guillou Arthur     2
#  4 Guillou Jacques Guillou Arthur     2
#  9    Guillou Jean   Guillou Eric     3
# 10    Guillou Alan   Guillou Eric     3
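The question asks for all the family trees, not only the one for "Guillou Arthur". A minimal sketch of one way to get them, assuming the same `level` and `patriarch` columns as above, is to `split()` the table by patriarch. The list name `trees` is illustrative, and `vapply()` is used here in place of `sapply()` only to guarantee a character result.

```r
# Same data and recursive helper as above, rebuilt so the sketch is self-contained.
family <- data.frame(
  person = c("Guillou Arthur", "Cleach Marc", "Guillou Eric", "Guillou Jacques",
             "Cleach Franck", "Cleach Leo", "Cleach Herbet", "Cleach Adele",
             "Guillou Jean", "Guillou Alan"),
  father = c(NA, NA, "Guillou Arthur", "Guillou Arthur", "Cleach Marc",
             "Cleach Marc", "Cleach Leo", "Cleach Herbet", "Guillou Eric",
             "Guillou Eric"),
  stringsAsFactors = FALSE
)

father_line <- function(x) {
  dad <- family$father[family$person == x]
  if (is.na(dad)) return(x)
  c(x, father_line(dad))
}

lines <- lapply(family$person, father_line)
family$level     <- lengths(lines)
family$patriarch <- vapply(lines, function(l) l[length(l)], character(1))

# One data.frame per patriarch, each ordered by level.
trees <- split(family[c("person", "father", "level")], family$patriarch)
trees <- lapply(trees, function(tr) tr[order(tr$level), ])

names(trees)
trees[["Cleach Marc"]]
```

If a single data.frame is preferred over a list, keeping the `patriarch` column and sorting with `family[order(family$patriarch, family$level), ]` gives the same information in one table.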
The tidyverse way is as follows:
library(tidyverse)
family %>%
  mutate(family_line = map(person, father_line),
         level = lengths(family_line),
         # map_chr keeps patriarch a plain character column rather than a list column
         patriarch = map_chr(family_line, last)) %>%
  filter(patriarch == "Guillou Arthur") %>%
  select(person, father, level)
#            person         father level
# 1  Guillou Arthur           <NA>     1
# 2    Guillou Eric Guillou Arthur     2
# 3 Guillou Jacques Guillou Arthur     2
# 4    Guillou Jean   Guillou Eric     3
# 5    Guillou Alan   Guillou Eric     3
Answer 1 (score: 1)
You can do this with graph tools. With igraph, you can use the ego function to get the neighbours of each vertex.
A quick sketch (needs checking!):
library(igraph)
family[] = lapply(family, factor, levels=unique(unlist(family)))
g = graph_from_adjacency_matrix(table(family))
cg = connect.neighborhood(g, order=length(V(g)), mode="out")
cbind( V(cg)$name,
sapply(ego(g, mode="out", mindist=1), function(x) replace(names(x), length(names(x))==0, NA)),
ego_size(cg, mode="out") )[grep("Guillou", V(cg)$name),]
[,1] [,2] [,3]
[1,] "Guillou Arthur" NA "1"
[2,] "Guillou Eric" "Guillou Arthur" "2"
[3,] "Guillou Jacques" "Guillou Arthur" "2"
[4,] "Guillou Jean" "Guillou Eric" "3"
[5,] "Guillou Alan" "Guillou Eric" "3"
In fact, you probably do not need to create the neighbourhood graph at all and can use:
cbind( V(g)$name,
sapply(ego(g, mode="out", mindist=1), function(x) replace(names(x), length(names(x))==0, NA)),
ego_size(g, mode="out", order=length(V(g))) )[grep("Cleach", V(g)$name),]
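To collect every tree in a single data.frame with the graph approach, one option is to measure the distance from each patriarch to the people reachable from him. The sketch below is only an illustration under stated assumptions: it builds the graph with `graph_from_data_frame()` and edges pointing from father to child (the opposite direction to the adjacency-matrix graph above), and the names `edges`, `patriarchs`, `trees` and `all_trees` are hypothetical.

```r
library(igraph)

# Same data as in the question, rebuilt here so the sketch is self-contained.
family <- data.frame(
  person = c("Guillou Arthur", "Cleach Marc", "Guillou Eric", "Guillou Jacques",
             "Cleach Franck", "Cleach Leo", "Cleach Herbet", "Cleach Adele",
             "Guillou Jean", "Guillou Alan"),
  father = c(NA, NA, "Guillou Arthur", "Guillou Arthur", "Cleach Marc",
             "Cleach Marc", "Cleach Leo", "Cleach Herbet", "Guillou Eric",
             "Guillou Eric"),
  stringsAsFactors = FALSE
)

# Edge list father -> person; patriarchs (NA father) contribute no edge.
edges <- family[!is.na(family$father), c("father", "person")]
g <- graph_from_data_frame(edges, directed = TRUE,
                           vertices = data.frame(name = family$person))

patriarchs <- family$person[is.na(family$father)]

trees <- lapply(patriarchs, function(p) {
  # Hop count from the patriarch to every vertex; Inf means "not a descendant".
  d <- distances(g, v = p, mode = "out")[1, ]
  reach <- is.finite(d)
  members <- names(d)[reach]
  data.frame(
    person    = members,
    father    = family$father[match(members, family$person)],
    level     = as.integer(d[reach]) + 1L,   # the patriarch himself is level 1
    patriarch = p,
    stringsAsFactors = FALSE
  )
})

all_trees <- do.call(rbind, trees)
all_trees
```

Filtering `all_trees` on the `patriarch` column then gives the table for a single family.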