How to add a family number to a parent-child table

Time: 2018-04-05 10:54:09

Tags: r recursion tree dplyr tidyverse

My data: information about 2 families, "Guillou" and "Cleach": the name of the person (person), the name of the father (father) and the rank in the family (level). In my real data the names matter: I use numeric ids to avoid problems with homonyms.

    person           father          level
    Guillou Arthur   NA              1
    Cleach Marc      NA              1
    Guillou Eric     Guillou Arthur  2
    Guillou Jacques  Guillou Arthur  2
    Cleach Franck    Cleach Marc     2
    Cleach Leo       Cleach Marc     2
    Cleach Herbet    Cleach Leo      3
    Cleach Adele     Cleach Herbet   4
    Guillou Jean     Guillou Eric    3
    Guillou Alan     Guillou Eric    3

This data frame is based on @Moody_Mudskipper's answer in this post (a previous question about levels in a family tree). Here are the instructions that return the table:

Data:

    person <- c("Guillou Arthur",
                "Cleach Marc",
                "Guillou Eric",
                "Guillou Jacques",
                "Cleach Franck",
                "Cleach Leo",
                "Cleach Herbet",
                "Cleach Adele",
                "Guillou Jean",
                "Guillou Alan")
    father <- c(NA, NA, "Guillou Arthur", "Guillou Arthur", "Cleach Marc", "Cleach Marc",
                "Cleach Leo", "Cleach Herbet", "Guillou Eric", "Guillou Eric")
    family <- data.frame(person, father, stringsAsFactors = FALSE)

Recursive function:

    father_line <- function(x) {
      dad <- subset(family, person == x)$father
      if (is.na(dad)) return(x)
      c(x, father_line(dad))
    }

Example of the function's output:

    father_line("Guillou Alan")
    # "Guillou Alan" "Guillou Eric" "Guillou Arthur"

Table:

    library(tidyverse)
    family %>%
      mutate(family_line = map(person, father_line),
             level = lengths(family_line),
             patriarch = map(family_line, last)) %>%
      select(person, father, level)

My question: how can I distinguish the two families based on the person/father relationships? Note that I cannot use the family name: in my reproducible example the families have different names, but in reality they do not.

Expected output:

    person           father          level  family
    Guillou Arthur   NA              1      1
    Cleach Marc      NA              1      2
    Guillou Eric     Guillou Arthur  2      1
    Guillou Jacques  Guillou Arthur  2      1
    Cleach Franck    Cleach Marc     2      2
    Cleach Leo       Cleach Marc     2      2
    Cleach Herbet    Cleach Leo      3      2
    Cleach Adele     Cleach Herbet   4      2
    Guillou Jean     Guillou Eric    3      1
    Guillou Alan     Guillou Eric    3      1

With ids:

    # data
    person <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    father <- c(NA, NA, 1, 1, 2, 2, 6, 7, 3, 3)
    family <- data.frame(person, father)

    # function
    father_line <- function(x) {
      dad <- subset(family, person == x)$father
      if (is.na(dad)) return(x)
      c(x, father_line(dad))
    }

    library(tidyverse)
    family %>%
      mutate(family_line = map(person, father_line),
             level = lengths(family_line),
             patriarch = map(family_line, last)) %>%
      select(person, father, level)

1 answer:

Answer 0: (score: 4)

You may want to look at the igraph package.

Before using it, you need to replace the NAs. I assume that two people from the same family cannot both have NA as their father. So:

roots <- family[is.na(family)] <- seq(sum(is.na(family)))
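
To make the effect of that chained assignment concrete, here is a small base-R sketch on a three-row excerpt of the question's data. The name `family_small` is only illustrative and does not appear in the original post.

```r
# Three-row excerpt of the question's data (family_small is an illustrative name)
family_small <- data.frame(
  person = c("Guillou Arthur", "Cleach Marc", "Guillou Eric"),
  father = c(NA, NA, "Guillou Arthur"),
  stringsAsFactors = FALSE
)

# Chained assignment: the sequence 1, 2, ... overwrites the NA cells
# and the same sequence is also stored in `roots`
roots <- family_small[is.na(family_small)] <- seq(sum(is.na(family_small)))

roots                # one placeholder per person without a known father
family_small$father  # the former NAs now hold the placeholders, coerced to character
```

When persons are numeric ids, as in the second version of the question, these placeholders would coincide with the ids 1 and 2, so a distinct label (a prefixed string, for example) may be safer. Treat that as an assumption to check against your own data.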

Then you create a graph (with the different connections); the first column needs to be the father:

library(igraph)
family_tree <- graph_from_data_frame(family[, 2:1])

You can look at it:

plot(family_tree)

Then you can use distances to compute the level and the family from the roots:

tab_roots <- sapply(roots, function(root) distances(family_tree, family$person, root))
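
To picture what such a matrix holds, here is a toy base-R sketch with hand-written values (not computed by igraph): one row per person, one column per root, and `Inf` where a person cannot be reached from a root. The names `tab_toy`, `level` and `family_id` are illustrative only.

```r
# Hand-written distance matrix: rows are persons, columns are roots.
# Inf means the person is not connected to that root.
tab_toy <- matrix(
  c(1,   Inf,
    Inf, 1,
    2,   Inf,
    Inf, 3),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("Guillou Arthur", "Cleach Marc", "Guillou Eric", "Cleach Herbet"),
    c("root1", "root2")
  )
)

# The only finite distance in a row is the person's level
level <- apply(tab_toy, 1, min)

# The column holding that finite distance identifies the family
family_id <- apply(tab_toy, 1, function(d) which(is.finite(d)))

data.frame(level, family_id)
```

This relies on each person being connected to exactly one root; a person reachable from two roots would yield two indices per row.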

Then you add them to the family data.frame:

family$level <- apply(tab_roots, 1, min)
family$family <- apply(tab_roots, 1, function(d) which(d!=Inf))
family
#            person         father level family
#1   Guillou Arthur              1     1      1
#2      Cleach Marc              2     1      2
#3     Guillou Eric Guillou Arthur     2      1
#4  Guillou Jacques Guillou Arthur     2      1
#5    Cleach Franck    Cleach Marc     2      2
#6       Cleach Leo    Cleach Marc     2      2
#7    Cleach Herbet     Cleach Leo     3      2
#8     Cleach Adele  Cleach Herbet     4      2
#9     Guillou Jean   Guillou Eric     3      1
#10    Guillou Alan   Guillou Eric     3      1
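
As a possible alternative that the answer above does not use, igraph's `components()` labels the connected parts of a graph directly, which avoids introducing placeholder roots. The sketch below assumes the id version of the question's data and that each separate tree is one family. Dropping the rows with an NA father and passing a `vertices` data frame are choices made here, not part of the original answer.

```r
library(igraph)

# Id version of the question's data
family <- data.frame(
  person = 1:10,
  father = c(NA, NA, 1, 1, 2, 2, 6, 7, 3, 3)
)

# Edges go from father to child; rows without a known father add no edge
edges <- family[!is.na(family$father), c("father", "person")]

# Listing every person as a vertex keeps childless patriarchs in the graph
# and fixes the vertex order to the row order of `family`
family_tree <- graph_from_data_frame(
  edges,
  vertices = data.frame(name = family$person)
)

# Each weakly connected component is treated as one family
comp <- components(family_tree, mode = "weak")

# Vertices follow the row order, so membership lines up with the rows
family$family <- unname(comp$membership)
family
```

The component numbers are arbitrary labels: what matters is that persons in the same tree share a label and persons in different trees do not.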