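Thanks in advance. I have a data frame of family members and their relationship to the head of household, and I want to count the number of unique combinations of family structure.
I can do this (possibly in a roundabout way) by converting the data to wide format and counting the rows, but that does not treat identical family structures listed in a different order as the same. For example: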
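library(plyr)   # provides ddply; load before dplyr so dplyr's count() is the one used below
library(dplyr)  # provides count(), ungroup() and %>%

# One row per family member; familyGroup identifies the household
familyMember <- c("son","son","Head of household","daughter","grandmother","Head of household","son",
                  "Head of household","son","son","daughter","grandmother","Head of household","son")
familyGroup <- c(1,1,1,2,2,2,2,3,3,3,4,4,4,4)
families <- data.frame(familyMember,familyGroup)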
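Note that familyGroups '2' and '4' are exactly the same family structure in the same order, while familyGroups '1' and '3' are the same family structure in a different order. I then use ddply (from plyr) to create an index that numbers the family members within each family group: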
familiesIndex <- ddply(families, .(familyGroup), mutate,
index = paste0('family', 1:length(familyGroup)))
Next I reshape to wide format:
familiesIndex_reshape <- reshape(familiesIndex, idvar = "familyGroup", timevar="index", direction = "wide")
Finally, I use count to get the number of unique combinations:
familiesIndex_reshape_Unique <- count(familiesIndex_reshape,
familyMember.family1,
familyMember.family2,
familyMember.family3,
familyMember.family4) %>% ungroup()
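This puts family groups 1 and 3 into separate groups. I would like these two groups to be counted as the same structure regardless of the order of their members. Many thanks again.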
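To make the goal concrete, here is a minimal base R sketch of the kind of order-independent count I am after. It is only one possible approach, not necessarily the best one: it sorts the members within each family, joins them into a single key, and tabulates the keys. The names `structureKey` and `structureCounts` and the " | " separator are made up for illustration, and the sketch assumes no member label contains that separator.

```r
# Same example data as above
familyMember <- c("son", "son", "Head of household",
                  "daughter", "grandmother", "Head of household", "son",
                  "Head of household", "son", "son",
                  "daughter", "grandmother", "Head of household", "son")
familyGroup <- c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4)
families <- data.frame(familyMember, familyGroup)

# Split the members by family, sort each family's members so that order
# no longer matters, and collapse them into one key string per family
# (assumption: " | " never appears inside a member label)
structureKey <- vapply(
  split(as.character(families$familyMember), families$familyGroup),
  function(members) paste(sort(members), collapse = " | "),
  character(1)
)

# Number of families sharing each distinct structure
structureCounts <- table(structureKey)
print(structureCounts)
```

With the example data, families 1 and 3 end up with the same key, as do families 2 and 4, so there are two distinct structures with two families each.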