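I have a data frame containing the child and parent fields for accounts from a GnuCash MySQL database. I want to store the account hierarchy in the data frame. In the past I used recursive joins in MySQL, but that gets cumbersome as the hierarchy deepens, and you also have to know in advance how many levels the tree has. I am hoping there is a simpler way to build the hierarchy in R (with or without knowing the maximum depth).
Sample data: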
account_id <- c(1:11)
account_name <- c('root_account','dining', 'food', 'discretionary_expense',
'expenses', 'base_salary_wife', 'base_salary_husband',
'base_salary', 'salary', 'taxable_income',
'income')
account_parentid <- c(NA,3,4,5,1,8,8,9,10,11,1)
test.data <- data.frame(account_id, account_name, account_parentid)
Desired output:
   account_id          account_name account_parentid lvl2_parentid lvl3_parentid lvl4_parentid lvls
1           1          root_account               NA            NA            NA            NA   NA
2           2                dining                3             4             5            NA    4
3           3                  food                4             5            NA            NA    3
4           4 discretionary_expense                5            NA            NA            NA    2
5           5              expenses                1            NA            NA            NA    1
6           6      base_salary_wife                8             9            10            11    5
7           7   base_salary_husband                8             9            10            11    5
8           8           base_salary                9            10            11            NA    4
9           9                salary               10            11            NA            NA    3
10         10        taxable_income               11            NA            NA            NA    2
11         11                income                1            NA            NA            NA    1
Answer 0 (score: 3)
You can use the data.tree package to work with hierarchical data:
Get the test data:
account_id <- c(1:11)
account_name <- c('root_account','dining', 'food', 'discretionary_expense',
'expenses', 'base_salary_wife', 'base_salary_husband',
'base_salary', 'salary', 'taxable_income',
'income')
account_parentid <- c(NA,3,4,5,1,8,8,9,10,11,1)
# id and parent id come first: FromDataFrameNetwork reads the first two columns as the edge
test.data <- data.frame(account_id, account_parentid, account_name, stringsAsFactors = FALSE)
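Convert to a data.tree structure:
library(data.tree)
# Drop the root row (its parent is NA); the remaining rows are child/parent edges
tree1 <- FromDataFrameNetwork(test.data[-1,])
# The root node is created implicitly from the edges, so set its account_name by hand
tree1$account_name <- 'root_account'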
Display:
ToDataFrameTree(tree1, account = 'name', 'account_name', 'pathString')
This displays the following:
               levelName account          account_name    pathString
1  1                           1          root_account             1
2   ¦--5                       5              expenses           1/5
3   ¦   °--4                   4 discretionary_expense         1/5/4
4   ¦       °--3               3                  food       1/5/4/3
5   ¦           °--2           2                dining     1/5/4/3/2
6   °--11                     11                income          1/11
7       °--10                 10        taxable_income       1/11/10
8           °--9               9                salary     1/11/10/9
9               °--8           8           base_salary   1/11/10/9/8
10                  ¦--6       6      base_salary_wife 1/11/10/9/8/6
11                  °--7       7   base_salary_husband 1/11/10/9/8/7
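The tree view above shows the hierarchy, but not the wide lvlN_parentid layout asked for in the question. As a rough sketch of one way to get there, the following base R code (no data.tree required) follows the parent ids one level at a time with match(). The helper parent_of and the variables wide, ancestor and depth are names made up for this illustration. It assumes a single root whose parent is NA and no cycles, and it follows the question's table in leaving the root out of the lvl2 and higher columns.

```r
# Sample data from the question
test.data <- data.frame(
  account_id       = 1:11,
  account_parentid = c(NA, 3, 4, 5, 1, 8, 8, 9, 10, 11, 1),
  account_name     = c("root_account", "dining", "food", "discretionary_expense",
                       "expenses", "base_salary_wife", "base_salary_husband",
                       "base_salary", "salary", "taxable_income", "income"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parent id for each id in `ids` (NA stays NA)
parent_of <- function(ids, df) {
  df$account_parentid[match(ids, df$account_id)]
}

# Assumption: the root is the account whose parent is NA
root_id <- test.data$account_id[is.na(test.data$account_parentid)]

wide     <- test.data
ancestor <- wide$account_parentid          # level 1: direct parents
depth    <- as.integer(!is.na(ancestor))   # number of edges up to the root

# At most nrow() steps, so a cyclic table cannot loop forever
for (lvl in seq(2, nrow(test.data))) {
  ancestor <- parent_of(ancestor, test.data)
  if (all(is.na(ancestor))) break          # every account has reached the top
  depth <- depth + as.integer(!is.na(ancestor))
  shown <- ancestor
  shown[shown %in% root_id] <- NA          # the question's table omits the root here
  if (any(!is.na(shown))) {
    wide[[paste0("lvl", lvl, "_parentid")]] <- shown
  }
}

wide$lvls <- ifelse(depth == 0, NA, depth) # the root itself gets NA
print(wide)
```

The number of lvlN_parentid columns is discovered while walking, so the maximum depth does not need to be known up front.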
Not part of the question, but the really interesting part comes when you want to aggregate over the hierarchy and so on. See the data.tree vignettes.
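To give a feel for what aggregating over the hierarchy means, here is a small sketch in plain base R rather than data.tree (the package has its own aggregation helpers, covered in its vignettes). It rolls each account's own balance up into every ancestor. The balance figures are invented purely for illustration, and the accounts, balance and total names are hypothetical.

```r
# Ids and parent ids from the question
accounts <- data.frame(
  account_id       = 1:11,
  account_parentid = c(NA, 3, 4, 5, 1, 8, 8, 9, 10, 11, 1),
  stringsAsFactors = FALSE
)

# Hypothetical balances (not from the original post): only the leaf accounts
# dining (2), base_salary_wife (6) and base_salary_husband (7) carry a value
accounts$balance <- 0
accounts$balance[accounts$account_id == 2] <- 120
accounts$balance[accounts$account_id == 6] <- 3000
accounts$balance[accounts$account_id == 7] <- 2500

# Add each account's own balance to itself and to every ancestor
accounts$total <- 0
for (i in seq_len(nrow(accounts))) {
  id    <- accounts$account_id[i]
  steps <- 0
  # The step cap guards against cycles in the parent links
  while (!is.na(id) && steps <= nrow(accounts)) {
    pos <- match(id, accounts$account_id)
    accounts$total[pos] <- accounts$total[pos] + accounts$balance[i]
    id    <- accounts$account_parentid[pos]
    steps <- steps + 1
  }
}

print(accounts)
```

With these made-up numbers the root total is the sum of all three leaf balances, while base_salary (id 8) only collects the two salary leaves.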