I want to summarize this data frame, in which each Family Size has six Hours Worked categories.
families <- structure(list(`Family Size` = c(2L, 2L, 2L, 2L, 2L, 2L, 2L,13L, 13L, 13L), HoursLess20 = c("1,014", "1,041", "11", "3","1", "2", "1", "0", "0", "0"), Hours2024 = c(7L, 298L, 1L, 0L,0L, 0L, 0L, 0L, 0L, 0L), Hours2529 = c(1L, 34L, 0L, 0L, 0L, 0L,0L, 0L, 0L, 0L), Hours3034 = c(6L, 44L, 1L, 0L, 0L, 0L, 0L, 0L,0L, 0L), Hours3539 = c(4L, 46L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Hours40plus = c(9L, 128L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("Family Size","HoursLess20", "Hours2024", "Hours2529", "Hours3034", "Hours3539","Hours40plus"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1977L,1978L, 1979L), class = "data.frame")
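Answer 0 (score: 1)
First, the values in HoursLess20 are currently stored as character strings (because of the commas). To do any kind of numeric aggregation, you will need to remove the commas and convert the column to numeric.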
families$HoursLess20 = as.numeric(gsub(",", "", families$HoursLess20))
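To see why the commas have to go first, here is a minimal sketch on a made-up vector `x` (a hypothetical stand-in for the HoursLess20 column, not the original data). It compares a direct conversion with the `gsub` approach used above.

```r
# Hypothetical sample values in the same format as HoursLess20
x <- c("1,014", "1,041", "11", "0")

# Direct conversion yields NA (with a warning) for any value containing a comma
direct <- suppressWarnings(as.numeric(x))

# Removing the commas first gives the intended numbers
cleaned <- as.numeric(gsub(",", "", x, fixed = TRUE))

print(direct)
print(cleaned)
```

Checking the converted column with `anyNA()` is a cheap way to confirm that nothing was silently turned into `NA`.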
Once that is done, you can simply use the aggregate function to compute whatever summary you want for each Family Size.
## Sum
aggregate(families[,-1], list(families[,1]), sum)
  Group.1 HoursLess20 Hours2024 Hours2529 Hours3034 Hours3539 Hours40plus
1       2        2073       306        35        51        50         138
2      13           0         0         0         0         0           0
## Average
aggregate(families[,-1], list(families[,1]), mean)
  Group.1 HoursLess20 Hours2024 Hours2529 Hours3034 Hours3539 Hours40plus
1       2    296.1429  43.71429         5  7.285714  7.142857    19.71429
2      13      0.0000   0.00000         0  0.000000  0.000000     0.00000
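The grouping column comes out as Group.1 because the list passed to aggregate is unnamed. As a small sketch, assuming a made-up data frame `demo` shaped like `families` (its values and the label `FamilySize` are illustrative only), naming the list element gives the column a readable header:

```r
# Small made-up data frame shaped like `families` (values are illustrative only)
demo <- data.frame(
  `Family Size` = c(2L, 2L, 13L),
  HoursLess20   = c(1014, 11, 0),
  Hours2024     = c(7L, 1L, 0L),
  check.names   = FALSE  # keep the space in "Family Size"
)

# Naming the list element labels the grouping column instead of "Group.1"
totals <- aggregate(demo[, -1],
                    by  = list(FamilySize = demo[["Family Size"]]),
                    FUN = sum)

print(totals)
```

The same pattern works with `mean` or any other function that reduces a vector to a single value.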