I need to regroup the age categories in the following data frame:
gender age population
H 0-4 5
H 5-9 5
H 10-14 10
H 15-19 15
H 20-24 15
H 25-29 10
M 0-4 0
M 5-9 5
M 10-14 5
M 15-19 15
M 20-24 10
M 25-29 15
I prefer dplyr, so I would appreciate a way to do this with that package.
Answer 0 (score: 7)
Split the age string with tidyr::separate(), then bin the result with cut().
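A minimal sketch of the tidyr::separate() plus cut() approach, not necessarily the answerer's exact code. It assumes the data frame is named `dat`, that `age` is a character column, and that the target groups are 0-14, 15-19 and 20-29 (the breaks used in the data.table answer). The column names `lower` and `upper` are illustrative, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Example data from the question (the name `dat` is an assumption)
dat <- data.frame(
  gender = rep(c("H", "M"), each = 6),
  age = rep(c("0-4", "5-9", "10-14", "15-19", "20-24", "25-29"), times = 2),
  population = c(5, 5, 10, 15, 15, 10, 0, 5, 5, 15, 10, 15),
  stringsAsFactors = FALSE
)

res <- dat %>%
  # Split "0-4" into two columns; convert = TRUE turns them into integers
  separate(age, into = c("lower", "upper"), sep = "-", convert = TRUE) %>%
  # Bin on the lower bound of each original age band (breaks are assumed)
  mutate(age = cut(lower, breaks = c(0, 14, 19, 29),
                   include.lowest = TRUE,
                   labels = c("0-14", "15-19", "20-29"))) %>%
  # Sum the population within each gender and new age group
  group_by(gender, age) %>%
  summarise(population = sum(population), .groups = "drop")

print(res)
```

Binning on the lower bound works here because every original band falls entirely inside one of the new groups; with different breaks, a band could straddle two groups and would need separate handling.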
Answer 1 (score: 0)
A data.table solution, where dat is the table:
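library(data.table)
dat <- as.data.table(dat)
# Lower bound of each age band, e.g. "10-14" -> 10 (age must be character)
dat[ , mn := as.numeric(sapply(strsplit(age, "-"), "[[", 1))]
# Bin the lower bounds into the new groups: [0,14], (14,19], (19,29]
dat[ , age := cut(mn, breaks = c(0, 14, 19, 29),
                  include.lowest = TRUE,
                  labels = c("0-14", "15-19", "20-29"))]
# Sum the population within each gender and new age group
dat[ , list(population = sum(population)), by = list(gender, age)]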
# gender age population
# 1: H 0-14 20
# 2: H 15-19 15
# 3: H 20-29 25
# 4: M 0-14 10
# 5: M 15-19 15
# 6: M 20-29 25