Using setkey() to summarize data in R

Time: 2015-12-14 18:07:51

Tags: r summary

I'm looking to summarize data in two ways, but in one line of code. First, I want to get the mean/median/SD of a numeric variable by ID. Easy enough. I also want to get the mean/median/SD of that same numeric variable by ID, but only for a subset defined by another variable. For example, I want to get the mean/median/SD of age by group where education is equal to 1.

Here's what I'm working with now:

library(data.table)
DF.datatable <- data.table(DF)
setkey(DF.datatable, group)
new <- DF.datatable[, list(mean = mean(age), median = median(age), sd = sd(age)), by = group]

As you can see, what I'm missing is the second component of the above. The grouped summary produces a new table with only one row per group, so it's critical (and easier) that everything go in one line of code.

Any ideas?

1 answer:

Answer 0 (score: 0)

Try this:

DF.datatable[, .(mean = mean(age), mean_edu1 = mean(ifelse(education == 1, age, NA), na.rm = TRUE)), by = group]
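The same idea extends to all three statistics. Below is a minimal sketch, not necessarily what the asker's data looks like: the toy data frame and the `*_edu1` column names are made up for illustration, and it assumes `education` has no missing values. It subsets `age` inside `j` with `age[education == 1]` rather than using `ifelse()`, which should give the same means while also working for `median()` and `sd()` without `na.rm`.

```r
library(data.table)

# Hypothetical toy data; column names follow the question.
DF <- data.frame(
  group     = c("a", "a", "a", "b", "b", "b"),
  age       = c(20, 30, 40, 50, 60, 70),
  education = c(1, 1, 2, 1, 2, 2)
)

DT <- as.data.table(DF)

# One call: overall stats and stats restricted to education == 1, per group.
res <- DT[, .(
  mean        = mean(age),
  median      = median(age),
  sd          = sd(age),
  mean_edu1   = mean(age[education == 1]),
  median_edu1 = median(age[education == 1]),
  sd_edu1     = sd(age[education == 1])
), by = group]

print(res)
```

Note that `setkey()` is not needed for `by = group` to work, and a group with only one matching row gets `NA` for the subset SD, since `sd()` of a single value is undefined.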