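I want to sum the values of one column, grouped by two other columns. I found out how to do this when grouping by one column, but can't figure out how to do it with two. For example, if I have the following data frame: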
x=c("a","a", "b", "b","c", "c","a","a","b","b","c","c", "a", "a","b","b", "c", "c")
y=c(1:18)
q=c("M","M","M", "M","M","M","W","W","W","W","W","W","F","F","F","F","F","F")
df<-data.frame(x,y,q)
I want to sum the values in column y for each combination of x and q, so that I get a new data frame like this:
x=c("a","b","c","a","b","c","a","b","c")
y=c(3,7,11,15,19,23,27,31,35)
q=c("M","M","M","W","W","W","F","F","F")
d<-data.frame(x,y,q)
Answer 0 (score: 4)
You have several options:
1: base R
aggregate(y~x+q, df, sum)
2: data.table
library(data.table)
setDT(df)[, .(sumy=sum(y)), by = .(x,q)]
# when you want to summarise several columns:
setDT(df)[, lapply(.SD, sum), by = .(x,q)]
3: dplyr
library(dplyr)
df %>% group_by(x,q) %>% summarise(sumy = sum(y))
# when you want to summarise several columns
# (across() needs dplyr >= 1.0.0; it replaces the deprecated summarise_each(funs(sum))):
df %>% group_by(x,q) %>% summarise(across(everything(), sum))
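All of these should give you the same result (though not necessarily in the same row order). For example, the output of the data.table lapply(.SD, sum) variant looks like this: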
   x q  y
1: a M  3
2: b M  7
3: c M 11
4: a W 15
5: b W 19
6: c W 23
7: a F 27
8: b F 31
9: c F 35
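To illustrate the point about row order, here is a minimal base R sketch. It rebuilds the example data, runs the `aggregate()` call from option 1, and then reorders the rows to follow the order in which the groups first appear (M, W, F), as in the data.table output above. The names `res`, `q_order` and `res_sorted` are introduced here only for illustration.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  x = rep(rep(c("a", "b", "c"), each = 2), times = 3),
  y = 1:18,
  q = rep(c("M", "W", "F"), each = 6)
)

# Base R: sum y within each (x, q) combination.
res <- aggregate(y ~ x + q, data = df, FUN = sum)

# aggregate() returns the groups sorted by their values, so q comes out
# as F, M, W rather than in order of first appearance (M, W, F).
# Reorder the rows by first appearance of q, then by x.
q_order <- unique(df$q)
res_sorted <- res[order(match(res$q, q_order), res$x), ]
rownames(res_sorted) <- NULL

print(res_sorted)
```

The sums themselves are identical either way; only the row order differs, so sorting both results on the grouping columns is usually enough when comparing the output of two of the approaches.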