I have the data frame below.
year<-c(2016,2016,2017,2017,2016,2016,2017,2017)
city<-c("NY","NY","NY","NY","WS","WS","WS","WS")
spec<-c("df","df","df","df","vb","vb","vb","vb")
num<-c(45,67,89,90,45,67,89,90)
df<-data.frame(year,city,spec,num)
I would like to know whether it is possible to sum num based on the year, city and spec columns, so that the data goes from this form:
year city spec num
1 2016 NY df 45
2 2016 NY df 67
3 2017 NY df 89
4 2017 NY df 90
5 2016 WS vb 45
6 2016 WS vb 67
7 2017 WS vb 89
8 2017 WS vb 90
to this:
year city spec num
1 2016 NY df 112
2 2017 NY df 179
3 2016 WS vb 112
4 2017 WS vb 179
Answer 0: (score: 1)
One way is to use the sqldf package:
library(sqldf)
sqldf("Select year, city, spec, sum(num) from df
group by year, city, spec order by city")
year city spec sum(num)
1 2016 NY df 112
2 2017 NY df 179
3 2016 WS vb 112
4 2017 WS vb 179
Using dplyr:
library(dplyr)
df %>%
  group_by(year, city, spec) %>%
  summarise(SumNum = sum(num)) %>%
  arrange(city)
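For reference, here is a self-contained sketch of the dplyr pipeline above that rebuilds the question's data frame first. The .groups = "drop" argument and the variable name result are additions not in the original answer; the argument assumes dplyr 1.0.0 or later and simply returns an ungrouped tibble.

```r
library(dplyr)

# Rebuild the data frame from the question.
df <- data.frame(
  year = c(2016, 2016, 2017, 2017, 2016, 2016, 2017, 2017),
  city = c("NY", "NY", "NY", "NY", "WS", "WS", "WS", "WS"),
  spec = c("df", "df", "df", "df", "vb", "vb", "vb", "vb"),
  num  = c(45, 67, 89, 90, 45, 67, 89, 90)
)

# Sum num within each year/city/spec combination, then sort by city.
# .groups = "drop" (assumption: dplyr >= 1.0.0) returns an ungrouped tibble.
result <- df %>%
  group_by(year, city, spec) %>%
  summarise(SumNum = sum(num), .groups = "drop") %>%
  arrange(city)

print(result)
```

Dropping the grouping is optional here, but it avoids surprises if result is passed on to further dplyr verbs.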
Answer 1: (score: 1)
This is probably a duplicate, but here is an answer:
library(tidyverse)
df %>%
group_by(year,city,spec) %>%
summarise(sum = sum(num))
...which results in...
# A tibble: 4 x 4
# Groups: year, city [4]
year city spec sum
<dbl> <fct> <fct> <dbl>
1 2016 NY df 112
2 2016 WS vb 112
3 2017 NY df 179
4 2017 WS vb 179
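If neither package is available, the same grouped sum can probably be sketched in base R with aggregate(). This is an alternative not mentioned in the answers above, and the variable name totals is chosen purely for illustration.

```r
# Rebuild the data frame from the question.
df <- data.frame(
  year = c(2016, 2016, 2017, 2017, 2016, 2016, 2017, 2017),
  city = c("NY", "NY", "NY", "NY", "WS", "WS", "WS", "WS"),
  spec = c("df", "df", "df", "df", "vb", "vb", "vb", "vb"),
  num  = c(45, 67, 89, 90, 45, 67, 89, 90)
)

# Formula interface: sum num for each combination of year, city and spec.
# The summed column keeps the name num.
totals <- aggregate(num ~ year + city + spec, data = df, FUN = sum)

print(totals)
```

The row order may differ from the sqldf and dplyr output, so sort with order() if a particular ordering matters.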