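Summarize the values of one column based on the values of two or more other columns of the same data frame

Time: 2019-06-15 01:23:41

Tags: r

I have the data frame below.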

year<-c(2016,2016,2017,2017,2016,2016,2017,2017)
city<-c("NY","NY","NY","NY","WS","WS","WS","WS")
spec<-c("df","df","df","df","vb","vb","vb","vb")
num<-c(45,67,89,90,45,67,89,90)
df<-data.frame(year,city,spec,num)

I would like to know whether it is possible to sum num based on the year, city and spec columns, so that the data goes from this form:

  year city spec num
1 2016   NY   df  45
2 2016   NY   df  67
3 2017   NY   df  89
4 2017   NY   df  90
5 2016   WS   vb  45
6 2016   WS   vb  67
7 2017   WS   vb  89
8 2017   WS   vb  90

to this:

  year city spec num
1 2016   NY   df 112
2 2017   NY   df 179
3 2016   WS   vb 112
4 2017   WS   vb 179

2 answers:

Answer 0 (score: 1)

One approach is to use the sqldf package:

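library(sqldf)  # assumes the sqldf package is installed
# Sum num within each year/city/spec group, ordered by city
sqldf("Select year, city, spec, sum(num) from df
      group by year, city, spec order by city")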

  year city spec sum(num)
1 2016   NY   df      112
2 2017   NY   df      179
3 2016   WS   vb      112
4 2017   WS   vb      179

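Using dplyr:

library(dplyr)  # provides %>%, group_by(), summarise() and arrange()

# Sum num within each year/city/spec group, then order the rows by city
df %>%
  group_by(year, city, spec) %>%
  summarise(SumNum = sum(num)) %>%
  arrange(city)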
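
If installing a package is not an option, the same grouped sum can probably be done in base R as well. The following is a minimal sketch using aggregate() with a formula; the variable name result is only illustrative and is not part of the original answer.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  year = c(2016, 2016, 2017, 2017, 2016, 2016, 2017, 2017),
  city = c("NY", "NY", "NY", "NY", "WS", "WS", "WS", "WS"),
  spec = c("df", "df", "df", "df", "vb", "vb", "vb", "vb"),
  num  = c(45, 67, 89, 90, 45, 67, 89, 90)
)

# Sum num within each year/city/spec combination (base R, no packages).
result <- aggregate(num ~ year + city + spec, data = df, FUN = sum)

# Order the rows by city, then year, to match the layout asked for.
result <- result[order(result$city, result$year), ]
print(result)
```

The formula puts the column to be summed on the left of ~ and the grouping columns on the right, so the summed column keeps the name num.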

Answer 1 (score: 1)

Possibly a duplicate, but here is an answer:

library(tidyverse)

df %>%
  group_by(year,city,spec) %>%
  summarise(sum = sum(num))

...which results in...

# A tibble: 4 x 4
# Groups:   year, city [4]
   year city  spec    sum
  <dbl> <fct> <fct> <dbl>
1  2016 NY    df      112
2  2016 WS    vb      112
3  2017 NY    df      179
4  2017 WS    vb      179
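
Note that the tibble above is still grouped and is ordered by year first, while the question lists the rows by city. A minimal sketch of one way to get a plain, city-ordered result, assuming dplyr is installed (the name result is only illustrative):

```r
library(dplyr)

# Rebuild the example data frame from the question.
df <- data.frame(
  year = c(2016, 2016, 2017, 2017, 2016, 2016, 2017, 2017),
  city = c("NY", "NY", "NY", "NY", "WS", "WS", "WS", "WS"),
  spec = c("df", "df", "df", "df", "vb", "vb", "vb", "vb"),
  num  = c(45, 67, 89, 90, 45, 67, 89, 90)
)

result <- df %>%
  group_by(year, city, spec) %>%
  summarise(num = sum(num)) %>%  # keep the column name num, as in the question
  ungroup() %>%                  # drop the leftover grouping
  arrange(city, year)            # NY rows first, then WS, each by year

print(result)
```

Calling ungroup() before arrange() means later operations on result are not silently applied per group.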