Group data by year and sum the values

Asked: 2019-07-21 14:18:40

Tags: r

sum_Coca<-DT[, list(Tot = sum(Cocacola$StockPrice)),by = year(date)]

transform(sum_Coca, new.col=c(NA,sum_Coca$StockPrice[-1]/sum_Coca$StockPrice[-nrow(sum_Coca)]-1))

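I am trying to group the data by year and sum the values, but instead of a per-year sum I get the same overall total for every year in the grouping.

I expect the result to be:

Row Labels  Sum of StockPrice
1970    217.9027648
1971    227.2976594
1972    276.40473
1973    229.0205211

But the result I get is:

Row Labels  Sum of StockPrice
1970    950.625672
1971    950.625672
1972    950.625672
1973    950.625672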

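2 answers:

Answer 0 (score: 1)

The problem is that the code extracts the whole column instead of only the values of the column that belong to each group. It is not clear whether the 'Cocacola' dataset is the same as 'DT'. If it is, remove the 'Cocacola$' prefix so that StockPrice is evaluated within each group: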

library(data.table)
DT[, list(Tot = sum(StockPrice)),by = .(Year = year(date))]
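To make the difference concrete, here is a minimal sketch on made-up data. The table DT and its values below are assumptions for illustration only, not the asker's data. Referring to the column with a `$` prefix inside `j` ignores the grouping, whereas the bare column name is evaluated once per group.

```r
library(data.table)

# Hypothetical data: two observations per year (values are made up)
DT <- data.table(
  date = as.Date(c("1970-01-15", "1970-06-15", "1971-01-15", "1971-06-15")),
  StockPrice = c(1, 2, 3, 4)
)

# Buggy: DT$StockPrice is the whole column, so every group gets the grand total
wrong <- DT[, list(Tot = sum(DT$StockPrice)), by = .(Year = year(date))]

# Fixed: the bare column name refers only to the rows of the current group
right <- DT[, list(Tot = sum(StockPrice)), by = .(Year = year(date))]

print(wrong)  # Tot is 10 for both years
print(right)  # Tot is 3 for 1970 and 7 for 1971
```

This mirrors the symptom in the question: the same total repeated for every year is the signature of summing the full column inside each group.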

Answer 1 (score: 1)

A solution using dplyr and lubridate looks like this:

library(dplyr)
library(lubridate)

df %>% 
  group_by(year(Date)) %>% 
  summarise(total_stock = sum(StockPrice))

## A tibble: 4 x 2
# `year(Date)` total_stock
#         <dbl>       <dbl>
#1         1970        218.
#2         1971        227.
#3         1972        276.
#4         1973        229.

Or you can clean up the label by creating a new Year field first:

df %>% 
  mutate(Year = year(Date)) %>% 
  group_by(Year) %>% 
  summarise(total_stock = sum(StockPrice))

## A tibble: 4 x 2
#   Year total_stock
#  <dbl>       <dbl>
#1  1970        218.
#2  1971        227.
#3  1972        276.
#4  1973        229.
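The second line of the question's code appears to aim at a year-over-year relative change of the yearly totals. Assuming that is the intent, a minimal sketch with dplyr's lag() could look like the following. The data frame df, its values, and the column name change are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical data: two observations per year (values are made up)
df <- data.frame(
  Date = as.Date(c("1970-01-15", "1970-06-15", "1971-01-15", "1971-06-15")),
  StockPrice = c(1, 2, 3, 3)
)

yearly <- df %>%
  mutate(Year = year(Date)) %>%
  group_by(Year) %>%
  summarise(total_stock = sum(StockPrice)) %>%
  # Relative change versus the previous year; the first year has no predecessor
  mutate(change = total_stock / lag(total_stock) - 1)

print(yearly)  # change is NA for 1970 and 1 (i.e. +100%) for 1971
```

Note that the change is computed on the summarised column (total_stock here), not on the original StockPrice column, which no longer exists after summarise().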