How to stack several columns and sum them?

Time: 2019-06-25 01:53:03

Tags: r addition

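I want to combine 15 columns that are made up of the same 3 columns repeated (so there are 5 copies of each). My data looks like this (for simplicity, this example has only 3 copies):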

   date     sku1  prod1  tot1  sku2  prod2  tot2  sku3  prod3  tot3
01/02/2019  100     a    100
01/02/2019  100     a    200    101    b     50
02/02/2019  101     b    100
02/02/2019  101     b     50    102    c    100   100     a     50
02/02/2019  102     c     50

Does anyone know how to do this? Thanks a lot.

2 answers:

Answer 0 (score: 0)

One option is `melt` from data.table, which accepts multiple measure `patterns`.

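library(data.table)

# Melt the prod* and tot* columns in parallel into long format, dropping
# the empty (NA) slots, then sum the totals per date and product.
# Note: the sku columns are not melted, so `sku` is just sku1 of the row a
# product came from. For products found in prod2/prod3 that is not the
# product's own SKU (see b -> 100 and a -> 101 in the output below).
melt(setDT(df1), measure = patterns("^prod", "^tot"), na.rm = TRUE,
     value.name = c("all_prod", "total"))[
  , c(list(sku = first(sku1)), lapply(.SD, sum, na.rm = TRUE)),
  by = .(date, all_prod), .SDcols = "total"][order(date)]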
#        date all_prod sku total
#1: 01/02/2019        a 100   300
#2: 01/02/2019        b 100    50
#3: 02/02/2019        b 101   150
#4: 02/02/2019        c 102   150
#5: 02/02/2019        a 101    50

Data

df1 <- structure(list(date = structure(c(1L, 1L, 2L, 2L, 2L), .Label = 
 c("01/02/2019", "02/02/2019"), class = "factor"), sku1 = c(100, 100, 101, 101, 
 102), prod1 = structure(c(1L, 1L, 2L, 2L, 3L), .Label = c("a", 
 "b", "c"), class = "factor"), tot1 = c(100, 200, 100, 50, 50), 
 sku2 = c(NA, 101, NA, 102, NA), prod2 = structure(c(NA, 1L, 
 NA, 2L, NA), .Label = c("b", "c"), class = "factor"), tot2 = c(NA, 
 50, NA, 100, NA), sku3 = c(NA, NA, NA, 100, NA), prod3 = 
 structure(c(NA, NA, NA, 1L, NA), .Label = "a", class = "factor"), tot3 = c(NA, 
 NA, NA, 50, NA)), row.names = c(NA, -5L), class = "data.frame")
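
If each product should carry its own SKU, one possible variation on the `melt` approach is to melt the `sku` columns together with `prod` and `tot`. The following is only a sketch under assumptions: `dt` is a hand-built copy of the sample data with character product columns, and the names `dt`, `long`, and `res` are illustrative, not from the original answer.

```r
library(data.table)

# Hypothetical copy of the sample data (character instead of factor columns)
dt <- data.table(
  date  = c("01/02/2019", "01/02/2019", "02/02/2019", "02/02/2019", "02/02/2019"),
  sku1  = c(100, 100, 101, 101, 102),
  prod1 = c("a", "a", "b", "b", "c"),
  tot1  = c(100, 200, 100, 50, 50),
  sku2  = c(NA, 101, NA, 102, NA),
  prod2 = c(NA, "b", NA, "c", NA),
  tot2  = c(NA, 50, NA, 100, NA),
  sku3  = c(NA, NA, NA, 100, NA),
  prod3 = c(NA, NA, NA, "a", NA),
  tot3  = c(NA, NA, NA, 50, NA)
)

# Melt all three column families in parallel so that each sku stays
# attached to its own prod and tot; empty slots are dropped.
long <- melt(dt,
             measure.vars = patterns("^sku", "^prod", "^tot"),
             value.name   = c("sku", "prod", "total"),
             na.rm        = TRUE)

# One row per date and product, with the totals summed.
res <- long[, .(sku = first(sku), total = sum(total)),
            by = .(date, prod)][order(date, prod)]
print(res)
```

With this variant, product b is reported with SKU 101 and product a with SKU 100 on both dates, which agrees with the result of the dplyr/tidyr answer.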

Answer 1 (score: 0)

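Using dplyr and tidyr, we can `gather` the data into long format, remove the digits from the column names, `spread` it back into wide format, `group_by` the values of `date` and `prod`, and take the `sum` of the `tot` values in each group.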

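library(dplyr)
library(tidyr)

# `df` is the wide data frame from the question
df %>%
  # long format: one row per (date, original column name, value)
  gather(key, value, -date, na.rm = TRUE) %>%
  # strip the trailing digits: sku1 -> sku, prod2 -> prod, tot3 -> tot
  mutate(key = sub("\\d+$", "", key)) %>%
  # number the values within each key so matching sku/prod/tot line up
  group_by(key) %>%
  mutate(row = row_number()) %>%
  # back to wide format: a single sku, prod and tot column
  spread(key, value) %>%
  # gather() coerced the mixed columns to character; restore the numbers
  mutate_at(vars(sku, tot), as.numeric) %>%
  group_by(date, prod) %>%
  summarise(sku = sku[1L],
            tot = sum(tot))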

#  date       prod    sku   tot
#  <fct>      <chr> <dbl> <dbl>
#1 01/02/2019 a       100   300
#2 01/02/2019 b       101    50
#3 02/02/2019 a       100    50
#4 02/02/2019 b       101   150
#5 02/02/2019 c       102   150
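
Since tidyr 1.0.0, `gather` and `spread` are superseded by `pivot_longer` and `pivot_wider`. As a rough sketch of the same gather, rename, spread, and summarise steps in one reshape (not the answerer's code), the `.value` sentinel in `names_to` can split `sku1`, `prod1`, `tot1` and so on into `sku`, `prod`, and `tot` columns directly. The data frame below is a hand-built copy of the sample data with character product columns, and the column name `slot` is made up for illustration. The `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical copy of the sample data (character instead of factor columns)
df <- data.frame(
  date  = c("01/02/2019", "01/02/2019", "02/02/2019", "02/02/2019", "02/02/2019"),
  sku1  = c(100, 100, 101, 101, 102),
  prod1 = c("a", "a", "b", "b", "c"),
  tot1  = c(100, 200, 100, 50, 50),
  sku2  = c(NA, 101, NA, 102, NA),
  prod2 = c(NA, "b", NA, "c", NA),
  tot2  = c(NA, 50, NA, 100, NA),
  sku3  = c(NA, NA, NA, 100, NA),
  prod3 = c(NA, NA, NA, "a", NA),
  tot3  = c(NA, NA, NA, 50, NA),
  stringsAsFactors = FALSE
)

res <- df %>%
  # ".value" turns the letter part of each name (sku/prod/tot) into an
  # output column; the digit part goes into the illustrative `slot` column.
  pivot_longer(-date,
               names_to       = c(".value", "slot"),
               names_pattern  = "([a-z]+)(\\d+)",
               values_drop_na = TRUE) %>%
  group_by(date, prod) %>%
  summarise(sku = first(sku), tot = sum(tot), .groups = "drop")

print(res)
```

Because the digits are captured by `\\d+` as a whole, this pattern should also cope with ten or more copies (`sku10`, `prod10`, ...), although that case is not exercised by the sample data.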