Summing by year and ID in R

Time: 2018-07-30 09:30:47

Tags: r sum aggregate

I want to sum my costs by year and ID.

Here is some sample data:

   ID <- c(1,1,1,1,2,2,3,3)
   year <- c(1,2,2,2,3,3,3,3)
   cost <- c(1,1,2,3,2,2,2,2)

   data = cbind(ID, year, cost)

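The sums should be stored in additional columns, i.e. costs_year1, costs_year2 and costs_year3. I would then delete the other columns and remove the duplicate IDs, so that I end up with a wide data frame.

Any suggestions on how to do this?

2 Answers:

Answer 0: (score: 2)

Using tidyverse: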

library(tidyverse)
ID <- c(1,1,1,1,2,2,3,3)
year <- c(1,2,2,2,3,3,3,3)
cost <- c(1,1,2,3,2,2,2,2)

data = data.frame(ID, year, cost)

data %>%
  mutate(year = paste0("costs_year",year)) %>%
  group_by(year,ID) %>%
  summarize_at("cost",sum) %>%
  spread(year,cost)

# # A tibble: 3 x 4
#      ID costs_year1 costs_year2 costs_year3
# * <dbl>       <dbl>       <dbl>       <dbl>
# 1     1           1           6          NA
# 2     2          NA          NA           4
# 3     3          NA          NA           4
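As a side note, spread() and summarize_at() are superseded in current tidyr/dplyr releases. Below is a minimal sketch of the same computation with pivot_wider(), assuming tidyr >= 1.1.0 and dplyr >= 1.0.0 are installed. The result name wide and the choice to fill missing combinations with 0 instead of NA are my own assumptions, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question, stored as a data frame
data <- data.frame(
  ID   = c(1, 1, 1, 1, 2, 2, 3, 3),
  year = c(1, 2, 2, 2, 3, 3, 3, 3),
  cost = c(1, 1, 2, 3, 2, 2, 2, 2)
)

wide <- data %>%
  group_by(ID, year) %>%
  # one row per ID/year pair holding the summed cost
  summarise(cost = sum(cost), .groups = "drop") %>%
  # one column per year; absent ID/year pairs become 0 (assumption)
  pivot_wider(
    names_from   = year,
    values_from  = cost,
    names_prefix = "costs_year",
    values_fill  = 0
  )

print(wide)
```

Dropping values_fill = 0 should give NA for the missing combinations, as in the output above.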

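%>% is called the pipe operator. It comes from the package magrittr, and you can use it after attaching tidyverse with library(tidyverse), for example.

With the pipe, the output of the previous instruction is used as the first argument of the next call, but an example will teach you better. Here is how to make it work without pipes: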

x <- mutate(data, year = paste0("costs_year",year))
x <- group_by(x,year,ID)
x <- summarize_at(x,"cost",sum)
spread(x,year,cost)
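To see the "first argument" rule in isolation, here is a small sketch on a plain numeric vector, assuming magrittr is installed. The variable names piped, nested and label are only illustrative.

```r
library(magrittr)

x <- c(1, 4, 9)

# Piped form: x is passed as the first argument of sqrt(),
# and that result as the first argument of sum()
piped <- x %>% sqrt() %>% sum()

# Equivalent nested form without the pipe
nested <- sum(sqrt(x))

identical(piped, nested)

# Extra arguments follow the piped value: this is paste0("costs_year", 1)
label <- "costs_year" %>% paste0(1)
label
```

Both forms compute the same value; the pipe only changes the order in which the steps are read.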

For more information: What does %>% mean in R

Answer 1: (score: 0)

Using dcast() from the reshape2 package (the definition of df1 is not shown in this answer; it is presumably the question's sample data stored as a data frame):

library(reshape2)
df.wide <- dcast(df1, ID ~ year, sum)
names(df.wide) <- c("ID", paste0("costs.year.", 1:3))

Or in one step:

df.wide <- setNames(dcast(df1, ID ~ year, sum), c("ID", paste0("costs.year.", 1:3)))

This yields:

> df.wide
  ID costs.year.1 costs.year.2 costs.year.3
1  1            1            6            0
2  2            0            0            4
3  3            0            0            4
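For readers without reshape2, the same table can be cross-checked in base R. This is a swapped-in technique (xtabs()), not what the answer uses, and it assumes df1 is the question's sample data as a data frame. The helper name m is hypothetical.

```r
# Assumption: df1 holds the question's sample data
df1 <- data.frame(
  ID   = c(1, 1, 1, 1, 2, 2, 3, 3),
  year = c(1, 2, 2, 2, 3, 3, 3, 3),
  cost = c(1, 1, 2, 3, 2, 2, 2, 2)
)

# xtabs() sums cost within each ID/year cell; empty cells are 0
tab <- xtabs(cost ~ ID + year, data = df1)

# Convert the contingency table to a wide data frame
m <- as.data.frame.matrix(tab)
names(m) <- paste0("costs.year.", names(m))

df.wide <- cbind(ID = as.numeric(rownames(m)), m)
rownames(df.wide) <- NULL

print(df.wide)
```

Like the dcast() call with sum, this fills ID/year combinations that never occur with 0 rather than NA.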