I have the following data.frame:
x <- data.frame(product = c(1, 1, 1, 2, 2, 2),
                value_day = c(0, 0, 150.23, 110.98, 18.65, 0),
                my_date = c("2019-01-01", "2019-01-02", "2019-01-03",
                            "2019-01-01", "2019-01-02", "2019-01-03"))
My analysis so far:
library(tidyverse)
x %>%
  group_by(product) %>%
  mutate(total = sum(value_day)) %>%
  arrange(my_date) %>%
  slice(c(1, n())) %>%
  spread(key = my_date, value = total) %>%
  filter(value_day > 0)
# A tibble: 2 x 4
# Groups: product [2]
  product value_day `2019-01-01` `2019-01-03`
    <dbl>     <dbl>        <dbl>        <dbl>
1       1      150.           NA         150.
2       2      111.         130.           NA
However, I want the dates to appear as values in columns, like this:
  product  total first_sell  last_sell
1       1 150.23 2019-01-03 2019-01-03
2       2 129.63 2019-01-01 2019-01-02
Answer 0 (score: 2)
You can do the following:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
x <- tibble(product = c(1, 1, 1, 2, 2, 2),
            value_day = c(0, 0, 150.23, 110.98, 18.65, 0),
            my_date = c("2019-01-01", "2019-01-02", "2019-01-03",
                        "2019-01-01", "2019-01-02", "2019-01-03"))
x %>%
  group_by(product) %>%
  summarise(total = sum(value_day),
            first_sell = first(my_date),
            last_sell = last(my_date))
#> # A tibble: 2 x 4
#>   product total first_sell last_sell
#>     <dbl> <dbl> <chr>      <chr>
#> 1       1  150. 2019-01-01 2019-01-03
#> 2       2  130. 2019-01-01 2019-01-03
Created on 2019-05-19 by the reprex package (v0.3.0)
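Note that the expected output in the question lists 2019-01-03 as first_sell for product 1 and 2019-01-02 as last_sell for product 2, which suggests that only days with a non-zero value_day should count as a sale. That reading is an assumption, not something the question states outright. A minimal sketch under that assumption filters out the zero-value days before summarising, and converts my_date to Date so that min() and max() do not depend on row order (the name `result` is only for illustration):

```r
library(dplyr)

x <- tibble(
  product   = c(1, 1, 1, 2, 2, 2),
  value_day = c(0, 0, 150.23, 110.98, 18.65, 0),
  my_date   = c("2019-01-01", "2019-01-02", "2019-01-03",
                "2019-01-01", "2019-01-02", "2019-01-03")
)

result <- x %>%
  mutate(my_date = as.Date(my_date)) %>%  # compare real dates, not strings
  filter(value_day > 0) %>%               # assumption: a "sell" is a day with value_day > 0
  group_by(product) %>%
  summarise(total      = sum(value_day),  # zero-value days do not change the sum
            first_sell = min(my_date),    # earliest day with a sale
            last_sell  = max(my_date))    # latest day with a sale

result
```

A product with no non-zero days would be dropped entirely by the filter, so this sketch only fits if that is acceptable.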
Answer 1 (score: 1)
library(dplyr)
group_by(x, product) %>%
  mutate(total = sum(value_day),
         first_sell = first(my_date),
         last_sell = last(my_date)) %>%
  select(-value_day, -my_date) %>%
  distinct()
## A tibble: 2 x 4
## Groups: product [2]
##   product total first_sell last_sell
##     <dbl> <dbl> <fct>      <fct>
## 1       1  150. 2019-01-01 2019-01-03
## 2       2  130. 2019-01-01 2019-01-03
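The printed result above is still grouped by product and holds the dates as factors, because the question's data.frame was built on an R version where strings default to factors. If a plain, ungrouped table with character dates is preferred, one possible variant of the same mutate-then-distinct approach is sketched below. Passing stringsAsFactors = FALSE and calling ungroup() are additions for illustration, not part of the original answer:

```r
library(dplyr)

# stringsAsFactors = FALSE keeps my_date as character on older R versions too
x <- data.frame(product = c(1, 1, 1, 2, 2, 2),
                value_day = c(0, 0, 150.23, 110.98, 18.65, 0),
                my_date = c("2019-01-01", "2019-01-02", "2019-01-03",
                            "2019-01-01", "2019-01-02", "2019-01-03"),
                stringsAsFactors = FALSE)

result <- x %>%
  group_by(product) %>%
  mutate(total = sum(value_day),
         first_sell = first(my_date),
         last_sell = last(my_date)) %>%
  ungroup() %>%                                       # drop the grouping before tidying up
  select(product, total, first_sell, last_sell) %>%
  distinct()                                          # one row per product

result
```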