My dataset looks like this:
time type amount
1 2017/1/1 0:00 income 729.64
2 2017/1/1 0:05 income 1465.15
3 2017/1/1 0:10 outcome 1456.07
4 2017/1/1 0:15 outcome 1764.28
...
289 2017/1/2 0:00 income 719.64
290 2017/1/2 0:05 income 165.15
291 2017/1/2 0:10 income 1006.07
292 2017/1/2 0:15 outcome 104.28
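I want to calculate the net income by date: if income exceeds outcome for a day, the net income is positive, otherwise it is negative. The result should look like this: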
date netincome
1 2017/1/1 -729.64
2 2017/1/2 1465.15
3 2017/1/3 1456.07
4 2017/1/4 1764.28
...
How can I get this efficiently?
Answer 0 (score: 2)
Sample data:
df <- data.frame(time=c("2017/1/1 0:00", "2017/1/1 0:05", "2017/1/1 0:10","2017/1/2 0:00", "2017/1/2 0:05", "2017/1/2 0:10"),
type=c("income", "income", "outcome", "income", "outcome", "outcome"),
amount=c(729.64, 1465.15, 1456.07, 729.64, 729.64, 1456.07))
Convert time to a date, and make the outcome amounts negative:
df$date <- lubridate::date(df$time)
df$amount[df$type=="outcome"] <- df$amount[df$type=="outcome"]*-1
Use dplyr to summarise the data (the sum of amount for each date):
library(dplyr)
output <- df %>% group_by(date) %>% summarise(netincome=sum(amount))
Result:
output
# A tibble: 2 x 2
date netincome
<chr> <dbl>
1 2017/1/1 738.72
2 2017/1/2 -1456.07
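The same idea (sign the amounts, then sum per day) can also be sketched in base R without any packages. This is only an illustration, not part of the original answer: the `signed` column name is made up here, and the date parsing assumes every `time` string starts with a `year/month/day` part as in the sample data.

```r
# Sample data in the same shape as the answer above
df <- data.frame(
  time   = c("2017/1/1 0:00", "2017/1/1 0:05", "2017/1/1 0:10",
             "2017/1/2 0:00", "2017/1/2 0:05", "2017/1/2 0:10"),
  type   = c("income", "income", "outcome", "income", "outcome", "outcome"),
  amount = c(729.64, 1465.15, 1456.07, 729.64, 729.64, 1456.07),
  stringsAsFactors = FALSE
)

# Parse only the leading date part of each timestamp (assumed format)
df$date <- as.Date(df$time, format = "%Y/%m/%d")

# Hypothetical helper column: income stays positive, outcome becomes negative
df$signed <- ifelse(df$type == "income", df$amount, -df$amount)

# Sum the signed amounts for each date
output <- aggregate(signed ~ date, data = df, FUN = sum)
names(output)[2] <- "netincome"
output
```

Keeping the signed values in a separate column leaves the original `amount` untouched, which may be convenient if the raw data is needed again later.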
Answer 1 (score: 2)
Another solution could be:
library(tidyverse)
library(lubridate)
df %>%
spread(type, amount) %>%
group_by(date = date(time)) %>%
summarise(netincome = sum(income, na.rm = TRUE) - sum(outcome, na.rm = TRUE))
# # A tibble: 2 x 2
# date netincome
# <date> <dbl>
# 1 2017-01-01 739
# 2 2017-01-02 -1456
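In tidyr 1.0.0 and later, `spread()` is superseded by `pivot_wider()`. A rough equivalent of the pipeline above might look like the following sketch. It assumes dplyr and tidyr are installed, that each `time` value occurs only once, and it swaps `lubridate::date()` for base `as.Date()` with an assumed `year/month/day` format.

```r
library(dplyr)
library(tidyr)

# Sample data in the same shape as the first answer
df <- data.frame(
  time   = c("2017/1/1 0:00", "2017/1/1 0:05", "2017/1/1 0:10",
             "2017/1/2 0:00", "2017/1/2 0:05", "2017/1/2 0:10"),
  type   = c("income", "income", "outcome", "income", "outcome", "outcome"),
  amount = c(729.64, 1465.15, 1456.07, 729.64, 729.64, 1456.07),
  stringsAsFactors = FALSE
)

output <- df %>%
  # One column per type; cells with no matching row become NA
  pivot_wider(names_from = type, values_from = amount) %>%
  # Group by the date part of the timestamp (assumed format)
  group_by(date = as.Date(time, format = "%Y/%m/%d")) %>%
  # Net income = total income minus total outcome, ignoring the NAs
  summarise(netincome = sum(income, na.rm = TRUE) - sum(outcome, na.rm = TRUE))

output
```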