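I have a data frame that tracks the balances of some loans. Each time a payment ("Amount") is made against a loan, the new balance of that property's loan is shown in the "Balance" column.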
df = data.frame(Date = c("2015-03-01", "2015-05-01", "2016-07-02", "2017-11-24", "2017-12-15"),
Property = c("1 Main St", "1 Main St", "1 Main St", "5 Main St", "1 Main St"),
Amount = c(50000, -10000, -5000, 75000, -4000),
Balance = c(50000, 40000, 35000, 75000, 31000)
)
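As you can see, the dates are fairly spread out, and most months have no transactions at all. I would like to produce a data frame that has the balance of each property at the start of every month, regardless of whether there was a transaction that month. Something like this: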
Month = c("March 2015", "April 2015", "May 2015", "June 2015"),
Property = c("1 Main St", "1 Main St", "1 Main St", "1 Main St"),
Balance = c(50000, 50000, 40000, 40000)
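It also needs to pick up the last transaction of the month when a property has more than one transaction in a given month. Any ideas on how to approach this?
Answer 0 (score: 0)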
First, make sure your Date field is of type Date. Here is the call I used to set up the data:
df = data.frame(Date = as.Date(c("2015-03-01", "2015-05-01", "2016-07-02", "2017-11-24", "2017-12-15"), "%Y-%m-%d"),
Property = c("1 Main St", "1 Main St", "1 Main St", "5 Main St", "1 Main St"),
Amount = c(50000, -10000, -5000, 75000, -4000),
Balance = c(50000, 40000, 35000, 75000, 31000),
stringsAsFactors = FALSE
)
Note that I also added the stringsAsFactors = FALSE argument to the data.frame call.
I then used the following code, which may (?) answer your question:
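If your data frame already exists and its Date column is still character, you can convert it in place rather than rebuilding the whole thing. A minimal sketch, where `loans` is a hypothetical stand-in for your own data frame:

```r
# Hypothetical stand-in for a data frame whose Date column is still character
loans <- data.frame(
  Date = c("2015-03-01", "2015-05-01"),
  Balance = c(50000, 40000),
  stringsAsFactors = FALSE
)

class(loans$Date)   # "character" before conversion

# Convert the ISO-formatted strings to Date objects
loans$Date <- as.Date(loans$Date, format = "%Y-%m-%d")

class(loans$Date)   # "Date" after conversion
```

Once the column is a real Date, functions such as seq.Date and the lubridate helpers can do date arithmetic on it.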
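library(tidyr)
library(dplyr)
library(lubridate)
# Sort by date so that first() and last() return the earliest and latest transaction
df <- arrange(df, Date)
from <- first(df$Date)
to <- last(df$Date)
new_df <- df %>%
# Add a row for every calendar day between the first and last transaction
complete(Date = seq.Date(from, to, "day")) %>%
# Carry the most recent Property, Amount and Balance forward into the new rows
fill(Property:Balance) %>%
group_by(year = year(Date), month = month(Date, label = TRUE), Property) %>%
# Keep the last balance seen in each month for each property
summarise(Balance = last(Balance))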
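One caveat with the code above: it fills Property forward along a single daily timeline, so a property only gets rows for the days on which it was the most recent one to transact. If you want every property to have a row for every month, a variation like the one below might be closer to what you are after. This is only a sketch under my own assumptions: each month reports the balance after that month's last transaction (which matches the March and May rows in your expected output), months before a property's first transaction are dropped, and the names `month_starts` and `monthly` are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(lubridate)

df <- data.frame(
  Date = as.Date(c("2015-03-01", "2015-05-01", "2016-07-02", "2017-11-24", "2017-12-15")),
  Property = c("1 Main St", "1 Main St", "1 Main St", "5 Main St", "1 Main St"),
  Amount = c(50000, -10000, -5000, 75000, -4000),
  Balance = c(50000, 40000, 35000, 75000, 31000),
  stringsAsFactors = FALSE
)

# First day of every month spanned by the data
month_starts <- seq(floor_date(min(df$Date), "month"),
                    floor_date(max(df$Date), "month"),
                    by = "month")

monthly <- df %>%
  arrange(Property, Date) %>%
  # Snap each transaction to the first day of its month
  mutate(Month = floor_date(Date, "month")) %>%
  group_by(Property, Month) %>%
  # If a property has several transactions in a month, keep the last balance
  summarise(Balance = last(Balance), .groups = "drop") %>%
  # Add the months with no transactions, for every property
  complete(Property, Month = month_starts) %>%
  group_by(Property) %>%
  arrange(Month, .by_group = TRUE) %>%
  # Carry each property's balance forward into its empty months
  fill(Balance) %>%
  ungroup() %>%
  # Drop months before a property's first transaction
  filter(!is.na(Balance))
```

If you need labels such as "March 2015", something like format(monthly$Month, "%B %Y") should produce them, although the month names depend on your locale.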