I have the following data frames, for example:
dummy_ts_1 <- data.frame(Date=as.Date(c("1990-03-31","1990-06-30","1990-09-30","1990-12-31","1991-03-31","1991-06-30","1991-09-30","1991-12-31","1992-03-31","1992-06-30")),
GDP=c(100,200,300,400,500,600,700,800,900,1000))
dummy_ts_2 <- data.frame(Date=as.Date(c("1980-01-31","1980-04-30","1980-07-31","1980-10-31","1981-01-31","1981-04-30","1981-07-31","1981-10-31","1982-01-31","1982-04-30")),
GDP=c(150,160,250,247,300,400,500,600,700,1000))
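I need to fill in the preceding months within the same quarter (see data.table::quarter(dummy_ts_1$Date)), with each quarterly value split evenly across its three months, so that the desired output looks like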
> dummy_ts_1
Date GDP
1990-01-31 33.33333
1990-02-28 33.33333
1990-03-31 33.33333
1990-04-30 66.66667
1990-05-31 66.66667
1990-06-30 66.66667
1990-07-31 100
1990-08-31 100
1990-09-30 100
1990-10-31 133.3333
1990-11-30 133.3333
1990-12-31 133.3333
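Is there a simple way to achieve the desired output? Thanks for any suggestions.
Answer 0: (score: 0)
Not sure this is a great way to do it, but here is one solution: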
library(lubridate) #provides quarter(), ceiling_date() and days()
#dummy DF with date and GDP
dummy_ts_1 <- data.frame(Date=as.Date(c("1990-03-31","1990-06-30","1990-09-30","1990-12-31")), GDP=c(100,200,300,400))
dummy_ts_1$new_gdp <- dummy_ts_1$GDP / 3 #store monthly GDP values
dummy_ts_1$quarter <- quarter(dummy_ts_1$Date) #store quarter
#create month-end date sequence
start <- as.Date("1990-01-01")
end <- as.Date("1990-12-01")
months <- seq.Date(start, end, by = 'month')
#ceiling_date() is vectorized, so no loop is needed: the last day of each month
#is the first day of the following month minus one day
new_months <- ceiling_date(months, "month") - days(1)
#store in new data frame
new_df <- data.frame(months = new_months)
new_df$quarter <- quarter(new_df$months)
#merge with dummy_ts_1 to get final results
#(joining on quarter alone assumes the data cover a single year)
new_df <- merge(new_df, subset(dummy_ts_1, select = c('quarter', 'new_gdp')), by = 'quarter', all.x = TRUE)
which produces:
> new_df
quarter months new_gdp
1 1 1990-01-31 33.33333
2 1 1990-02-28 33.33333
3 1 1990-03-31 33.33333
4 2 1990-04-30 66.66667
5 2 1990-05-31 66.66667
6 2 1990-06-30 66.66667
7 3 1990-07-31 100.00000
8 3 1990-08-31 100.00000
9 3 1990-09-30 100.00000
10 4 1990-10-31 133.33333
11 4 1990-11-30 133.33333
12 4 1990-12-31 133.33333
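The merge above joins on the quarter number only, so quarters from different years would be mixed up if the data spanned more than one year. A minimal sketch of one way around that, using base R only: build a combined year-quarter key and join on it. The helper `year_quarter()` and the column names `key` and `monthly_gdp` are hypothetical names introduced for illustration, and the start of the monthly sequence is hard-coded to match this small sample.

```r
# Quarterly values that cross a year boundary (taken from the question's data).
quarterly <- data.frame(
  Date = as.Date(c("1990-09-30", "1990-12-31", "1991-03-31", "1991-06-30")),
  GDP  = c(300, 400, 500, 600)
)

# Hypothetical helper: a year-quarter key such as "1990-3".
year_quarter <- function(d) {
  m <- as.integer(format(d, "%m"))
  paste(format(d, "%Y"), (m - 1L) %/% 3L + 1L, sep = "-")
}

quarterly$key         <- year_quarter(quarterly$Date)
quarterly$monthly_gdp <- quarterly$GDP / 3

# Month ends: first day of each following month, minus one day.
# The start date is an assumption that fits this sample (July 1990 onward).
month_ends <- seq(as.Date("1990-08-01"), by = "month",
                  length.out = 3 * nrow(quarterly)) - 1

monthly <- data.frame(Date = month_ends, key = year_quarter(month_ends))
monthly <- merge(monthly, quarterly[, c("key", "monthly_gdp")],
                 by = "key", all.x = TRUE)
monthly <- monthly[order(monthly$Date), c("Date", "monthly_gdp")]
print(monthly)
```

Because the key contains the year, the third quarter of 1990 and the first quarter of 1991 can no longer be matched to each other.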
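Answer 1: (score: 0)
The yearmon class directly represents a year and month without a day, so you do not have to force the index into a Date. Convert the series to the zoo class with the time index converted to yearmon, giving zq. Then get the range of the time values, rng, a pair holding the first and last times. From that we can create a sequence or grid of time values, g, and merge it with zq. The resulting NA values can then be filled in backwards using na.locf with fromLast = TRUE and the result divided by 3, giving zm. We could leave it as a zoo series so that all the other facilities of zoo remain available, but if you want to change it back to a data frame, use fortify.zoo.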
library(zoo)
zq <- read.zoo(dummy_ts_1, FUN = as.yearmon)
rng <- range(time(zq))
g <- as.yearmon(seq(rng[1] - 2/12, rng[2], by = 1/12))
zm <- na.locf(merge(zq, zoo(, g)), fromLast = TRUE) / 3
DF <- fortify.zoo(zm) # optional
giving:
> DF
Index zm
1 Jan 1990 33.33333
2 Feb 1990 33.33333
3 Mar 1990 33.33333
4 Apr 1990 66.66667
5 May 1990 66.66667
6 Jun 1990 66.66667
7 Jul 1990 100.00000
8 Aug 1990 100.00000
9 Sep 1990 100.00000
10 Oct 1990 133.33333
11 Nov 1990 133.33333
12 Dec 1990 133.33333
13 Jan 1991 166.66667
14 Feb 1991 166.66667
15 Mar 1991 166.66667
16 Apr 1991 200.00000
17 May 1991 200.00000
18 Jun 1991 200.00000
19 Jul 1991 233.33333
20 Aug 1991 233.33333
21 Sep 1991 233.33333
22 Oct 1991 266.66667
23 Nov 1991 266.66667
24 Dec 1991 266.66667
25 Jan 1992 300.00000
26 Feb 1992 300.00000
27 Mar 1992 300.00000
28 Apr 1992 333.33333
29 May 1992 333.33333
30 Jun 1992 333.33333
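To make the fill step more concrete, here is a rough base R illustration of what a backward fill does to a monthly grid where only the quarter-end months hold values. It is a stand-in for the idea behind `na.locf(..., fromLast = TRUE)`, not zoo's implementation, and `fill_backward()` is a hypothetical helper name.

```r
# Monthly grid for two quarters: only the quarter-end months are known.
x <- c(NA, NA, 100, NA, NA, 200)

# Hypothetical helper: replace each NA with the next known value after it.
fill_backward <- function(v) {
  r <- rev(v)                              # work from the end toward the start
  idx <- cummax(seq_along(r) * !is.na(r))  # position of the latest known value
  idx[idx == 0] <- NA                      # nothing known yet: keep NA
  rev(r[idx])
}

monthly <- fill_backward(x) / 3            # split each quarter over 3 months
print(monthly)
```

Values after the last known observation stay NA, since there is nothing later to carry backward.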
Answer 2: (score: 0)
You could do it like this.
dummy_ts_1$GDP <- dummy_ts_1$GDP / 3
dummy_ts_1.b <- data.frame(Date=seq(as.Date("1990-02-01"),
                                    length.out=30, by="1 month") - 1,
                           GDP=rep(dummy_ts_1$GDP, each=3)) #repeat each monthly value 3 times
> dummy_ts_1.b
Date GDP
1 1990-01-31 33.33333
2 1990-02-28 33.33333
3 1990-03-31 33.33333
4 1990-04-30 66.66667
5 1990-05-31 66.66667
6 1990-06-30 66.66667
7 1990-07-31 100.00000
8 1990-08-31 100.00000
9 1990-09-30 100.00000
10 1990-10-31 133.33333
11 1990-11-30 133.33333
12 1990-12-31 133.33333
13 1991-01-31 166.66667
14 1991-02-28 166.66667
15 1991-03-31 166.66667
16 1991-04-30 200.00000
17 1991-05-31 200.00000
18 1991-06-30 200.00000
19 1991-07-31 233.33333
20 1991-08-31 233.33333
21 1991-09-30 233.33333
22 1991-10-31 266.66667
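The start date and sequence length above are hard-coded for dummy_ts_1. As a rough sketch, the month-end dates could instead be derived from the quarter-end dates themselves, which also handles quarters that end in January, April, July and October, as in dummy_ts_2. The names `quarterly`, `end_idx`, `month_idx` and `monthly` are hypothetical and only used for this illustration.

```r
# First two quarters of dummy_ts_2 from the question.
quarterly <- data.frame(
  Date = as.Date(c("1980-01-31", "1980-04-30")),
  GDP  = c(150, 160)
)

# Put each quarter-end month on a continuous scale: year * 12 + (month - 1).
end_idx <- as.integer(format(quarterly$Date, "%Y")) * 12L +
  as.integer(format(quarterly$Date, "%m")) - 1L

# Each quarter covers its end month and the two months before it.
month_idx <- rep(end_idx, each = 3) + rep(-2:0, times = length(end_idx))

# Month end = first day of the following month, minus one day.
next_idx <- month_idx + 1L
month_ends <- as.Date(sprintf("%04d-%02d-01",
                              next_idx %/% 12L, next_idx %% 12L + 1L)) - 1

monthly <- data.frame(Date = month_ends,
                      GDP  = rep(quarterly$GDP / 3, each = 3))
print(monthly)
```

Subtracting one day from the first of the next month avoids having to know month lengths, so the leap-year date 1980-02-29 comes out without special handling.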