Computing means by date in R

Time: 2017-03-26 16:28:31

Tags: r date time-series combinations panel

I have two panel data frames. The first one holds boolean values and has a yearly frequency.

Criteria:

structure(list(Name = c("ff", "fd", "fe", "fr", "fz", "fa", "kl","ml", "az","er", "ff", "fd", "fe", "fr", "fz", "fa", "kl", "ml","az", "er"), Date =c("01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992","01/31/1992", "01/31/1992", "01/31/1993", "01/31/1993", "01/31/1993","01/31/1993", "01/31/1993","01/31/1993", "01/31/1993", "01/31/1993", "01/31/1993", "01/31/1993"), Value = c("FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L), .Names = c("Name", "Date", "Value"))

Date: in the real data this changes every year, up to 2016.

The second df is also a panel table:

     Name       Date   Value
1      ff 01/31/1992  0.2000
2      fd 01/31/1992  0.4300
3      fe 01/31/1992      NA
4      fr 01/31/1992  0.5200
5      fz 01/31/1992 -0.0020
6      fa 01/31/1992      NA
7      kl 01/31/1992  0.0010
8      ml 01/31/1992      NA
9      az 01/31/1992  0.2200
10     er 01/31/1992  0.3200
11     ff 05/31/1992  0.0000
12     fd 05/31/1992  0.0010
13     fe 05/31/1992  0.0320
14     fr 05/31/1992  0.9123
15     fz 05/31/1992  1.0000
16     fa 05/31/1992  0.3200
17     kl 05/31/1992  0.4300
18     ml 05/31/1992  0.0312
19     az 05/31/1992  0.0312
20     er 05/31/1992  0.4300
21     ff 03/31/1993  0.5300
22     fd 03/31/1993  0.8400
23     fe 03/31/1993  0.0010
24     fr 03/31/1993 -0.0123
25     fz 03/31/1993  0.4300
26     fa 03/31/1993  0.1340
27     kl 03/31/1993  0.7400
28     ml 01/31/1993  0.0312
29     az 01/31/1993  0.9324
30     er 01/31/1993  0.0600

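Date: changes monthly, up to 2016 in the real data.

Here Name holds the same company names as in Criteria, Date is monthly and covers the same years as the first df, and Value is numeric and represents the company's return.

I want to create a df3: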

Date          Portfolio1
01/31/1992     (monthly mean of the companies whose criteria is TRUE this year)
05/31/1992     (monthly mean of the companies whose criteria is TRUE this year)
03/31/1993     (monthly mean of the companies whose criteria is TRUE this year)

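Here Date has the same dates as df2 (monthly frequency), and Portfolio1 is the monthly mean of Value in df2 over the companies whose criteria is TRUE in df1 for that year.

I think aggregate on a data.frame would be very useful here.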

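1 answer:

Answer 0: (score: 0)

You can do this with dplyr, using left_join and summarise together. I first join by date and name. Then I use filter to drop every row that is not TRUE. Then I group_by date and use summarise to compute the mean.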

library(dplyr)
Criteria <- structure(list(Name = c("ff", "fd", "fe", "fr", "fz", "fa", "kl","ml", "az","er", "ff", "fd", "fe", "fr", "fz", "fa", "kl", "ml","az", "er"),
Date =c("01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992", "01/31/1992","01/31/1992", "01/31/1992", "01/31/1993",
"01/31/1993", "01/31/1993","01/31/1993", "01/31/1993","01/31/1993", "01/31/1993", "01/31/1993", "01/31/1993", "01/31/1993"), Value = c("FALSE", "TRUE", "FALSE",
"FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE")),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L), .Names = c("Name", "Date", "Value"))

df2 <- read.table(text="Name    Date    Value
ff  01/31/1992  0.2
fd  01/31/1992  0.43
fe  01/31/1992  NA
fr  01/31/1992  0.52
fz  01/31/1992  -0.002
fa  01/31/1992  NA
kl  01/31/1992  0.001
ml  01/31/1992  NA
az  01/31/1992  0.22
er  01/31/1992  0.32
ff  05/31/1992  0
fd  05/31/1992  0.001
fe  05/31/1992  0.032
fr  05/31/1992  0.9123
fz  05/31/1992  1
fa  05/31/1992  0.32
kl  05/31/1992  0.43
ml  05/31/1992  0.0312
az  05/31/1992  0.0312
er  05/31/1992  0.43
ff  03/31/1993  0.53
fd  03/31/1993  0.84
fe  03/31/1993  0.001
fr  03/31/1993  -0.0123
fz  03/31/1993  0.43
fa  03/31/1993  0.134
kl  03/31/1993  0.74
ml  01/31/1993  0.0312
az  01/31/1993  0.9324
er  01/31/1993  0.06",header=TRUE, stringsAsFactors=FALSE)

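# Attach each company's criteria flag by matching on Name and on the exact Date string
left_join(df2, Criteria, by = c("Name", "Date")) %>%
  # Value.y is the criteria flag, stored as text; keep only the "TRUE" rows
  dplyr::filter(Value.y == "TRUE") %>%
  group_by(Date) %>%
  # Value.x is the return from df2; average it per date, ignoring NAs
  summarise(Portfolio1 = mean(Value.x, na.rm = TRUE))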

# A tibble: 2 x 2
        Date Portfolio1
       <chr>      <dbl>
1 01/31/1992      0.217
2 01/31/1993      0.060
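
Note that the join above matches on the exact Date string, so only the df2 rows dated 01/31 find a criteria row, which is why the result has two rows. If the criteria should instead apply to every month of the same year, as the question describes, one possible variation is to join on Name and year. The following is only a sketch under that assumption: the helper year_of and the Year column are illustrative names that do not appear in the original post, and the data is rebuilt inline so the block runs on its own.

```r
library(dplyr)

companies <- c("ff", "fd", "fe", "fr", "fz", "fa", "kl", "ml", "az", "er")

# Yearly criteria, same values as in the question (flags stored as text).
Criteria <- data.frame(
  Name  = rep(companies, 2),
  Date  = rep(c("01/31/1992", "01/31/1993"), each = 10),
  Value = c("FALSE", "TRUE", "FALSE", "FALSE", "FALSE",
            "FALSE", "TRUE", "FALSE", "TRUE", "FALSE",
            "FALSE", "FALSE", "TRUE", "FALSE", "FALSE",
            "FALSE", "TRUE", "FALSE", "FALSE", "TRUE"),
  stringsAsFactors = FALSE
)

# Monthly returns, same values as in the question.
df2 <- data.frame(
  Name  = rep(companies, 3),
  Date  = c(rep("01/31/1992", 10), rep("05/31/1992", 10),
            rep("03/31/1993", 7), rep("01/31/1993", 3)),
  Value = c(0.2, 0.43, NA, 0.52, -0.002, NA, 0.001, NA, 0.22, 0.32,
            0, 0.001, 0.032, 0.9123, 1, 0.32, 0.43, 0.0312, 0.0312, 0.43,
            0.53, 0.84, 0.001, -0.0123, 0.43, 0.134, 0.74, 0.0312, 0.9324, 0.06),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pull the year out of an "mm/dd/yyyy" string.
year_of <- function(x) as.integer(format(as.Date(x, format = "%m/%d/%Y"), "%Y"))

# Companies whose criteria flag is TRUE, keyed by year instead of exact date.
selected <- Criteria %>%
  mutate(Year = year_of(Date)) %>%
  dplyr::filter(Value == "TRUE") %>%
  dplyr::select(Name, Year)

# Keep only returns of selected companies in the matching year,
# then average per df2 date, ignoring missing returns.
df3 <- df2 %>%
  mutate(Year = year_of(Date)) %>%
  inner_join(selected, by = c("Name", "Year")) %>%
  group_by(Date) %>%
  summarise(Portfolio1 = mean(Value, na.rm = TRUE))

print(df3)
```

With the sample data this gives one row per distinct date in df2, four in all, because the last three 1993 rows are dated 01/31/1993 rather than 03/31/1993. Using inner_join against the pre-filtered criteria replaces the left_join plus filter step; either form should behave the same here.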