Constructing a DF from a logical DF and a Boolean DF

Time: 2017-02-26 17:00:49

Tags: r dataframe data.table combinations

I have two DFs. The first one, shown below, is Boolean:

> Portfolio 1 

Name | X1992 | X1993 | X1994 
XHHD   False   False   True
Jdqd   False   False   False
Jhds   False   True    False
lkaz   False   False   True                      
nqb    True    False   False
jbqs   False   False   False
jbq    False   True    False
knd    True    False   True
njvd   False   True    False
kjiz   True    False   True
khza   False   False   False
akhd   False   False   True
jkaze  True    True    False
bzae   True    False   False

The second DF has a monthly frequency:

  

> Returns

Name | 1992/01 | 1992/02 ..... 1994/12 
XHHD   0.23       0.564
Jdqd   0.3        0.654
Jhds   0.234      0.456
lkaz   0.54       0.472               
nqb    0.99       0.761
jbqs   0.01       1.765
jbq    0.23       0.002
knd    0.59       2.32
njvd   0.123      0.43
kjiz   0.987      -0.12
khza   1.34       0.12
akhd   0.76       0.23
jkaze  0.654      0.98
bzae   0.43       0.73

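I would like a DF that holds the monthly mean return of the companies selected by the Boolean DF: if a company is TRUE for a given year, its return is included in the calculation for each month of that year. The result would be: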

Date     Portfolio 1
1992/01   mean
1992/02   mean
1992/03   mean
1992/04   mean

1 answer:

Answer 0 (score: 0)

We can do this by converting both DFs to 'long' format, joining them, and then taking the mean by 'Date':

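library(data.table)
# Reshape the Boolean DF to long format: one row per (Name, year column)
dM1 <- melt(setDT(df1), id.vars = "Name")
# Reshape the returns to long format, derive the matching year key (e.g. "X1992"),
# join on (Name, variable), keep the "True" rows, and average the returns by Date
na.omit(melt(setDT(df2), id.vars = "Name", variable.name = "Date")[,
     variable := paste0("X", substring(Date, 1, 4))
      ][dM1, on = .(Name, variable)
       ][i.value == "True", .(Portfolio = mean(value, na.rm = TRUE)) , Date])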
#     Date  Portfolio  
#1: 1992/01    0.7302
#2: 1992/02    0.9342
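
For comparison, the same idea (map each month to its year flag, keep the flagged companies, average their returns) can be sketched in base R without data.table. This is only an illustration and not part of the original answer: the `flags` and `returns` data frames are a hypothetical three-company subset of the question's data, and `portfolio_mean` and `result` are names chosen here.

```r
# Hypothetical toy inputs: a small subset of the question's data
flags <- data.frame(Name = c("nqb", "knd", "jbq"),
                    X1992 = c("True", "True", "False"),
                    X1993 = c("False", "False", "True"),
                    stringsAsFactors = FALSE)
returns <- data.frame(Name = c("nqb", "knd", "jbq"),
                      `1992/01` = c(0.99, 0.59, 0.23),
                      `1992/02` = c(0.761, 2.32, 0.002),
                      check.names = FALSE, stringsAsFactors = FALSE)

# Align the rows of the returns table with the flag table by company name
returns <- returns[match(flags$Name, returns$Name), ]
months <- setdiff(names(returns), "Name")

# For each month, find the matching year column ("1992/01" -> "X1992")
# and average the returns of the companies flagged "True" in that year
portfolio_mean <- vapply(months, function(m) {
  year_col <- paste0("X", substring(m, 1, 4))
  keep <- flags[[year_col]] == "True"
  mean(returns[[m]][keep], na.rm = TRUE)
}, numeric(1))

result <- data.frame(Date = months, Portfolio = unname(portfolio_mean))
print(result)
```

Looping over month columns avoids the reshape entirely, which may be easier to follow on small data; the long-format join above is likely the better fit when there are many months and companies.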

Data

 df1 <- structure(list(Name = c("XHHD", "Jdqd", "Jhds", "lkaz", "nqb", 
"jbqs", "jbq", "knd", "njvd", "kjiz", "khza", "akhd", "jkaze", 
"bzae"), X1992 = c("False", "False", "False", "False", "True", 
"False", "False", "True", "False", "True", "False", "False", 
"True", "True"), X1993 = c("False", "False", "True", "False", 
"False", "False", "True", "False", "True", "False", "False", 
"False", "True", "False"), X1994 = c("True", "False", "False", 
"True", "False", "False", "False", "True", "False", "True", "False", 
"True", "False", "False")), .Names = c("Name", "X1992", "X1993", 
"X1994"), class = "data.frame", row.names = c(NA, -14L))

df2 <-  structure(list(Name = c("XHHD", "Jdqd", "Jhds", "lkaz", "nqb", 
"jbqs", "jbq", "knd", "njvd", "kjiz", "khza", "akhd", "jkaze", 
"bzae"), `1992/01` = c(0.23, 0.3, 0.234, 0.54, 0.99, 0.01, 0.23, 
0.59, 0.123, 0.987, 1.34, 0.76, 0.654, 0.43), `1992/02` = c(0.564, 
0.654, 0.456, 0.472, 0.761, 1.765, 0.002, 2.32, 0.43, -0.12, 
0.12, 0.23, 0.98, 0.73)), .Names = c("Name", "1992/01", "1992/02"
), class = "data.frame", row.names = c(NA, -14L))
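
Note that the flags in df1 are stored as the strings "True" and "False" rather than as R logicals, which is why the answer filters on a string comparison. If logical columns are preferred, a minimal sketch of the conversion could look like this (the `flags` data frame is a hypothetical two-row stand-in for df1):

```r
# Hypothetical stand-in for df1 with string flags
flags <- data.frame(Name = c("nqb", "jbq"),
                    X1992 = c("True", "False"),
                    X1993 = c("False", "True"),
                    stringsAsFactors = FALSE)

# Convert every year column from "True"/"False" strings to logical values
year_cols <- setdiff(names(flags), "Name")
flags[year_cols] <- lapply(flags[year_cols], function(x) x == "True")

str(flags)
```

After such a conversion, the filter in the join would presumably compare against `TRUE` instead of the string `"True"`.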