I have two data frames. The first one holds boolean flags:
> Portfolio 1
Name  | X1992 | X1993 | X1994
XHHD  | False | False | True
Jdqd  | False | False | False
Jhds  | False | True  | False
lkaz  | False | False | True
nqb   | True  | False | False
jbqs  | False | False | False
jbq   | False | True  | False
knd   | True  | False | True
njvd  | False | True  | False
kjiz  | True  | False | True
khza  | False | False | False
akhd  | False | False | True
jkaze | True  | True  | False
bzae  | True  | False | False
The second data frame has monthly frequency:
Returns
Name  | 1992/01 | 1992/02 ..... 1994/12
XHHD  | 0.23    | 0.564
Jdqd  | 0.3     | 0.654
Jhds  | 0.234   | 0.456
lkaz  | 0.54    | 0.472
nqb   | 0.99    | 0.761
jbqs  | 0.01    | 1.765
jbq   | 0.23    | 0.002
knd   | 0.59    | 2.32
njvd  | 0.123   | 0.43
kjiz  | 0.987   | -0.12
khza  | 1.34    | 0.12
akhd  | 0.76    | 0.23
jkaze | 0.654   | 0.98
bzae  | 0.43    | 0.73
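I would like a data frame that, for each month, takes the companies flagged TRUE in the boolean data frame for that year and computes the mean of their monthly returns. The result would be: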
Date    | Portfolio 1
1992/01 | mean
1992/02 | mean
1992/03 | mean
1992/04 | mean
Answer 0: (score: 0)
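We can do this by converting both data frames to 'long' format, joining them on the company name and year, and then taking the mean grouped by 'Date'.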
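library(data.table)
# Reshape the flags to long format: one row per (Name, year column)
dM1 <- melt(setDT(df1), id.vars = "Name")
# Reshape the returns to long format, derive the matching year column
# ("X1992", ...), join on (Name, variable), keep flagged rows, average by Date
na.omit(melt(setDT(df2), id.vars = "Name", variable.name = "Date")[,
      variable := paste0("X", substring(Date, 1, 4))
    ][dM1, on = .(Name, variable)
    ][i.value == "True", .(Portfolio = mean(value, na.rm = TRUE)), Date])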
# Date Portfolio
#1: 1992/01 0.7302
#2: 1992/02 0.9342
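As a cross-check on these numbers, the same idea can be sketched in base R without reshaping: for each month column, look up the flag column of its year and average the returns of the flagged companies. This is only a minimal sketch under a few assumptions: the flags are the strings "True"/"False" as in the sample data, only the X1992 column is copied because the sample returns cover 1992 only, and the names `nm`, `month_cols`, `portfolio` and `res` are introduced here for illustration.

```r
# Sample data copied from this post (only the 1992 flags are needed here)
nm <- c("XHHD", "Jdqd", "Jhds", "lkaz", "nqb", "jbqs", "jbq",
        "knd", "njvd", "kjiz", "khza", "akhd", "jkaze", "bzae")
df1 <- data.frame(
  Name  = nm,
  X1992 = c("False", "False", "False", "False", "True", "False", "False",
            "True", "False", "True", "False", "False", "True", "True"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Name = nm,
  `1992/01` = c(0.23, 0.3, 0.234, 0.54, 0.99, 0.01, 0.23,
                0.59, 0.123, 0.987, 1.34, 0.76, 0.654, 0.43),
  `1992/02` = c(0.564, 0.654, 0.456, 0.472, 0.761, 1.765, 0.002,
                2.32, 0.43, -0.12, 0.12, 0.23, 0.98, 0.73),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

month_cols <- setdiff(names(df2), "Name")

# For each month, find the flag column of its year ("X" + first four
# characters of the month label) and average the flagged companies' returns
portfolio <- vapply(month_cols, function(m) {
  flag_col <- paste0("X", substring(m, 1, 4))
  # Align the flags to the row order of df2 by company name
  keep <- df1[[flag_col]][match(df2$Name, df1$Name)] == "True"
  mean(df2[[m]][keep], na.rm = TRUE)
}, numeric(1))

res <- data.frame(Date = month_cols, Portfolio = unname(portfolio),
                  stringsAsFactors = FALSE)
res
```

On the sample data this gives 0.7302 for 1992/01 and 0.9342 for 1992/02, the same values as the data.table result.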
df1 <- structure(list(Name = c("XHHD", "Jdqd", "Jhds", "lkaz", "nqb",
"jbqs", "jbq", "knd", "njvd", "kjiz", "khza", "akhd", "jkaze",
"bzae"), X1992 = c("False", "False", "False", "False", "True",
"False", "False", "True", "False", "True", "False", "False",
"True", "True"), X1993 = c("False", "False", "True", "False",
"False", "False", "True", "False", "True", "False", "False",
"False", "True", "False"), X1994 = c("True", "False", "False",
"True", "False", "False", "False", "True", "False", "True", "False",
"True", "False", "False")), .Names = c("Name", "X1992", "X1993",
"X1994"), class = "data.frame", row.names = c(NA, -14L))
df2 <- structure(list(Name = c("XHHD", "Jdqd", "Jhds", "lkaz", "nqb",
"jbqs", "jbq", "knd", "njvd", "kjiz", "khza", "akhd", "jkaze",
"bzae"), `1992/01` = c(0.23, 0.3, 0.234, 0.54, 0.99, 0.01, 0.23,
0.59, 0.123, 0.987, 1.34, 0.76, 0.654, 0.43), `1992/02` = c(0.564,
0.654, 0.456, 0.472, 0.761, 1.765, 0.002, 2.32, 0.43, -0.12,
0.12, 0.23, 0.98, 0.73)), .Names = c("Name", "1992/01", "1992/02"
), class = "data.frame", row.names = c(NA, -14L))
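Note that the sample `df1` stores the flags as the strings "True"/"False", which is why the filter compares against `"True"`. If your flag columns are real logicals, that comparison no longer matches, because R coerces `TRUE` to the string "TRUE" before comparing. A minimal sketch of one way to handle both cases, using a hypothetical helper `as_flag` that is not part of the answer above:

```r
# Hypothetical helper (an assumption for illustration): accepts logical
# flags or strings such as "True"/"False" and returns a logical vector
as_flag <- function(x) {
  if (is.logical(x)) x else toupper(as.character(x)) == "TRUE"
}

as_flag(c("True", "False"))  # string flags become TRUE, FALSE
as_flag(c(TRUE, FALSE))      # logical flags pass through unchanged

# Why the string comparison fails on logicals: TRUE is coerced to "TRUE"
TRUE == "True"               # FALSE
```

After normalising the flag columns this way, the filter step would test the logical value rather than the string "True".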