I have two data frames: one with stock rankings
structure(list(Date = c("2010-01-31", "2010-02-28", "2010-03-31",
"2010-04-30", "2010-05-31", "2010-06-30"), Stock1 = c(1L, 2L,
1L, 1L, 4L, 3L), Stock2 = c(2L, 1L, 4L, 2L, 3L, 2L), Stock3 = c(3L,
3L, 3L, 3L, 2L, 1L), Stock4 = c(4L, 4L, 2L, 4L, 1L, 4L), Stock5 = c(3L,
2L, 1L, 4L, 1L, 2L), Stock6 = c(2L, 1L, 4L, 3L, 2L, 3L), Stock7 = c(1L,
2L, 3L, 4L, 1L, 2L), Stock8 = c(4L, 3L, 1L, 2L, 2L, 3L)), class = "data.frame", row.names = c(NA,
-6L))
and one with the stocks' monthly returns
structure(list(Date = c("2010-01-31", "2010-02-28", "2010-03-31",
"2010-04-30", "2010-05-31", "2010-06-30"), Stock1 = c("10%",
"2%", "3%", "4%", "6%", "3%"), Stock2 = c("-2%", "4%", "-30%",
"-20%", "10%", "4%"), Stock3 = c("15%", "2%", "3%", "1%", "15%",
"6%"), Stock4 = c("7%", "19%", "29%", "3%", "1%", "4%"), Stock5 = c("2%",
"3%", "-2%", "4%", "-30%", "-20%"), Stock6 = c("19%", "29%",
"15%", "2%", "3%", "1%"), Stock7 = c("1%", "2%", "2%", "4%",
"1%", "5%"), Stock8 = c("20%", "10%", "20%", "30%", "0%", "60%"
)), class = "data.frame", row.names = c(NA, -6L))
I want the average return for each stock rank, by date. My final result should look like this
structure(list(Date = c("2010-01-31", "2010-02-28", "2010-03-31",
"2010-04-30", "2010-05-31", "2010-06-30"), X1 = c("6%", "17%",
"7%", "4%", "-9%", "6%"), X2 = c("9%", "2%", "29%", "5%", "6%",
"-4%"), X3 = c("9%", "6%", "3%", "2%", "10%", "21%"), X4 = c("14%",
"19%", "-8%", "4%", "6%", "4%")), class = "data.frame", row.names = c(NA,
-6L))
Please help! Thanks!
Answer 0 (score: 1)
Let's call the "rank" object idxs and the character "percentage" object rets. First, convert those rather unfortunate character values to numeric:
# Strip the "%" sign from every column except Date, then convert to a numeric fraction
rets[-1] <- lapply(rets[-1], function(x) as.numeric(sub("%", "", x, fixed = TRUE)) / 100)
> rets
Date Stock1 Stock2 Stock3 Stock4 Stock5 Stock6 Stock7 Stock8
1 2010-01-31 0.10 -0.02 0.15 0.07 0.02 0.19 0.01 0.2
2 2010-02-28 0.02 0.04 0.02 0.19 0.03 0.29 0.02 0.1
3 2010-03-31 0.03 -0.30 0.03 0.29 -0.02 0.15 0.02 0.2
4 2010-04-30 0.04 -0.20 0.01 0.03 0.04 0.02 0.04 0.3
5 2010-05-31 0.06 0.10 0.15 0.01 -0.30 0.03 0.01 0.0
6 2010-06-30 0.03 0.04 0.06 0.04 -0.20 0.01 0.05 0.6
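Now, for each row, use the values in the matching row of the "rank" data frame to select items from that row of rets, and assign their mean to the corresponding position in a 6 x 4 matrix (one row per date, one column per rank):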
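res <- matrix(NA, 6, 4)
for (rw in seq_len(nrow(rets))) {
  for (rk in 1:4) {  # cycle through the possible "ranks"
    # Column positions where this row's rank equals rk. The Date column never
    # matches, and both data frames share the same layout, so positions line up.
    res[rw, rk] <- mean(unlist(rets[rw, which(idxs[rw, ] == rk)]))
  }
}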
> res
[,1] [,2] [,3] [,4]
[1,] 0.05500000 0.08500000 0.0850000 0.13500000
[2,] 0.16500000 0.02333333 0.0600000 0.19000000
[3,] 0.07000000 0.29000000 0.0250000 -0.07500000
[4,] 0.04000000 0.05000000 0.0150000 0.03666667
[5,] -0.09333333 0.06000000 0.1000000 0.06000000
[6,] 0.06000000 -0.03666667 0.2133333 0.04000000
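To get from this matrix back to the layout asked for in the question (a Date column plus X1 to X4 as percentage strings), something like the following could work. It is a minimal, self-contained sketch rather than part of the original answer: it rebuilds idxs and rets from the question, and the names ret_num, rank_mat, and final are made up for illustration. It also swaps the double loop for a masked rowMeans, which should give the same numbers. Values that land exactly on a half percent (8.5%, for example) may round differently from the hand-rounded table in the question.

```r
# Rebuild the two inputs from the question so the sketch runs on its own.
dates <- c("2010-01-31", "2010-02-28", "2010-03-31",
           "2010-04-30", "2010-05-31", "2010-06-30")
idxs <- data.frame(
  Date = dates,
  Stock1 = c(1L, 2L, 1L, 1L, 4L, 3L), Stock2 = c(2L, 1L, 4L, 2L, 3L, 2L),
  Stock3 = c(3L, 3L, 3L, 3L, 2L, 1L), Stock4 = c(4L, 4L, 2L, 4L, 1L, 4L),
  Stock5 = c(3L, 2L, 1L, 4L, 1L, 2L), Stock6 = c(2L, 1L, 4L, 3L, 2L, 3L),
  Stock7 = c(1L, 2L, 3L, 4L, 1L, 2L), Stock8 = c(4L, 3L, 1L, 2L, 2L, 3L),
  stringsAsFactors = FALSE
)
rets <- data.frame(
  Date = dates,
  Stock1 = c("10%", "2%", "3%", "4%", "6%", "3%"),
  Stock2 = c("-2%", "4%", "-30%", "-20%", "10%", "4%"),
  Stock3 = c("15%", "2%", "3%", "1%", "15%", "6%"),
  Stock4 = c("7%", "19%", "29%", "3%", "1%", "4%"),
  Stock5 = c("2%", "3%", "-2%", "4%", "-30%", "-20%"),
  Stock6 = c("19%", "29%", "15%", "2%", "3%", "1%"),
  Stock7 = c("1%", "2%", "2%", "4%", "1%", "5%"),
  Stock8 = c("20%", "10%", "20%", "30%", "0%", "60%"),
  stringsAsFactors = FALSE
)

# Numeric 6 x 8 matrix of returns as fractions (Date column dropped).
ret_num <- sapply(rets[-1], function(x) as.numeric(sub("%", "", x, fixed = TRUE)) / 100)

# Integer 6 x 8 matrix of ranks, same column order as ret_num.
rank_mat <- as.matrix(idxs[-1])

# For each rank, blank out returns of stocks with a different rank,
# then average what is left in each row.
res <- sapply(1:4, function(rk) {
  rowMeans(ifelse(rank_mat == rk, ret_num, NA), na.rm = TRUE)
})
colnames(res) <- paste0("X", 1:4)

# Assemble the requested layout: Date plus one percentage string per rank.
final <- data.frame(Date = idxs$Date, stringsAsFactors = FALSE)
for (nm in colnames(res)) {
  final[[nm]] <- paste0(round(100 * res[, nm]), "%")
}
print(final)
```

If a rank does not occur in some month, that cell comes out as NaN (formatted as "NaN%"), which mirrors what mean() returns for an empty selection in the loop version.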