How can I compute weighted values without a loop?

Asked: 2013-11-02 13:20:48

Tags: r lapply sapply mapply

Sorry for asking a trivial question. Here is my sample data:

(x <- data.frame(period=c('20130101','20130102'),symbol=c('x1','x2'),V1=c(1,2),V2=c(3,4)))
(y <- data.frame(period=c('20130101','20130101','20130102','20130102'),
            symbol=rep(c('V1','V2'),2),w1=rep(c(0.5,0.5),2),w2=rep(c(0.3,0.7),2),
            w3=rep(c(0.2,0.8),2) ) )

For a given date and symbol, table 'x' holds two values (V1, V2).

    period symbol V1   V2
1 20130101   x1    1   3
2 20130102   x2    2   4

For a given day, each value (V1, V2) has three sets of weights (w1, w2, w3).

    period symbol  w1  w2  w3
1 20130101   V1   0.5 0.3 0.2
2 20130101   V2   0.5 0.7 0.8
3 20130102   V1   0.5 0.3 0.2
4 20130102   V2   0.5 0.7 0.8

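How can I compute the weighted values from the two tables without a loop? For example, on '20130101', V1 and V2 of 'x1' are 1 and 3. Looking up date '20130101' for V1 and V2 in table 'y' gives three sets of weights. The weighted values are computed as follows: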

    wv1=1*0.5 + 3*0.5=2
    wv2=1*0.3 + 3*0.7=2.4
    wv3=1*0.2 + 3*0.8=2.6
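
Put differently, the arithmetic above is a row vector of values multiplied by a matrix of weights. A minimal sketch for the single date '20130101', with the numbers copied from the tables above (the names `v`, `W`, and `wv` are only illustrative):

```r
# Values of x1 on 20130101 (V1, V2), copied from table 'x'
v <- c(V1 = 1, V2 = 3)

# Weights for 20130101 from table 'y': one row per value, one column per weight set
W <- matrix(c(0.5, 0.3, 0.2,
              0.5, 0.7, 0.8),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("V1", "V2"), c("w1", "w2", "w3")))

# Each weighted value is the sum over V1 and V2 of value * weight
wv <- drop(v %*% W)
names(wv) <- c("wv1", "wv2", "wv3")
wv
```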

The resulting table should look like this:

    period symbol  wv1  wv2   wv3
1 20130101   x1    2    2.4   2.6
...

3 answers:

Answer 0: (score: 2)

Here is one approach:

cbind(x[,1:2],Reduce(`+`,lapply(split(y,y$symbol),
  function(z) x[,as.character(z$symbol[1])]*z[,3:5])))
##     period symbol w1  w2  w3
## 1 20130101     x1  2 2.4 2.6
## 3 20130102     x2  3 3.4 3.6

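This can be extended, depending on how general a solution you want. For example, if you cannot assume that x and y are sorted by period and match one-to-one, you need to call match first: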

ysplit<-lapply(split(y,y$symbol),function(z) z[match(x$period,z$period),])
vals<-Reduce(`+`,lapply(ysplit,function(z) x[,as.character(z$symbol[1])]*z[,3:5]))
cbind(x[,1:2],vals)
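
To see what match contributes here, a small sketch with a deliberately shuffled set of V1 weights may help. The data frame `z`, its row order, and its weight values are made up for illustration; only match itself comes from the code above.

```r
# Hypothetical V1 weights stored in reverse date order, with a different weight per date
z <- data.frame(period = c("20130102", "20130101"),
                symbol = "V1",
                w1 = c(0.9, 0.5),
                stringsAsFactors = FALSE)
x_period <- c("20130101", "20130102")

# match() returns, for each period in x, the row of z that holds that period
idx <- match(x_period, z$period)
idx            # 2 1

# Indexing z by idx lines its rows up with the rows of x
z[idx, "w1"]   # 0.5 0.9
```

A period in x with no counterpart in z would give NA in `idx`, and therefore a row of NA weights, so missing dates show up in the result instead of being silently misaligned.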

If you also want proper row and column names:

cbind(x[,1:2],setNames(vals,c("wv1","wv2","wv3")),row.names=row.names(x))
##     period symbol wv1 wv2 wv3
## 1 20130101     x1   2 2.4 2.6
## 2 20130102     x2   3 3.4 3.6

If you want a general solution that handles any number of weight columns, you can adapt this with a few small changes:

y$w4<-c(0.1,0.9,0.1,0.9)
ysplit<-lapply(split(y,y$symbol),function(z) z[match(x$period,z$period),])

vals<-Reduce(`+`,lapply(ysplit,function(z)
      x[,as.character(z$symbol[1])]*z[,substring(names(z),1,1)=="w"]))
cbind(x[,1:2],setNames(vals,paste0("wv",1:ncol(vals))),row.names=row.names(x))

##     period symbol wv1 wv2 wv3 wv4
## 1 20130101     x1   2 2.4 2.6 2.8
## 2 20130102     x2   3 3.4 3.6 3.8
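
The test substring(names(z),1,1)=="w" selects every column whose name starts with "w". If other columns might also start with that letter, a stricter pattern could be used instead. A sketch on a made-up name vector (the "weekday" column is hypothetical and not part of the data above):

```r
# Column names as in y, plus a made-up "weekday" column that is not a weight
nms <- c("period", "symbol", "w1", "w2", "w3", "w4", "weekday")

# Prefix test from the answer: also picks up "weekday"
by_prefix <- substring(nms, 1, 1) == "w"

# Stricter test: "w" followed only by digits
by_regex <- grepl("^w[0-9]+$", nms)

nms[by_prefix]
nms[by_regex]
```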

Answer 1: (score: 1)

A way using mapply (assuming you have only V1 and V2, and that the V1 column comes first):

#function to apply multiple arguments
fun <- function(period., symbol., V1., V2.) 
{
 row1. <- y$period == as.numeric(as.character(period.)) & y$symbol == "V1"
 row2. <- y$period == as.numeric(as.character(period.)) & y$symbol == "V2"

 res <- y[row1.,c("w1", "w2", "w3")] * V1. + y[row2.,c("w1", "w2", "w3")] * V2.

 ret <- c(period = as.numeric(as.character(period.)), 
            symbol = as.character(symbol.), 
               setNames(res, c("wv1", "wv2", "wv3")))
 return(ret)
}

do.call(rbind, mapply(fun, x$period, x$symbol, x$V1, x$V2, SIMPLIFY = FALSE))
#     period   symbol wv1 wv2 wv3
#[1,] 20130101 "x1"   2   2.4 2.6
#[2,] 20130102 "x2"   3   3.4 3.6

Answer 2: (score: 1)

One way: 
l<-reshape(y,idvar="period",timevar="symbol",direction="wide")
> l
    period w1.V1 w2.V1 w3.V1 w1.V2 w2.V2 w3.V2
1 20130101   0.5   0.3   0.2   0.5   0.7   0.8
3 20130102   0.5   0.3   0.2   0.5   0.7   0.8

nn<-merge(l,x,by="period")
> nn
    period w1.V1 w2.V1 w3.V1 w1.V2 w2.V2 w3.V2 symbol V1 V2
1 20130101   0.5   0.3   0.2   0.5   0.7   0.8     x1  1  3
2 20130102   0.5   0.3   0.2   0.5   0.7   0.8     x2  2  4


nn$wv1<-with(nn,w1.V1*V1+w1.V2*V2)
nn$wv2<-with(nn,w2.V1*V1+w2.V2*V2)
nn$wv3<-with(nn,w3.V1*V1+w3.V2*V2)

> nn
    period w1.V1 w2.V1 w3.V1 w1.V2 w2.V2 w3.V2 symbol V1 V2 wv1 wv2 wv3
1 20130101   0.5   0.3   0.2   0.5   0.7   0.8     x1  1  3   2 2.4 2.6
2 20130102   0.5   0.3   0.2   0.5   0.7   0.8     x2  2  4   3 3.4 3.6



> nn[,c(1,8,11:13)]
    period symbol wv1 wv2 wv3
1 20130101     x1   2 2.4 2.6
2 20130102     x2   3 3.4 3.6
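
The three hand-written wv lines and the positional column indices could be generalized. The following is only a sketch of one way to do that, not part of the original answer: it rebuilds the sample data as character columns, and the helper names `wts`, `vals`, `wv`, and `res` are illustrative.

```r
# Sample data from the question, rebuilt with character columns
x <- data.frame(period = c("20130101", "20130102"), symbol = c("x1", "x2"),
                V1 = c(1, 2), V2 = c(3, 4), stringsAsFactors = FALSE)
y <- data.frame(period = rep(c("20130101", "20130102"), each = 2),
                symbol = rep(c("V1", "V2"), 2),
                w1 = 0.5, w2 = rep(c(0.3, 0.7), 2), w3 = rep(c(0.2, 0.8), 2),
                stringsAsFactors = FALSE)

# Same reshape and merge as above
l  <- reshape(y, idvar = "period", timevar = "symbol", direction = "wide")
nn <- merge(l, x, by = "period")

wts  <- c("w1", "w2", "w3")   # weight sets
vals <- c("V1", "V2")         # value columns

# For each weight set, multiply values by the matching weights and sum per row
wv <- sapply(wts, function(w)
  rowSums(as.matrix(nn[, vals]) * as.matrix(nn[, paste(w, vals, sep = ".")])))
colnames(wv) <- paste0("wv", seq_along(wts))

# Select columns by name rather than by position
res <- cbind(nn[, c("period", "symbol")], wv)
res
```

Note that sapply returns a plain vector rather than a matrix when nn has only one row, so that case would need separate handling.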