Question

I would like to split two objects on a common key, apply a function that needs both objects, and then combine the results.

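The two objects do not necessarily have the same length for any given key (t), which means I cannot combine them into a single object (well, at least I cannot see how to do it).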

Some toy data:

set.seed(2)

supply = data.frame( t=c(rep(1,10),rep(2,8)) , 
                     p=c(cumsum(runif(10)),cumsum(runif(8))) ,
                     q=c(cumsum(runif(10)),cumsum(runif(8))) )
demand = data.frame( t=c(rep(1,8),rep(2,9))  , 
                     p=c(cumsum(runif(8)),cumsum(runif(9)))  , 
                     q=c(6-cumsum(runif(8)),6-cumsum(runif(9))) )

Given the data, I want to split it by the key t, find where the two curves intersect, and return the equilibrium p and q. A graphical example:

plot( y=supply$p[supply$t==1],x=supply$q[supply$t==1],type="s",col="blue")
lines(y=demand$p[demand$t==1],x=demand$q[demand$t==1],type="S",col="red")

This forms part of an optimization routine, so it has to be as fast as possible. I am happy to use apply, plyr, or data.table to get this done.

Thanks in advance.

Answer 1

You can use split and mapply:

#an example function
#it could be further optimized for speed
myfun <- function(A, B) {
  coef1 <- coef(lm(p~q, data=A))
  coef2 <- coef(lm(p~q, data=B))
  x <- (coef1[1]-coef2[1])/(coef2[2]-coef1[2])
  y <- coef1[1]+coef1[2]*x
  setNames(c(x, y), c("x", "y"))
}

myfun(supply[supply$t==1,], demand[demand$t==1,])
#       x        y 
#2.106726 2.688992 

split_supply <- split(supply, supply$t)
split_demand <- split(demand, demand$t)

mapply(myfun, split_supply, split_demand)
#         1        2
#x 2.106726 3.161048
#y 2.688992 3.357424
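
One caveat: mapply pairs the two lists by position, not by name, so the call above relies on both objects containing the same set of t values. A minimal sketch of a guard that restricts both lists to their shared keys first. The names common and res are illustrative, and the toy data and myfun are repeated only so the block runs on its own:

```r
set.seed(2)
supply <- data.frame(t = c(rep(1, 10), rep(2, 8)),
                     p = c(cumsum(runif(10)), cumsum(runif(8))),
                     q = c(cumsum(runif(10)), cumsum(runif(8))))
demand <- data.frame(t = c(rep(1, 8), rep(2, 9)),
                     p = c(cumsum(runif(8)), cumsum(runif(9))),
                     q = c(6 - cumsum(runif(8)), 6 - cumsum(runif(9))))

# same example function as above: crossing point of two fitted lines
myfun <- function(A, B) {
  coef1 <- coef(lm(p ~ q, data = A))
  coef2 <- coef(lm(p ~ q, data = B))
  x <- (coef1[1] - coef2[1]) / (coef2[2] - coef1[2])
  y <- coef1[1] + coef1[2] * x
  setNames(c(x, y), c("x", "y"))
}

split_supply <- split(supply, supply$t)
split_demand <- split(demand, demand$t)

# keep only the keys present in both lists and index by name,
# so that each supply piece is paired with the demand piece for the same t
common <- intersect(names(split_supply), names(split_demand))
res <- mapply(myfun, split_supply[common], split_demand[common])
```

Keys that appear in only one of the two objects are silently dropped here; compare names(split_supply) and names(split_demand) with setdiff if you need to know which ones.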

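PS: If you are going to use linear models to estimate the intersection, you could use data.table to compute the coefficients by t first, then merge the resulting data.tables and calculate the intersections afterwards.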
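
The data.table route from the PS could look roughly like this. It is only a sketch: the helper fit_by_t, the column names intercept and slope, and the suffixes .s and .d are assumptions made for illustration, and it has not been benchmarked against the mapply version.

```r
library(data.table)

set.seed(2)
supply <- data.table(t = c(rep(1, 10), rep(2, 8)),
                     p = c(cumsum(runif(10)), cumsum(runif(8))),
                     q = c(cumsum(runif(10)), cumsum(runif(8))))
demand <- data.table(t = c(rep(1, 8), rep(2, 9)),
                     p = c(cumsum(runif(8)), cumsum(runif(9))),
                     q = c(6 - cumsum(runif(8)), 6 - cumsum(runif(9))))

# hypothetical helper: one row of fitted coefficients per value of t
fit_by_t <- function(dt) {
  dt[, as.list(setNames(coef(lm(p ~ q, data = .SD)),
                        c("intercept", "slope"))),
     by = "t"]
}

# inner join on t, so keys missing from either table drop out
coefs <- merge(fit_by_t(supply), fit_by_t(demand),
               by = "t", suffixes = c(".s", ".d"))

# crossing point of the two fitted lines, computed for all t at once
coefs[, x := (intercept.s - intercept.d) / (slope.d - slope.s)]
coefs[, y := intercept.s + slope.s * x]
```

The result is one row per t with the columns x and y holding the estimated intersection.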

Answer 2

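There are a few unknowns in your question. In the simplest case, if you can hold supply and demand in a single data.frame (i.e. supply and demand are aligned with respect to t), you need a function myFun that takes a slice of this data.frame as its argument and returns an object for which a lines method is defined. In that case you can simply do: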

lapply(split(data, data$t), function(subset) lines(myFun(subset)))

Your data may not look like that, though. In that case, the following approach should work for you:

# split supply and demand into lists with values of t being list keys
# these splits are independent and are not aligned with respect to t
supply = split(supply[, 2:3], supply$t)
demand = split(demand[, 2:3], demand$t)

# get a merged set of all keys
keys = unique(c(names(supply), names(demand)))
# alternatively consider an intersect, 
# then you do not need to check if both lists have key, but then you just do not know what is left out
# keys = intersect(names(supply), names(demand))

keys = sort(keys)

# produce an empty plot box spanning the expected total range of the data
# (xmin, xmax, ymin, ymax are placeholders: set them to your data range first)
plot(c(xmin, xmax), c(ymin, ymax), type="n")

for (key in keys) {
    s = supply[[key]]
    d = demand[[key]]
    # if both supply and demand have current key t
    # you do not need this check if you used intersect
    if (!is.null(s) && !is.null(d)) {
        # assuming myFun takes two arguments and returns a list with names x, y
        data = myFun(s, d)
        lines(data$x, data$y)
    }
}

If you want a data structure containing the (x, y) pairs for all t, use:

sapply(keys, function(key) {
    s = supply[[key]]
    d = demand[[key]]
    data = myFun(s, d)
    c(data$x, data$y)
})

This should return a matrix with two rows (x and y) and one column per key. Assuming you store the result in res, you can then assign names with colnames(res) = keys.
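
As a sketch of the shape this produces, here is the same loop written with vapply, which additionally checks that every call returns exactly two numbers. The two-argument myFun below is a hypothetical stand-in (a line intersection returning a list with x and y), the names supply_df, demand_df and res are illustrative, and intersect is used so that every key exists in both lists:

```r
set.seed(2)
supply_df <- data.frame(t = c(rep(1, 10), rep(2, 8)),
                        p = c(cumsum(runif(10)), cumsum(runif(8))),
                        q = c(cumsum(runif(10)), cumsum(runif(8))))
demand_df <- data.frame(t = c(rep(1, 8), rep(2, 9)),
                        p = c(cumsum(runif(8)), cumsum(runif(9))),
                        q = c(6 - cumsum(runif(8)), 6 - cumsum(runif(9))))

supply <- split(supply_df[, 2:3], supply_df$t)
demand <- split(demand_df[, 2:3], demand_df$t)

# hypothetical myFun: crossing point of two fitted lines, as a list(x, y)
myFun <- function(s, d) {
  cs <- coef(lm(p ~ q, data = s))
  cd <- coef(lm(p ~ q, data = d))
  x <- unname((cs[1] - cd[1]) / (cd[2] - cs[2]))
  list(x = x, y = unname(cs[1] + cs[2] * x))
}

keys <- sort(intersect(names(supply), names(demand)))

# one column per key; vapply stops with an error if a result is not length 2
res <- vapply(keys, function(key) {
  out <- myFun(supply[[key]], demand[[key]])
  c(x = out$x, y = out$y)
}, FUN.VALUE = c(x = 0, y = 0))
```

Because keys is a character vector, the columns of res are already named after the keys.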

In the simple case I mentioned above, returning such a structure would be simpler still:

sapply(split(data, data$t), function(subset) myFun(subset))
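
One way the single-data.frame case could be set up, as a sketch: stack the two objects with a marker column and let myFun pull the two sides apart again inside each t slice. The column name side and the line-intersection body of myFun are assumptions for illustration, not something given in the question.

```r
set.seed(2)
supply <- data.frame(t = c(rep(1, 10), rep(2, 8)),
                     p = c(cumsum(runif(10)), cumsum(runif(8))),
                     q = c(cumsum(runif(10)), cumsum(runif(8))))
demand <- data.frame(t = c(rep(1, 8), rep(2, 9)),
                     p = c(cumsum(runif(8)), cumsum(runif(9))),
                     q = c(6 - cumsum(runif(8)), 6 - cumsum(runif(9))))

# stack both objects; 'side' (an assumed column name) records where a row came from,
# so differing lengths per t are no obstacle
data <- rbind(cbind(supply, side = "supply"),
              cbind(demand, side = "demand"))

# hypothetical single-argument myFun: separate the two sides of one t slice,
# fit a line to each, and return the crossing point
myFun <- function(subset) {
  cs <- coef(lm(p ~ q, data = subset[subset$side == "supply", ]))
  cd <- coef(lm(p ~ q, data = subset[subset$side == "demand", ]))
  x <- unname((cs[1] - cd[1]) / (cd[2] - cs[2]))
  c(x = x, y = unname(cs[1] + cs[2] * x))
}

res <- sapply(split(data, data$t), myFun)
```

Here res is a matrix with rows x and y and one column per value of t.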

Fast way to split, apply, and combine two objects in R
