which.min within reshape2's dcast()?

Asked: 2013-03-17 18:30:56

Tags: r plyr reshape reshape2

I want to extract the value of var2 that corresponds to the minimum of var1 within each building-month combination. Here is my (fake) data set:

 head(mydata)

 #  building month      var1     var2
 #1        A     1 -26.96333 376.9633
 #2        A     1 165.38759 317.3993
 #3        A     1  47.46345 271.0137
 #4        A     2  73.47784 294.8171
 #5        A     2 107.80130 371.7668
 #6        A     2  10.16384 308.7975

Reproducible code:

## create fake data set:
set.seed(142)
mydata1 = data.frame(building = rep(LETTERS[1:5],6),month = sort(rep(1:6,5)),var1=rnorm(30,50,35),var2 = runif(30,200,400))
mydata2 = data.frame(building = rep(LETTERS[1:5],6),month = sort(rep(1:6,5)),var1=rnorm(30,60,35),var2 = runif(30,150,400))
mydata3 = data.frame(building = rep(LETTERS[1:5],6),month = sort(rep(1:6,5)),var1=rnorm(30,40,35),var2 = runif(30,250,400))
mydata = rbind(mydata1,mydata2,mydata3)
mydata = mydata[ order(mydata[,"building"], mydata[,"month"]), ]
row.names(mydata) = 1:nrow(mydata)

## here is how I pull the minimum value of v1 for each building-month combination:
require(reshape2)
require(plyr)  # provides .(), used in the subset argument below
m1 = melt(mydata, id.vars=1:2)
d1 = dcast(m1, building ~ month, function(x) min(max(x,0), na.rm=TRUE),
           subset = .(variable == "var1"))

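This is meant to extract the minimum of var1 for each building-month combination, although the values returned look like the group maxima (compare building A, month 1 with head(mydata) above):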

head(d1)

#  building         1         2        3         4         5         6
#1        A 165.38759 107.80130 93.32816  73.23279  98.55546 107.58780
#2        B  92.08704  98.94959 57.79610  94.10530  80.86883  99.75983
#3        C  93.38284 100.13564 52.26178  62.37837  91.98839  97.44797
#4        D  82.43440  72.43868 66.83636 105.46263 133.02281  94.56457
#5        E  70.09756  61.44406 30.78444  68.24334  94.35605  61.60610
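A quick check on the aggregation function may help explain the numbers in d1. The sketch below uses only the three var1 values shown for building A, month 1 in head(mydata); the variable names are made up for illustration. Because max() collapses its arguments to a single number, wrapping it in min() has nothing left to minimise.

```r
# var1 values for building A, month 1, copied from head(mydata)
x <- c(-26.96333, 165.38759, 47.46345)

# max(x, 0) is one number: the largest of all x values and 0.
# min() of a single number is that number, so this is the maximum floored at 0.
floored_max <- min(max(x, 0), na.rm = TRUE)

# The plain group minimum, for comparison
plain_min <- min(x, na.rm = TRUE)

print(floored_max)
print(plain_min)
```

If a minimum floored at zero was the goal, max(min(x), 0) would be one way to write it, though that is a guess at the intent.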

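What I want, however, is a data frame laid out exactly like d1 that instead shows the value of var2 corresponding to the minimum of var1 (the quantity d1 above is meant to hold). My intuition says it should be some variation of which.min(), but I have not been able to get it to work with dcast() or ddply(). Any help is appreciated!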
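For a single group, the idea can be sketched as follows. The values are the three rows for building A, month 2 from head(mydata); the variable names idx and picked are illustrative only. which.min() returns the position of the smallest element, and that position can then index the other column.

```r
# Rows for building A, month 2, copied from head(mydata)
var1 <- c(73.47784, 107.80130, 10.16384)
var2 <- c(294.8171, 371.7668, 308.7975)

idx    <- which.min(var1)  # position of the smallest var1 (first one if tied)
picked <- var2[idx]        # var2 on that same row

print(idx)
print(picked)
```

The remaining question is how to apply this per building-month group and spread the result into the wide layout.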

1 Answer:

Answer 0: (score: 3)

This can probably be done in a single step, but I'm more familiar with reshape2 than with plyr:

library(plyr)      # ddply(), summarize, .()
library(reshape2)  # dcast()
dcast(ddply(mydata, .(building, month), summarize, value = var2[which.min(var1)]),
      building ~ month, value.var = "value")
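For comparison, the same selection can be sketched in base R without plyr or reshape2. This is an alternative technique, not what the answer above uses, and the toy data frame holds made-up values rather than the question's mydata. tapply() finds, for each building-month cell, the row index where var1 is smallest; those indices then look up var2.

```r
# Toy data (hypothetical values): two buildings, two months, two rows per cell
toy <- data.frame(
  building = c("A", "A", "A", "A", "B", "B", "B", "B"),
  month    = c(1, 1, 2, 2, 1, 1, 2, 2),
  var1     = c(5, -3, 10, 20, 7, 8, 2, 1),
  var2     = c(300, 310, 320, 330, 340, 350, 360, 370)
)

# For every building x month cell, the row number holding the smallest var1
rows <- tapply(seq_len(nrow(toy)),
               list(toy$building, toy$month),
               function(i) i[which.min(toy$var1[i])])

# Same shape as 'rows', but filled with the var2 value found on those rows
wide <- rows
wide[] <- toy$var2[rows]

print(wide)
```

The result is a matrix with buildings as row names and months as column names rather than a data frame, so it would need as.data.frame() or similar to match the layout of d1 exactly.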