R: dcast error when using subset - differing number of rows

Time: 2011-03-16 17:31:39

Tags: r subset

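When I use dcast on a subset of the data (for example via the subset argument), I get the following error when the rows of the cast of the subsetted data frame do not match those of the cast of the original data frame.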

Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 2, 3

I reproduced the error using the mtcars dataset. The code to reproduce it is below.

 library(reshape2)

 # dataframe
 mtcars2 <- mtcars[, c('vs','am','gear','carb')]
 mtcars2$cars <- row.names(mtcars)
 row.names(mtcars2) <- NULL
 mtcars2$dummyvariable <- 1

 mtcars2.melt <- melt(mtcars2, id=c('cars','vs','am','gear','carb'))

 colnames(mtcars2.melt)
 # [1] "cars"     "vs"       "am"       "gear"     "carb"     "variable" "value"   

 dcast(mtcars2.melt, vs ~ am, drop=FALSE, margins=TRUE)
 # Aggregation function missing: defaulting to length
 #     vs  0  1 (all)
 # 1     0 12  6    18
 # 2     1  7  7    14
 # 3 (all) 19 13    32

 cadillac <- subset(mtcars2.melt, regexpr('Cadillac',cars)>0)
 dcast(cadillac, vs ~ am, drop=FALSE, margins=TRUE)
 # Error in data.frame(..., check.names = FALSE) : 
 #  arguments imply differing number of rows: 2, 3

 dcast(cadillac, vs ~ am, margins=TRUE)
 #      vs 0 (all)
 # 1     0 1     1
 # 2 (all) 1     1

The last dcast shows that leaving out drop = FALSE avoids the error, but the output I want is

       vs 0 1 (all)
 1      0 1 0     1
 2      1 0 0     0
 3  (all) 1 0     1

Any help would be great! :)

Thanks

1 answer:

Answer 0: (score: 0)

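Interesting question! I tried this in the past but could not solve it. Basically, I was trying to use dcast to export a series of data frames (to csv) that all have the same dimensions, no matter how they were subsetted. That would let me "join" them cleanly in Excel or PowerPoint.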

After running the edited code above, trying the new dcast call still gives an error.

> dcast(mtcars2.melt, vs ~ am, drop=FALSE, margins=TRUE, subset=.(regexpr('Cadillac',cars)>0))
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 2, 3

Here is my session info:

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] plyr_1.4     reshape2_1.1

loaded via a namespace (and not attached):
[1] stringr_0.4  tools_2.12.2

----

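The error occurs when drop = FALSE and margins = TRUE are used together. The specific cause appears to be the call cbind(res$labels[[1]], data) inside dcast. Adding some print statements to dcast shows what is happening: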

print("printing data")
print(data)
print("printing res$labels[[1]]")
print(res$labels[[1]])
print("trying cbind(res$labels[[1]], data)")


[1] "printing data"
   0 (all) NA
1  1    NA  1
2 NA    NA NA
3  1    NA  1
[1] "printing res$labels[[1]]"
     vs
1     0
2 (all)
[1] "trying cbind(res$labels[[1]], data)"
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 2, 3
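
The printed output shows that the cast data has three rows while the row labels have only two, so cbind cannot combine them. One way to sidestep this, rather than fix dcast itself, is to build the count table in base R. The following is a minimal sketch, not something tested against reshape2 1.1: it assumes that counts are all that is needed, and it fixes the factor levels of vs and am to c(0, 1) by hand (an assumption about which levels should always appear). Margins come out labelled "Sum" rather than "(all)".

```r
# Rebuild the question's data before melting; 'cars' holds the model names.
mtcars2 <- mtcars[, c("vs", "am", "gear", "carb")]
mtcars2$cars <- row.names(mtcars)

# Fix the factor levels on the full data so that every subset keeps
# every level, even levels with no observations in that subset.
mtcars2$vs <- factor(mtcars2$vs, levels = c(0, 1))
mtcars2$am <- factor(mtcars2$am, levels = c(0, 1))

# Same subset as in the question, written with grepl().
cadillac <- mtcars2[grepl("Cadillac", mtcars2$cars), ]

# table() counts over all factor levels, including empty ones;
# addmargins() appends row and column totals.
counts <- addmargins(table(vs = cadillac$vs, am = cadillac$am))
print(counts)

# Convert to a plain data frame with fixed dimensions, e.g. for write.csv().
counts.df <- as.data.frame.matrix(counts)
```

Because the levels are set on the full data rather than on the subset, counts should always be a 3 x 3 table (two levels plus the margin in each direction), which is what makes the exported files line up across subsets.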