How to compute every combination of effects in R, one model per iteration

Asked: 2012-04-16 23:28:27

Tags: r

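I am using loglm(count~A+B+C+D+E, data=whatever)

My problem is that I want to fit every possible combination of all the effects. That is: A, then A + A:B, then A + C + C:B + A:B:C:D:E, and so on, (seemingly) without end.

Any suggestions?

EDIT: the data looks like this: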

df <- structure(list(count = c(0L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L),  
A = c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,  
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), B = c(1L, 1L, 1L, 1L, 1L,  
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,  
2L, 2L, 2L, 2L, 2L, 2L), C = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,  
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L),  
D = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L,  
1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L), E = c(1L, 1L, 2L, 2L, 1L,  
1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L,  
2L, 1L, 1L, 2L, 2L, 1L)), .Names = c("count", "A", "B", "C", "D", "E"),  
class = "data.frame", row.names = c(NA, -29L))

The problem I run into is:

> data(SampleData)
Warning message:
In data(SampleData) : data set ‘SampleData’ not found
> fm1 <- loglm(count ~ ., data = SampleData)
> dd <- dredge(fm1)
Error in rownames(ct)[match(names(coef1), rownames(ct))] <- fxdCoefNames : 
  NAs are not allowed in subscripted assignments
In addition: Warning messages:
1: In table(fac) : attempt to set an attribute on NULL (model 1 skipped)
2: In data[do.call("cbind", lapply(fac, as.numeric))] <- rsp :
  number of items to replace is not a multiple of replacement length
3: In st[do.call("cbind", lapply(fac, as.numeric))] <- exp(offset) :
  number of items to replace is not a multiple of replacement length
4: In double(nmar) : vector size cannot be NA/NaN (model 2 skipped)
5: In data[do.call("cbind", lapply(fac, as.numeric))] <- rsp :
  number of items to replace is not a multiple of replacement length
6: In st[do.call("cbind", lapply(fac, as.numeric))] <- exp(offset) :
  number of items to replace is not a multiple of replacement length
7: In double(nmar) : vector size cannot be NA/NaN (model 3 skipped)
> subset(dd, delta < 4)
Error in subset(dd, delta < 4) : object 'dd' not found

2 answers:

Answer 0: (score: 1)

I believe this will get you what you want:

install.packages('MuMIn', dependencies = TRUE)
library(MuMIn)    

Example from Burnham and Anderson (2002), page 100 (taken from ?dredge):

data(Cement)
fm1 <- lm(y ~ ., data = Cement)
dd <- dredge(fm1)
subset(dd, delta < 4)

You just need to replace lm(y ~ with loglm(count ~ and remove any non-explanatory variables from the data.
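For readers who want to see what an exhaustive search over main effects amounts to, here is a minimal base-R sketch. It is not what dredge does internally, and it swaps loglm for a Poisson glm (a common way to fit log-linear models to counts). The `toy` data frame and every name in it are hypothetical and stand in for the asker's data.

```r
# Hypothetical complete 2x2x2x2x2 table of counts (NOT the asker's data).
set.seed(1)
toy <- expand.grid(A = factor(1:2), B = factor(1:2), C = factor(1:2),
                   D = factor(1:2), E = factor(1:2))
toy$count <- rpois(nrow(toy), lambda = 3)

main_terms <- c("A", "B", "C", "D", "E")

# Every non-empty subset of the main effects: 2^5 - 1 = 31 of them.
subsets <- unlist(
  lapply(seq_along(main_terms),
         function(k) combn(main_terms, k, simplify = FALSE)),
  recursive = FALSE
)

# One right-hand side per subset, plus the intercept-only model.
rhs <- c("1", vapply(subsets, paste, character(1), collapse = " + "))

# Fit each candidate as a Poisson GLM (assumption: used here instead of loglm).
fits <- lapply(rhs, function(r) {
  glm(as.formula(paste("count ~", r)), family = poisson, data = toy)
})

# Rank the candidates by AIC and compute the difference from the best model.
ranking <- data.frame(model = paste("count ~", rhs),
                      AIC = vapply(fits, AIC, numeric(1)))
ranking <- ranking[order(ranking$AIC), ]
ranking$delta <- ranking$AIC - min(ranking$AIC)

# Same filter as in the dredge example: models within 4 AIC units of the best.
subset(ranking, delta < 4)
```

With only main effects the candidate set stays at 32 models. Once interactions are allowed the set grows much faster, which is why a dedicated tool or a narrower question is usually preferable.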

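Answer 1: (score: 1)

As I often say, "What is the problem you are trying to solve?" Presumably you don't actually need all 2^N results, so what are you looking for? Perhaps you want some kind of sieve to zero in on the effects with the strongest, err... effect on the outcome.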
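One possible reading of such a "sieve" is stepwise selection by AIC, sketched below with step() from the stats package. This is an assumption about what the answer has in mind, not something it states. The simulated `toy` data, the Poisson glm, and the choice of two-way interactions as the upper scope are all hypothetical.

```r
# Hypothetical counts in which only A and the A:B interaction matter.
set.seed(2)
toy <- expand.grid(A = factor(1:2), B = factor(1:2), C = factor(1:2),
                   D = factor(1:2), E = factor(1:2))
mu <- exp(1 + 0.8 * (toy$A == "2") + 0.9 * (toy$A == "2" & toy$B == "2"))
toy$count <- rpois(nrow(toy), lambda = mu)

# Start from the intercept-only model.
null_fit <- glm(count ~ 1, family = poisson, data = toy)

# Add or drop one term at a time, keeping a change only if it lowers AIC.
# The upper scope (all two-way interactions) is an arbitrary choice here.
sieve <- step(null_fit,
              scope = list(lower = ~ 1, upper = ~ (A + B + C + D + E)^2),
              direction = "both", trace = 0)

# The terms that survived the search.
formula(sieve)
```

Stepwise search visits only a small part of the model space, so it can miss the model that an exhaustive search would rank first. That is the trade-off for not fitting all 2^N candidates.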

By the way, you might want to play with Eureqa, a package available at http://creativemachines.cornell.edu/eureqa