Using expand.grid in R to create all combinations of x factors in sets of y

Time: 2016-06-30 13:32:25

Tags: r dataframe combinatorics

Is it possible to use expand.grid() in R to create all possible combinations of x factors taken in sets of y?

For example, I have 12 factors:

Factor1 = c("1", "2", "3", "4"),       #Fixed Attribute: 4 lvls
Factor2 = c("5", "6", "7", "8", "9"),  #Fixed Attribute: 5 lvls
Factor3 = c("10", "11", "12","13"),    #Fixed Attribute: 4 lvls
Factor4 = c("14", "15", "16"),         #Fixed Attribute: 3 lvls
Factor5 = c("17", "18", "19", "20", "21"),  #Variable Attribute: 5 lvls
Factor6 = c("22", "23"),                    #Variable Attribute: 2 lvls
Factor7 = c("24", "25", "26"),              #Variable Attribute: 3 lvls
Factor8 = c("27", "28", "29"),              #Variable Attribute: 3 lvls
Factor9 = c("30", "31", "32", "33"),        #Variable Attribute: 4 lvls
Factor10= c("34", "35"),                    #Variable Attribute: 2 lvls
Factor11 = c("36", "37", "38"),             #Variable Attribute: 3 lvls
Factor12 = c("39", "40", "41")              #Variable Attribute: 3 lvls

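I want to always include the first 4 in expand.grid() (i.e., they are fixed) and loop through every possible set of 4 drawn from the last 8, which gives 70 unique sets. I then want to append all 70 resulting data frames.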

I could brute-force this by writing 70 separate expand.grid() calls, but is there a more elegant way to do it?

For example, the brute-force approach would look like this:

expand.grid(Factor1, Factor2,Factor3,Factor4,Factor5,Factor6,Factor7,Factor8)
expand.grid(Factor1, Factor2,Factor3,Factor4,Factor5,Factor6,Factor7,Factor9)
expand.grid(Factor1, Factor2,Factor3,Factor4,Factor5,Factor6,Factor7,Factor10)
expand.grid(Factor1, Factor2,Factor3,Factor4,Factor5,Factor6,Factor7,Factor11)
expand.grid(Factor1, Factor2,Factor3,Factor4,Factor5,Factor6,Factor7,Factor12)
....etc...

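So I would end up with 70 different data frames, because there are 70 unique ways to choose 4 factors from Factor5 through Factor12 (i.e., 70 ways to choose 4 items from a list of 8).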

Also, the data frame I end up with could have about 1.5 million rows. Will that cause memory problems?
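A rough size check can be done before building anything. The sketch below only counts rows from the level counts listed above; `n_fixed` and `n_variable` are names introduced here for illustration, and the memory figure assumes expand.grid()'s default factor columns at 4 bytes per cell, ignoring any overhead.

```r
# Number of levels per factor, read off the definitions in the question
n_fixed    <- c(4, 5, 4, 3)              # Factor1..Factor4 (always included)
n_variable <- c(5, 2, 3, 3, 4, 2, 3, 3)  # Factor5..Factor12 (choose 4 of 8)

# Number of ways to pick 4 of the 8 variable factors
n_sets <- choose(length(n_variable), 4)

# Rows in each grid = product of the level counts of the 8 factors in it
rows_per_set <- prod(n_fixed) * combn(n_variable, 4, FUN = prod)
total_rows   <- sum(rows_per_set)

# Assumption: 8 factor columns per grid, stored as 4-byte integer codes
approx_mb <- total_rows * 8 * 4 / 1024^2

n_sets      # 70
total_rows  # 1487280
approx_mb   # roughly 45
```

Under those assumptions the 70 grids together hold just under 1.5 million rows and take on the order of tens of megabytes, which is small for most current machines. Character columns or extra copies made while combining would increase that.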

Thanks,

1 answer:

Answer 0 (score: 1)

If I understand you correctly, this should do what you want:

l <- list(
    Factor1 = c("1", "2", "3", "4"),       #Fixed Attribute: 4 lvls
    Factor2 = c("5", "6", "7", "8", "9"),  #Fixed Attribute: 5 lvls
    Factor3 = c("10", "11", "12","13"),    #Fixed Attribute: 4 lvls
    Factor4 = c("14", "15", "16"),         #Fixed Attribute: 3 lvls
    Factor5 = c("17", "18", "19", "20", "21"),  #Variable Attribute: 5 lvls
    Factor6 = c("22", "23"),                    #Variable Attribute: 2 lvls
    Factor7 = c("24", "25", "26"),              #Variable Attribute: 3 lvls
    Factor8 = c("27", "28", "29"),              #Variable Attribute: 3 lvls
    Factor9 = c("30", "31", "32", "33"),        #Variable Attribute: 4 lvls
    Factor10= c("34", "35"),                    #Variable Attribute: 2 lvls
    Factor11 = c("36", "37", "38"),             #Variable Attribute: 3 lvls
    Factor12 = c("39", "40", "41")              #Variable Attribute: 3 lvls
)



# Get the names of the other 8
others <- names(l)[-c(1:4)]
# Get names of the 4 fixed ones
fixed <- names(l)[1:4]

# Get all combinations of 4 of names of the others
combos <- combn(others, 4)

# Get the list of 70 expand grid outputs of combinations (fixed, combo_of_4)
out <- apply(combos, 2, function(x) expand.grid(l[c(fixed,x)]))
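The code above returns a list of 70 data frames, and they do not share the same columns, so they cannot be appended directly. One possible way to stack them, sketched here in base R, is to pad each grid with NA for the factors it does not use and add an identifier for the set. The helper `stack_grids()`, its arguments, the `set_id` column, and the small `toy` list are all hypothetical names used only for this illustration; the same call pattern should apply to the 12-factor list with `n_fixed = 4, k = 4`.

```r
# Hypothetical helper: build one grid per combination, pad unused factors
# with NA so every grid has the same columns, then stack them with rbind.
stack_grids <- function(l, n_fixed = 4, k = 4) {
  fixed  <- names(l)[seq_len(n_fixed)]
  others <- names(l)[-seq_len(n_fixed)]
  combos <- combn(others, k, simplify = FALSE)  # list of character vectors

  grids <- lapply(seq_along(combos), function(i) {
    g <- expand.grid(l[c(fixed, combos[[i]])],
                     stringsAsFactors = FALSE,   # keep plain character columns
                     KEEP.OUT.ATTRS = FALSE)     # drop the extra "out.attrs"
    g[setdiff(others, combos[[i]])] <- NA_character_  # factors not in this set
    g$set_id <- i                                     # which combination this is
    g[c("set_id", names(l))]                          # consistent column order
  })

  do.call(rbind, grids)
}

# Small made-up example: 1 fixed factor, choose 2 of the remaining 3
toy <- list(
  A = c("a1", "a2"),
  B = c("b1", "b2", "b3"),
  C = c("c1", "c2"),
  D = c("d1", "d2", "d3")
)
stacked <- stack_grids(toy, n_fixed = 1, k = 2)
dim(stacked)  # 42 rows, 5 columns
```

Padding with NA keeps the original factor names as columns, at the cost of four mostly empty columns per row. If a narrower table matters more than the names, renaming the chosen columns to generic positions before stacking would be an alternative.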