我需要从共同的30个项目中选择10个不同的15个项目组合,并有几个约束条件。
第一个约束:对于10个组合中的每一个,我需要删除在该特定组合中不应存在的三个项目。让我们说第一个组合的wf01,wf02和wf03;第二个组合的wf04,wf05和wf06等等。
第二个约束:在十个组合中,每个项目应该存在相同的次数(5)。
第三个限制:十个组合应尽可能彼此不同(即他们应尽量避免共享共同项目)。
我目前正在Excel中这样做,但我想在R中应该有一种更有效的方法!
到目前为止,这是我的编码:
# Data: 30 items (pictures to choose from)
PictureNames <- c("wf01", "wf02", "wf03", "wf04", "wf05", "wf06", "wf07", "wf08", "wf09", "wf10", "wf11", "wf12", "wf13", "wf14", "wf15", "wf16", "wf17", "wf18", "wf19", "wf20", "wf21", "wf22", "wf23", "wf24", "wf25", "wf26", "wf27", "wf28", "wf29", "wf30")
set.seed(10)
sample01 <- sample(PictureNames[4:30], 15)
sample02 <- sample(PictureNames[c(1:3, 7:30)], 15)
sample03 <- sample(PictureNames[c(1:6, 10:30)], 15)
sample04 <- sample(PictureNames[c(1:9, 13:30)], 15)
sample05 <- sample(PictureNames[c(1:12, 16:30)], 15)
sample06 <- sample(PictureNames[c(1:15, 19:30)], 15)
sample07 <- sample(PictureNames[c(1:18, 22:30)], 15)
sample08 <- sample(PictureNames[c(1:21, 25:30)], 15)
sample09 <- sample(PictureNames[c(1:24, 28:30)], 15)
sample10 <- sample(PictureNames[c(1:27)], 15)
到目前为止,我在Excel中手动实现了约束(我知道......)。首先,我确保每个项目在10种不同组合中仅出现5次。然后,我创建了一个矩阵,其中图片的名称为行和列。每个细胞计算在10种组合中相同组合中存在一对图片的次数。值范围从0(图片从不在一起)到5(图片总是在一起)。然后,通过在组合之间手动移动图片,我设法避免在此矩阵中存在0和5。之后,我实现了一个优化任务,其中每个图片应该与尽可能多的不同图片相关联,同时尊重先前的约束。为此,我为每张图片计算其列(或行)值的标准偏差。然后,我计算了这些标准偏差的标准偏差。在保持尊重其他约束的同时减少该值的任何移位。我设法达到0.036的标准偏差,但我不知道这是否是我能得到的最好结果。任何帮助将不胜感激!!!
提前致谢!
在@ roman-luštrik的评论之后,这是我的新版本的代码。仍然可怕,并没有从优化矩阵那么近,但慢慢到达那里!
# Data: 30 items (pictures to choose from)
PictureNames <- data.frame(Names = c("wf01", "wf02", "wf03", "wf04", "wf05",
"wf06", "wf07", "wf08", "wf09", "wf10", "wf11", "wf12", "wf13",
"wf14", "wf15", "wf16", "wf17", "wf18", "wf19", "wf20", "wf21",
"wf22", "wf23", "wf24", "wf25", "wf26", "wf27", "wf28", "wf29",
"wf30"), stringsAsFactors = FALSE)
library(dplyr)
set.seed(10)
sample01 <- data.frame(Names = sort(sample(PictureNames$Names[c(4:30)], 15)))
sample02 <- anti_join(PictureNames, sample01)
sample03 <- data.frame(Names = sort(sample(PictureNames$Names[c(1:6, 10:30)], 15)))
sample04 <- anti_join(PictureNames, sample03)
sample05 <- data.frame(Names = sort(sample(PictureNames$Names[c(1:12, 16:30)], 15)))
sample06 <- anti_join(PictureNames, sample05)
sample07 <- data.frame(Names = sort(sample(PictureNames$Names[c(1:18, 22:30)], 15)))
sample08 <- anti_join(PictureNames, sample07)
sample09 <- data.frame(Names = sort(sample(PictureNames$Names[c(1:24, 28:30)], 15)))
sample10 <- anti_join(PictureNames, sample09)
AllSamples <- data.frame(sample01, sample02, sample03, sample04, sample05,
sample06, sample07, sample08, sample09, sample10)
colnames(AllSamples) <- c("sA", "sB", "sC", "sD", "sE", "sF", "sG", "sH", "sI", "sJ")
CoOccurMat <- matrix(nrow = length(PictureNames$Names), ncol = length(PictureNames$Names),
dimnames = c(PictureNames, PictureNames))
for(j in 1:length(PictureNames$Names)){
for(i in 1:length(PictureNames$Names)){
CoOccurMat[i,j] <- sum(ifelse(sapply(PictureNames,
function(x)colSums(xor(AllSamples == PictureNames$Names[j],
AllSamples == PictureNames$Names[i]))) == 2, 1, 0))
ColSd <- apply(CoOccurMat, 2, sd)
OverallSd <- sd(ColSd)
}
}
CoOccurMat
ColSd
OverallSd