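I would like to extend the scope of "compare multiple vectors of different lengths, count the elements that are the same, and print out those that are the same and different". I want to write a loop that runs a pairwise comparison for every possible pair of my 10 vectors and reports the elements each pair has in common. The main comparison part of the pseudocode below works and comes from the previous post, but it only compares A and B. I would also like to compare A with C, A with D, B with C, and so on.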
vectors to be compared: A, B, C, D, E, F, G, H, I, J
set global variable for first vector to be compared
set global variable for second vector to be compared
# vectors -- these are subsets of my real vectors, which are more like 50 - 200 elements long
A <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76",
"2776_77", "3049_79", "3084_15", "3995_78", "4066_33", "4431_15")
B <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76",
"2776_77",
"3049_79", "3084_15", "3995_78")
C <- c("866_78", "1137_78", "1910_79", "1972_76", "2776_77",
"3049_79", "3084_14", "3995_78", "4066_36", "4431_19", "4885_78")
D <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76",
"2773_77",
"3049_79", "3084_12", "3995_78", "4066_36", "4431_19", "4885_78")
E <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76",
"2776_77",
"3049_79", "3084_17", "4431_19", "4885_78")
F <- c("868_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76",
"2776_77",
"3049_79", "3084_15", "3995_78", "4066_36", "4431_19", "4885_78")
G <- c("866_78", "1837_78", "1721_78", "1972_76", "2776_77",
"3049_79", "3084_15", "3995_78", "4066_36", "4431_19", "4885_78")
H <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76",
"2776_77",
"3049_79", "3084_15", "3995_78", "4066_36", "4431_19", "4885_78")
I <- c("866_78", "1137_28", "1721_78", "1745_79", "1910_79", "1972_76",
"2776_77",
"3995_78", "4066_36", "4431_19", "4885_78")
J <- c("866_78", "1137_78", "1721_78", "1745_79", "1910_79", "1972_76",
"2776_77",
"3049_79", "3084_18", "3995_78", "4066_36", "4431_19", "4885_78")
for(i ???)
{
compare.SNPs <- function(A, B) {
# consider only unique names
A.u <- unique(A)
B.u <- unique(B)
common.A.B <- intersect(A.u, B.u)
diff.A.B <- setdiff(A.u, B.u)
diff.B.A <- setdiff(B.u, A.u)
uncommon.A.B <- union(diff.A.B, diff.B.A)
cat(paste0("The sets have ", length(common.A.B), " SNPs in common:"))
print(common.A.B)
print(paste0("The sets have ", length(uncommon.A.B), " SNPs not in common:"))
print(paste0("In the first set, but not in the second set:"))
print(diff.A.B)
print(paste0("Not in the first set, but in the second set:"))
print(diff.B.A)
}
compare.SNPs(A,B)
}
Any guidance with example code would be greatly appreciated.
Regards, Ella
Answer 0: (score: 2)
xx<-combn(LETTERS[1:10],2)
for (i in 1:dim(xx)[2]) {
cat(paste0("Comparing ", xx[1,i], " and ", xx[2,i],": "))
compare.SNPs(get(xx[1,i]),get(xx[2,i]))
}
(Also, there is no reason for the function() definition to be inside the for loop.)
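As a variation on the loop above, here is a minimal sketch that avoids get() by keeping the vectors in a named list, and that stores each pairwise intersection instead of only printing it. The names vecs, pairs, and shared are assumptions made for illustration, and the toy vectors are shortened versions of the data in the question.

```r
# Toy data: shortened, hypothetical versions of the vectors in the question
vecs <- list(
  A = c("866_78", "1137_78", "1721_78", "3084_15"),
  B = c("866_78", "1137_78", "1721_78"),
  C = c("866_78", "1137_78", "3084_14")
)

# All unordered pairs of vector names, one character vector of length 2 per pair
pairs <- combn(names(vecs), 2, simplify = FALSE)

# For each pair, keep the unique elements shared by both vectors
shared <- lapply(pairs, function(p) intersect(vecs[[p[1]]], vecs[[p[2]]]))
names(shared) <- vapply(pairs, paste, character(1), collapse = "_vs_")

# Number of shared elements per comparison
print(lengths(shared))
```

Each result can then be looked up by name, for example shared[["A_vs_B"]], and because combn only generates unordered pairs, no vector is compared with itself.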
Answer 1: (score: 0)
First, you should have the function compare.SNPs return something to you rather than just print it. This is done with return(list(common.A.B, diff.A.B, diff.B.A)):
compare.SNPs <- function(A, B) {
# consider only unique names
A.u <- unique(A)
B.u <- unique(B)
common.A.B <- intersect(A.u, B.u)
diff.A.B <- setdiff(A.u, B.u)
diff.B.A <- setdiff(B.u, A.u)
uncommon.A.B <- union(diff.A.B, diff.B.A)
cat(paste0("The sets have ", length(common.A.B), " SNPs in common:"))
print(common.A.B)
print(paste0("The sets have ", length(uncommon.A.B), " SNPs not in common:"))
print(paste0("In the first set, but not in the second set:"))
print(diff.A.B)
print(paste0("Not in the first set, but in the second set:"))
print(diff.B.A)
return(list(common.A.B, diff.A.B, diff.B.A))
}
So now this function returns a list with 3 elements: the elements common to both vectors, followed by the 2 sets of differences.
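As a sketch of a related option, the returned list can carry names, so that callers can write res$common instead of res[[1]]. The function name compare_sets and the element names common, only_first, and only_second are hypothetical, and printing is dropped here to keep the example short.

```r
# Hypothetical variant of compare.SNPs that returns a named list and prints nothing
compare_sets <- function(x, y) {
  # consider only unique names
  x.u <- unique(x)
  y.u <- unique(y)
  list(
    common      = intersect(x.u, y.u),  # in both vectors
    only_first  = setdiff(x.u, y.u),    # in x but not in y
    only_second = setdiff(y.u, x.u)     # in y but not in x
  )
}

# Toy input, shortened from the vectors in the question
res <- compare_sets(c("866_78", "1137_78", "3084_15"),
                    c("866_78", "1137_78", "3084_14"))
res$common       # elements shared by both vectors
res$only_first   # elements found only in the first vector
```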
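mylist <- list(A, B, C, D, E, F, G, H, I, J)
allcomparisons <- list()
# compare every vector with every vector (including itself)
for (i in seq_along(mylist)) {
  for (j in seq_along(mylist)) {
    # c() appends the 3 elements returned for this pair to the flat list
    allcomparisons <- c(allcomparisons, compare.SNPs(mylist[[i]], mylist[[j]]))
  }
}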
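This compares all 10 x 10 ordered pairs of A through J, which gives 300 list elements in total (3 per comparison): first the elements common to A and A, then the two sets of differences between A and A (both empty), then the elements common to A and B, and so on.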
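You can then access the elements of allcomparisons just like those of any other list. For example, element 4 holds the elements common to A and B, which you can check with all(allcomparisons[[4]]==compare.SNPs(A,B)[[1]])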
[1] TRUE
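If only the number of shared elements per pair is needed, one possible summary is a square matrix of counts. This is a sketch on toy data rather than part of the answer above; vecs and n_shared are hypothetical names, and the vectors are shortened versions of those in the question.

```r
# Toy data: shortened, hypothetical versions of the vectors in the question
vecs <- list(
  A = c("866_78", "1137_78", "1721_78", "3084_15"),
  B = c("866_78", "1137_78", "1721_78"),
  C = c("866_78", "1137_78", "3084_14")
)

# n_shared[i, j] = number of unique elements shared by vector i and vector j
n_shared <- sapply(vecs, function(x) {
  sapply(vecs, function(y) length(intersect(x, y)))
})

print(n_shared)
```

The diagonal holds the number of unique elements in each vector, and the matrix is symmetric, so only the upper triangle carries distinct comparisons.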