I want to split a list of DNAStringSets using another list. I found the question "R: split elements of a list into sublists" and built an example for my problem:
library("DECIPHER")
library("Biostrings")
aDNAStringSet <- DNAStringSet(c("GCAATCCATTAC", "AAATCGCCATCC", "GCATACCTTAAC", "GCATACCATTAC", "AGCATACCTTAC", "AGCATACCTTAC", "AGCATACCTTAA", "AGCATACCTAAC","GCAATCCATTAC", "AAATCGCCATCC", "GCATACCTTAAC", "GCATACCATTAC", "AGCATACCTTAC", "AGCATACCTTAC", "AGCATACCTTAA", "AGCATACCTAAC"))
names(aDNAStringSet) <- c("seq1", "seq2", "seq3", "seq4", "seq5", "seq6", "seq7", "seq8", "seq9", "seq10", "seq11", "seq12", "seq13", "seq14", "seq15", "seq16")
aDNAStringSet
This creates a named DNAStringSet that looks like this:
> aDNAStringSet
A DNAStringSet instance of length 16
width seq names
[1] 12 GCAATCCATTAC seq1
[2] 12 AAATCGCCATCC seq2
[3] 12 GCATACCTTAAC seq3
[4] 12 GCATACCATTAC seq4
[5] 12 AGCATACCTTAC seq5
... ... ...
[12] 12 GCATACCATTAC seq12
[13] 12 AGCATACCTTAC seq13
[14] 12 AGCATACCTTAC seq14
[15] 12 AGCATACCTTAA seq15
[16] 12 AGCATACCTAAC seq16
Now I make up an arbitrary grouping to split them by:
group <- c(rep(1,3), rep(2,5), rep(3,4), rep(4,4))
names <- c("seq1", "seq2", "seq3", "seq4", "seq5", "seq6", "seq7", "seq8", "seq9", "seq10", "seq11", "seq12", "seq13", "seq14", "seq15", "seq16")
sort <- data.frame(cbind(group, names))
and split by group:
bygroup <- split(aDNAStringSet, f = sort$group)
which looks like:
> bygroup
DNAStringSetList of length 4
[["1"]] seq1=GCAATCCATTAC seq2=AAATCGCCATCC seq3=GCATACCTTAAC
[["2"]] seq4=GCATACCATTAC seq5=AGCATACCTTAC seq6=AGCATACCTTAC seq7=AGCATACCTTAA seq8=AGCATACCTAAC
[["3"]] seq9=GCAATCCATTAC seq10=AAATCGCCATCC seq11=GCATACCTTAAC seq12=GCATACCATTAC
[["4"]] seq13=AGCATACCTTAC seq14=AGCATACCTTAC seq15=AGCATACCTTAA seq16=AGCATACCTAAC
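As a side note, in base R split() only needs a grouping vector with one entry per element, so the data frame is not strictly required there. Below is a minimal sketch with a plain character vector standing in for the DNAStringSet; the names seqs_demo, group_demo and bygroup_demo are made up for illustration, and whether the DNAStringSet method behaves identically is an assumption.

```r
# Hypothetical stand-in: 16 named strings instead of the DNAStringSet
seqs_demo <- setNames(rep("GCAATCCATTAC", 16), paste0("seq", 1:16))

# One group label per element, same group sizes as in the example above
group_demo <- rep(1:4, times = c(3, 5, 4, 4))

# The grouping vector must be as long as the object being split
stopifnot(length(group_demo) == length(seqs_demo))

bygroup_demo <- split(seqs_demo, group_demo)
lengths(bygroup_demo)
```

The length check matters later: a grouping vector of length 0 cannot be matched against a non-empty object.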
Now I run AdjustAlignment on the sequences in each group:
Adjusted <- lapply(bygroup, FUN=AdjustAlignment,processors = NULL)
which looks like:
> Adjusted
$`1`
A DNAStringSet instance of length 3
width seq names
[1] 12 GCAATCCATTAC seq1
[2] 12 AAATCGCCATCC seq2
[3] 12 GCATACCTTAAC seq3
$`2`
A DNAStringSet instance of length 5
width seq names
[1] 12 GCATACCATTAC seq4
[2] 12 AGCATACCTTAC seq5
[3] 12 AGCATACCTTAC seq6
[4] 12 AGCATACCTTAA seq7
[5] 12 AGCATACCTAAC seq8
$`3`
A DNAStringSet instance of length 4
width seq names
[1] 12 GCAATCCATTAC seq9
[2] 12 AAATCGCCATCC seq10
[3] 12 GCATACCTTAAC seq11
[4] 12 GCATACCATTAC seq12
$`4`
A DNAStringSet instance of length 4
width seq names
[1] 12 AGCATACCTTAC seq13
[2] 12 AGCATACCTTAC seq14
[3] 12 AGCATACCTTAA seq15
[4] 12 AGCATACCTAAC seq16
Next come DistanceMatrix and IdClusters, to define new clusters for further splitting:
D <- lapply(Adjusted, FUN=DistanceMatrix,processors = NULL)
Clust <- lapply(D, FUN=IdClusters, method="NJ",cutoff=c(0.15), showPlot=TRUE, type="clusters")
Clust
which gives:
> Clust
$`1`
cluster
seq1 2
seq2 1
seq3 3
$`2`
cluster
seq4 1
seq5 3
seq6 3
seq7 3
seq8 2
$`3`
cluster
seq9 3
seq10 1
seq11 2
seq12 4
$`4`
cluster
seq13 1
seq14 2
seq15 1
seq16 2
Now I want to use lapply and split to split the Adjusted list according to Clust:
byClust <- lapply(Adjusted,FUN=split, Clust$cluster)
But I get the error:
> byClust <- lapply(Adjusted,FUN=split, Clust$cluster)
Error in normSplitFactor(f, x) :
split factor has length 0 but 'NROW(x)' is > 0
Both lists have the same length. What could be the problem? Any ideas?
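One likely cause, sketched with base R stand-ins: Clust is a list whose elements are named "1" to "4", so Clust$cluster looks for a top-level element called cluster and returns NULL, which is the length-0 split factor in the error. The objects Adjusted_demo, Clust_demo and byClust_demo below are hypothetical: character vectors replace the DNAStringSets and one-column data frames mimic the printed IdClusters output. The sketch also assumes the rows of each cluster table are in the same order as the sequences in the matching group, as they are in the output shown above.

```r
# Hypothetical stand-ins for two of the groups in Adjusted and Clust
Adjusted_demo <- list(
  `1` = c(seq1 = "GCAATCCATTAC", seq2 = "AAATCGCCATCC", seq3 = "GCATACCTTAAC"),
  `2` = c(seq4 = "GCATACCATTAC", seq5 = "AGCATACCTTAC", seq6 = "AGCATACCTTAC")
)
Clust_demo <- list(
  `1` = data.frame(cluster = c(2, 1, 3), row.names = c("seq1", "seq2", "seq3")),
  `2` = data.frame(cluster = c(1, 3, 3), row.names = c("seq4", "seq5", "seq6"))
)

# The outer list only has elements named "1" and "2", so $cluster finds nothing
is.null(Clust_demo$cluster)

# Map() walks both lists in parallel, so each group is split by its own
# cluster column rather than by a single shared (and here missing) factor
byClust_demo <- Map(function(x, cl) split(x, cl$cluster), Adjusted_demo, Clust_demo)
names(byClust_demo[["2"]])
```

Applied to the real objects, the same pattern would be Map(function(x, cl) split(x, cl$cluster), Adjusted, Clust); this has not been run against the DECIPHER objects here, so treat it as a starting point.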