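I want to use the adegenet package to run genetic data analyses. To do that, I need to convert my fasta file into a genind object, which is the format adegenet works with.
I tried reading the data in two different ways, with the same result.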
> mydata.fasta <- fasta2DNAbin("~/Desktop/blattodeatest/Cryptocercuspunctulatus/COII.afa")
> mydata.fasta
22 DNA sequences in binary format stored in a matrix.
All sequences of same length: 404
Labels: AB425873_COII AB425877_COII AB425878_COII AB425876_COII AB425880_COII AB425884_COII ...
Base composition:
a c g t
0.404 0.181 0.085 0.329
> mydata.dna <- read.dna("~/Desktop/blattodeatest/Cryptocercus punctulatus/COII.afa", format="fasta")
> mydata.dna
22 DNA sequences in binary format stored in a matrix.
All sequences of same length: 404
Labels: AB425873_COII AB425877_COII AB425878_COII AB425876_COII AB425880_COII AB425884_COII ...
Base composition:
a c g t
0.404 0.181 0.085 0.329
I then tried to convert the data, but got a strange result.
> mydata.genind <- DNAbin2genind(mydata.fasta)
> mydata.genind
/// GENIND OBJECT /////////
// 22 individuals; 91 loci; 189 alleles; size: 62.3 Kb
// Basic content
@tab: 22 x 189 matrix of allele counts
@loc.n.all: number of alleles per locus (range: 2-3)
@loc.fac: locus factor for the 189 columns of @tab
@all.names: list of allele names for each locus
@ploidy: ploidy of each individual (range: 1-1)
@type: codom
@call: DNAbin2genind(x = mydata.dna)
// Optional content
- empty -
My data contain only one locus of 404 bp, and the fasta file appears to be read correctly. I can't figure out why R thinks there are 91 loci after I use DNAbin2genind.
Answer 0 (score: 0)
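The reason is that a genind object treats each polymorphic position in the gene as a separate locus. Of the 404 sites in your alignment, 91 are polymorphic, so you get 91 loci. To get the positions of those polymorphic sites in the original alignment, type: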
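# Locus names are the alignment column positions of the polymorphic sites.
# locNames() is the accessor; the @loc.names slot is not present in adegenet 2.x.
as.numeric(locNames(mydata.genind))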
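
To see this behaviour on something small, here is a minimal sketch using a made-up alignment (the `toy` matrix and its sequence names are hypothetical, not taken from the COII data). It assumes the ape and adegenet packages are installed and that `DNAbin2genind` is called with its default arguments. Only two of the six columns vary, so the resulting genind object should report two loci, named after those column positions. `seg.sites()` from ape is used as an independent cross-check.

```r
library(ape)       # as.DNAbin(), seg.sites()
library(adegenet)  # DNAbin2genind(), nLoc(), locNames()

# Hypothetical toy alignment: 4 sequences, 6 sites.
# Only columns 2 (c/t) and 5 (a/g) are polymorphic.
toy <- rbind(
  s1 = c("a", "c", "g", "t", "a", "c"),
  s2 = c("a", "c", "g", "t", "g", "c"),
  s3 = c("a", "t", "g", "t", "a", "c"),
  s4 = c("a", "t", "g", "t", "g", "c")
)
toy.dna <- as.DNAbin(toy)

# Convert to genind: monomorphic columns are dropped,
# each remaining polymorphic column becomes one locus.
toy.genind <- DNAbin2genind(toy.dna)

n.loci  <- nLoc(toy.genind)                  # number of loci kept
snp.pos <- as.numeric(locNames(toy.genind))  # their positions in the alignment
seg.pos <- seg.sites(toy.dna)                # segregating sites according to ape

print(n.loci)
print(snp.pos)
print(seg.pos)
```

If the two position vectors agree, the "loci" in the genind object are simply the SNPs within your single gene, not separate genes.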