Tax4Fun - "incorrect length" error when importing data

Time: 2018-02-28 16:06:56

Tags: r biom

I am trying to run the R package Tax4Fun to predict function from 16S data. I obtained an OTU table from a metabarcoding pipeline (VSEARCH, CUTADAPT and SWARM). My OTU table looks like this:

OTU     total   cloud   amplicon        length  abundance       chimera spread  quality sequence      identity taxonomy        references        PM-18S6ext_TATGCG_p16S  PM-18SA_TTTTTC_p16S     PM-18SB_GTCGTG_p16S    PM-18SD_GCTATC_p16S     PM-18SLAZ_TTTGTA_p16S   PM-18SMIS_AGCCTG_p16S   PM-18SQN_CAACAG_p16S  PM-finaMIS_CTCTCG_p16S   PM-finaTfA_TGCTGA_p16S  PM-final6ext_GGTAGC_p16S        PM-finalA_AATATG_p16S PM-finalB_TGAGCA_p16S    PM-finalD_AATCAC_p16S   PM-finalLAZ_GCAAAT_p16S PM-finalQN_AAATTG_p16S  PM-finalT041p_CTGTGC_p16S      PM-finalT0MIS_ACATAT_p16S       PM-finalTfC_GAGCTT_p16S PM-finalTfD_TCTCGG_p16S PM-finalTfE_AGCGAC_p16S  PM-finalTfF_GCCAAG_p16S PM-finalTfG_AGGTTC_p16S
1       2007    1129    f5579f91a5ca1c9ce3fe1ffe992c3dda4c718025        392     385     N       8     0.000153316326531        ctccgtgccagcagccgcggtaatacgggggatgcaagcgttatccggaatcattgggcgtaaagcgcctgtaggttgtttaataagtctgttgttaaagactagggcttaaccctagaaaagcaatggaaactactagactagagtatggcaggggtagagggaatttctagtgtagcggtgaaatgcgtagatattagaaagaacaccggtggcgaaagcgctctactggaccattactgacactcagaggcgaaagctagggtagcaaaagggattagatacccctgtagtcctagccgtaaacgatggatactacatgttgtgcattatgtacagtatggtagctaacgcgttaagtatcccgcctggggagtacgctcgcaagggtg    100.0   Bacteria|Cyanobacteria|Oxyphotobacteria|Chloroplast|*   JQ197833.1.1304,JQ197512.1.1304,JQ196309.1.1304,JQ197835.1.1304,JQ199895.1.1304,JQ199480.1.1304,HM127811.1.1405,JX016414.1.1441,JX016490.1.1441,JX016561.1.1441,JX016673.1.1441,JX016514.1.1441,JX017009.1.1441,JX017095.1.1441,GQ346755.1.1323,GQ348584.1.1327,JX537806.1.1308,JX537897.1.1301,KC001577.1.1258,KC002381.1.1287,EF574706.1.1441,GU235484.1.1316     0       0       0       0       0       0       0       18      0       27      1     25       0       0       0       1925
2       1883    1190    f98930d400ae7efe86292c4c417c3cb61a3ce816        399     317     N       11    0.000154887218045        ctccgtgccagcagccgcggtaagacggaggatgcaagtgttattcggaatgattgggcgtaaagagtctgtaggccgtatagaaagtcttttgttaaatgcctcggctcaaccgagatccagcaaaggaaacttctatacttgagggaagtagaggtacagggaattcccggtggagcggtgaaatgcgtagatatcgggaggaacaccaatatggcgaaggcactgtactgggcttttcctgacgctgagagacgaaagctaaaggagtgattaggattagataccctagtaattttagccgtaaacgatggaaactcactgccgagcgaaatacaacgagcggtggtcaagctaacgcgtgaagtttcccgcctggggattacgcttgcaaaagtg     100.0   Bacteria|Cyanobacteria|Oxyphotobacteria|Chloroplast|*   JQ195238.1.1313,JQ195420.1.1313,JQ195905.1.1315,JQ195346.1.1313,JQ196571.1.1313,JQ196735.1.1313,JQ195734.1.1313,JQ197124.1.1313,JQ197306.1.1313,JQ197298.1.1316,JQ196440.1.1313,JQ197433.1.1313,JQ197434.1.1313,JQ197626.1.1313,JQ196721.1.1313,JQ197965.1.1313,JQ198009.1.1313,JQ198126.1.1313,JQ198209.1.1313,JQ198219.1.1313,JQ198172.1.1313,JQ197508.1.1313,JQ197620.1.1313,JQ197637.1.1313,JQ197478.1.1313,JQ198623.1.1313,JQ196081.1.1313,JQ197596.1.1313,JQ197876.1.1313,JQ197823.1.1313,JQ198855.1.1313,JQ197700.1.1313,JQ196285.1.1313,JQ197670.1.1313,JQ197791.1.1313,JQ198169.1.1313,JQ196667.1.1313,JQ198105.1.1313,JQ198314.1.1313,JQ198542.1.1313,JQ199324.1.1313,JQ198561.1.1313,JQ198536.1.1313,JQ198564.1.1313,JQ199623.1.1313,JQ199625.1.1314,JQ199695.1.1313,JQ197497.1.1313,JQ199779.1.1313,JQ199794.1.1313,JQ199926.1.1313,JQ199323.1.1313,JQ199927.1.1313,JQ199350.1.1313,JQ198208.1.1313,JQ198233.1.1313,JQ200223.1.1313,JQ198385.1.1313,JQ199746.1.1313,JQ199948.1.1313,JQ200203.1.1313,JQ200229.1.1313,JQ200084.1.1313,JQ200011.1.1313,JQ200105.1.1313,JQ200016.1.1313,JQ199460.1.1313,JQ199585.1.1314,FN563097.1.1453,AM747382.1.1283,JX016606.1.1451,JX017016.1.1451,JX017124.1.1451,JX537822.1.1341,JX537878.1.1320,KC545747.1.1423,KF596584.1.1451     0       0       0     409      0       726     193     0       1       220     68      59      76      0       0       121   7
3       1594    916     cb9c37c9c68b234a7f1150c813737a5123e77dc4        392     322     N       5     0.000154591836735        ctccgtgccagcagccgcggtaatacgggggatgcaagcgttatccggaatcattgggcgtaaagcgcctgtaggttgtttaataagtctgttgttaaagactagggctcaaccctagaaaagcaatggaaactactagactagagtatggcaggggtagagggaatttctagtgtagcggtgaaatgcgtagatattagaaagaacaccggtggcgaaagcgctctactggaccattactgacactcagaggcgaaagctagggtagcaaaagggattagatacccctgtagtcctagccgtaaacgatggatactaggtgttgtatttattttacagtatcgtagctaacgcgttaagtatcccgcctggggagtacgctcgcaagggtg    100.0   Bacteria|Cyanobacteria|Oxyphotobacteria|Chloroplast|uncultured_marine_eukaryote KX937536.1.1443,KX937534.1.1443,KX937535.1.1443        0       0       0       0       0       0       0     15       1       0       1449    89      40
4       976     592     18cf185a9728b2e4e0bf8479578313d312c4702b        397     182     N       6     0.000161712846348        ctccgtgccagcagccgcggtaagacggaggatgcaagtgttatccggaatcactgggcgtaaagcgtctgtaggtggtttaataagtcaactgttaaatcttgaggctcaacctcaaaatcgcagtcgaaactgttagactagagtatagtaggggtaaagggaatttccagtggagcggtgaaatgcgtagagattggaaggaacaccgatggcgaaggcactttactgggctattactaacactcagagacgaaagctagggtagcaaatgggattagataccccagtagtcctagctgtaaacaatggatactagatgttgaacagatcgacctgtgcagtatcaaagctaacgcgttaagtatcccgcctgggaagtatgctcgcaagagtg       99.5    Bacteria|Cyanobacteria|Oxyphotobacteria|Chloroplast|*   FM242284.1.1439,KU243249.1.1264,FJ002183.1.1431,KF771485.1.1443,JF272135.1.1444        1       0       0       0       0     791      6       166

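Tax4Fun supports several QIIME output formats. To import and use my data, I had to convert this OTU table into the table format produced by QIIME (.qotu). My output looks like this: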

#OTU    PM-18S6ext_TATGCG_p16S  PM-18SA_TTTTTC_p16S     PM-18SB_GTCGTG_p16S     PM-18SD_GCTATC_p16S   PM-18SLAZ_TTTGTA_p16S    PM-18SMIS_AGCCTG_p16S   PM-18SQN_CAACAG_p16S    PM-finaMIS_CTCTCG_p16S  PM-finaTfA_TGCTGA_p16S PM-final6ext_GGTAGC_p16S        PM-finalA_AATATG_p16S   PM-finalB_TGAGCA_p16S   PM-finalD_AATCAC_p16S  PM-finalLAZ_GCAAAT_p16S PM-finalQN_AAATTG_p16S  PM-finalT041p_CTGTGC_p16S       PM-finalT0MIS_ACATAT_p16S      PM-finalTfC_GAGCTT_p16S PM-finalTfD_TCTCGG_p16S PM-finalTfE_AGCGAC_p16S PM-finalTfF_GCCAAG_p16S        PM-finalTfG_AGGTTC_p16S taxonomy*
1       0       0       0       0       0       0       0       18      0       27      1       0     25       0       0       0       1925    k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
2       0       0       0       0       0       0       0       409     0       726     193     0    220      68      59      76      0       0       121     3       7       k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
3       0       0       0       0       0       0       0       0       0       0       0       0    15       1       0       1449    89      40      k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__uncultured_marine_eukaryote; g__; s__
4       1       0       0       0       0       0       0       0       0       0       0       0    791      6       166     k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
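A converted table can be checked for a consistent column count before it goes to `biom convert`. The rows as pasted above do not all show the same number of values, which may simply be copy-paste truncation. The following is a minimal sketch, assuming the real file is tab-separated; the function name `check_columns` is hypothetical and not part of QIIME, biom or Tax4Fun.

```shell
# Hypothetical helper: report every row whose number of tab-separated
# fields differs from the header row of an OTU table.
check_columns() {
    awk -F '\t' '
        !/\t/     { next }                  # skip lines without tabs (e.g. a comment line)
        !expected { expected = NF; next }   # first tabbed line is taken as the header
        NF != expected {
            printf "line %d: %d fields, expected %d\n", NR, NF, expected
        }
    ' "$1"
}
```

It could be called as `check_columns "${FILTERED_QOTU}"`. No output means every row has as many fields as the header.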

After that, I converted my QIIME OTU table to a biom file and added the metadata as follows:

biom convert -i "${FILTERED_QOTU}" -o "${FILTERED_BIOM}" --table-type="OTU table" --process-obs-metadata taxonomy --to-json
biom add-metadata -i "${FILTERED_BIOM}" -o "${FILTERED_BIOM_ANNOTATED}" --sample-metadata-fp "../tags_list.txt" --output-as-json

This gives me the following biom file:

{"id": "None","format": "Biological Observation Matrix 1.0.0","format_url": "http://biom-format.org","type": "OTU table","generated_by": "BIOM-Format 1.3.1","date": "2018-02-27T16:18:17.853139","matrix_type": "sparse","matrix_element_type": "float","shape": [26, 22],"data": [[0,7,18.0],[0,9,27.0],[0,10,1.0],[0,14,4.0],[0,15,3.0],[0,16,4.0],[0,17,25.0],[0,21,1925.0],[1,7,409.0],[1,9,726.0],[1,10,193.0],[1,12,1.0],[1,13,220.0],[1,14,68.0],[1,15,59.0],[1,16,76.0],[1,19,121.0],[1,20,3.0],[1,21,7.0],[2,16,15.0],[2,17,1.0],[2,19,1449.0],[2,20,89.0],[2,21,40.0],[3,0,1.0],[3,16,7.0],[3,17,5.0],[3,19,791.0],[3,20,6.0],[3,21,166.0],[4,7,8.0],[4,9,25.0],[4,15,13.0],[4,16,18.0],[4,17,1.0],[4,19,275.0],[4,20,44.0],[4,21,48.0],[5,7,40.0],[5,9,56.0],[5,10,12.0],[5,13,12.0],[5,14,4.0],[5,15,7.0],[5,16,13.0],[5,19,226.0],[5,20,5.0],[5,21,6.0],[6,7,24.0],[6,9,147.0],[6,10,29.0],[6,13,12.0],[6,14,8.0],[6,15,3.0],[6,19,2.0],[6,20,1.0],[6,21,8.0],[7,7,18.0],[7,9,35.0],[7,10,30.0],[7,13,11.0],[7,14,26.0],[7,15,5.0],[7,16,3.0],[7,19,10.0],[7,20,8.0],[7,21,7.0],[8,16,2.0],[8,17,1.0],[8,19,85.0],[8,20,1.0],[8,21,45.0],[9,17,2.0],[9,20,13.0],[9,21,79.0],[10,7,1.0],[10,9,11.0],[10,13,34.0],[10,14,1.0],[11,7,1.0],[11,13,21.0],[11,15,2.0],[11,16,15.0],[12,9,4.0],[12,10,1.0],[12,14,1.0],[12,15,16.0],[12,16,8.0],[13,19,5.0],[13,20,15.0],[13,21,6.0],[14,19,12.0],[14,20,8.0],[14,21,1.0],[15,8,1.0],[15,20,20.0],[16,15,3.0],[16,16,1.0],[16,20,14.0],[17,19,11.0],[17,20,4.0],[18,9,3.0],[18,13,3.0],[18,15,2.0],[18,16,7.0],[19,7,2.0],[19,9,5.0],[19,14,5.0],[20,19,10.0],[20,20,1.0],[21,19,4.0],[21,20,4.0],[21,21,1.0],[22,19,7.0],[22,21,1.0],[23,7,5.0],[23,15,1.0],[23,16,1.0],[24,21,7.0],[25,19,5.0]],"rows": [{"id": "1", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "2", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "3", "metadata": 
{"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__uncultured_marine_eukaryote", "g__", "s__"]}},{"id": "4", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "6", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "7", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "8", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "9", "metadata": {"taxonomy": ["k__N", "p__o", "c___", "o__h", "f__i", "g__t", "s__"]}},{"id": "10", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "14", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "17", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "18", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "19", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "21", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "24", "metadata": {"taxonomy": ["k__Bacteria", "p__Bacteroidetes", "c__Bacteroidia", "o__Flavobacteriales", "f__Flavobacteriaceae", "g__Pseudofulvibacter", "s__*"]}},{"id": "25", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Synechococcales", "f__Cyanobiaceae", "g__*", "s__*"]}},{"id": "26", "metadata": {"taxonomy": 
["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__uncultured_bacterium", "g__", "s__"]}},{"id": "28", "metadata": {"taxonomy": ["k__Bacteria", "p__Proteobacteria", "c__Gammaproteobacteria", "o__Betaproteobacteriales", "f__Burkholderiaceae", "g__Limnobacter", "s__*"]}},{"id": "29", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "32", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "33", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "37", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "40", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "41", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "42", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "51", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__uncultured_marine_eukaryote", "g__", "s__"]}}],"columns": [{"id": "PM-18S6ext_TATGCG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TATGCG"}},{"id": "PM-18SA_TTTTTC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TTTTTC"}},{"id": "PM-18SB_GTCGTG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GTCGTG"}},{"id": "PM-18SD_GCTATC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GCTATC"}},{"id": "PM-18SLAZ_TTTGTA_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TTTGTA"}},{"id": "PM-18SMIS_AGCCTG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AGCCTG"}},{"id": "PM-18SQN_CAACAG_p16S", 
"metadata": {"DOB": "", "BarcodeSequence": "CAACAG"}},{"id": "PM-finaMIS_CTCTCG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "CTCTCG"}},{"id": "PM-finaTfA_TGCTGA_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TGCTGA"}},{"id": "PM-final6ext_GGTAGC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GGTAGC"}},{"id": "PM-finalA_AATATG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AATATG"}},{"id": "PM-finalB_TGAGCA_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TGAGCA"}},{"id": "PM-finalD_AATCAC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AATCAC"}},{"id": "PM-finalLAZ_GCAAAT_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GCAAAT"}},{"id": "PM-finalQN_AAATTG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AAATTG"}},{"id": "PM-finalT041p_CTGTGC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "CTGTGC"}},{"id": "PM-finalT0MIS_ACATAT_p16S", "metadata": {"DOB": "", "BarcodeSequence": "ACATAT"}},{"id": "PM-finalTfC_GAGCTT_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GAGCTT"}},{"id": "PM-finalTfD_TCTCGG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TCTCGG"}},{"id": "PM-finalTfE_AGCGAC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AGCGAC"}},{"id": "PM-finalTfF_GCCAAG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GCCAAG"}},{"id": "PM-finalTfG_AGGTTC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AGGTTC"}}]}

I then used the biom file with Tax4Fun, but I get the following error message:

biom16S = importQIIMEBiomData('all_p16S.OTU.filtered.annotated.biom')
Error in rowsum.default(as.matrix(taxProfile), ModSilvaIds) : 
  incorrect length for 'group'

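I tried another way: I converted the OTU table to the txt format produced by QIIME and used it with the Tax4Fun reference data, but I get an error again: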

biom convert --to-tsv -i "all_p16S.OTU.filtered.annotated.biom" -o "filtered.annotated.txt" --table-type="OTU table" --header-key taxonomy
folderReferenceData = "SILVA119"
biom16S = importQIIMEData("filtered.annotated.txt")
Tax4FunOutput = Tax4Fun(biom16S, folderReferenceData) 
Error in `rownames<-`(`*tmp*`, value = c("PM-18S6ext_TATGCG_p16S", "PM-18SA_TTTTTC_p16S",  : 
  length of 'dimnames' [1] not equal to array extent

The txt file looks like this:

# Constructed from biom file
#OTU ID PM-18S6ext_TATGCG_p16S  PM-18SA_TTTTTC_p16S     PM-18SB_GTCGTG_p16S     PM-18SD_GCTATC_p16S  PM-18SLAZ_TTTGTA_p16S    PM-18SMIS_AGCCTG_p16S   PM-18SQN_CAACAG_p16S    PM-finaMIS_CTCTCG_p16S  PM-finaTfA_TGCTGA_p16S        PM-final6ext_GGTAGC_p16S        PM-finalA_AATATG_p16S   PM-finalB_TGAGCA_p16S   PM-finalD_AATCAC_p16S    PM-finalLAZ_GCAAAT_p16S PM-finalQN_AAATTG_p16S  PM-finalT041p_CTGTGC_p16S    PM-finalT0MIS_ACATAT_p16S        PM-finalTfC_GAGCTT_p16S PM-finalTfD_TCTCGG_p16S PM-finalTfE_AGCGAC_p16S       PM-finalTfF_GCCAAG_p16S PM-finalTfG_AGGTTC_p16S taxonomy
1       0.0     0.0     0.0     0.0     0.0     0.0     0.0     18.0    0.0     27.0    1.0     0.0  0.0      0.0     4.0     3.0     4.0     25.0    0.0     0.0     0.0     1925.0  k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
2       0.0     0.0     0.0     0.0     0.0     0.0     0.0     409.0   0.0     726.0   193.0   0.0  1.0      220.0   68.0    59.0    76.0    0.0     0.0     121.0   3.0     7.0     k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
3       0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0  0.0      0.0     0.0     0.0     15.0    1.0     0.0     1449.0  89.0    40.0    k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__uncultured_marine_eukaryote; g__; s__
4       1.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0  0.0      0.0     0.0     0.0     7.0     5.0     0.0     791.0   6.0     166.0   k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__

I searched Google, and the problem seems to be the format of the taxonomy column. However, when I check it, it looks fine:

x = read_biom('all_p16S.OTU.filtered.annotated.biom')
head(biom_data(x))
6 x 22 sparse Matrix of class "dgCMatrix"
   [[ suppressing 22 column names ‘PM-18S6ext_TATGCG_p16S’, ‘PM-18SA_TTTTTC_p16S’, ‘PM-18SB_GTCGTG_p16S’ ... ]]

1 . . . . . . .  18 .  27   1 . .   .  4  3  4 25 .    .  .
2 . . . . . . . 409 . 726 193 . 1 220 68 59 76  . .  121  3
3 . . . . . . .   . .   .   . . .   .  .  . 15  1 . 1449 89
4 1 . . . . . .   . .   .   . . .   .  .  .  7  5 .  791  6
6 . . . . . . .   8 .  25   . . .   .  . 13 18  1 .  275 44
7 . . . . . . .  40 .  56  12 . .  12  4  7 13  . .  226  5
taxmat = as.matrix(observation_metadata(x), rownames.force=TRUE)
taxmat
   taxonomy1     taxonomy2           taxonomy3               
1  "k__Bacteria" "p__Cyanobacteria"  "c__Oxyphotobacteria"   
2  "k__Bacteria" "p__Cyanobacteria"  "c__Oxyphotobacteria"   
3  "k__Bacteria" "p__Cyanobacteria"  "c__Oxyphotobacteria"   
4  "k__Bacteria" "p__Cyanobacteria"  "c__Oxyphotobacteria"   
6  "k__Bacteria" "p__Cyanobacteria"  "c__Oxyphotobacteria"   
7  "k__Bacteria" "p__Cyanobacteria"  "c__Oxyphotobacteria"
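The output shown above covers only the first few OTUs, so a malformed entry further down would not be visible there. For instance, the biom dump earlier lists row `"id": "9"` with the taxonomy `k__N`, `p__o`, `c___`, `o__h`, `f__i`, `g__t`, which looks like a short label split character by character. Whether that entry is related to the import errors is an assumption, not something established here. A minimal sketch for listing every distinct taxonomy string in the TSV export, assuming the taxonomy is the last tab-separated column (the function name `list_taxonomies` is hypothetical):

```shell
# Hypothetical helper: count how often each distinct taxonomy string occurs
# in the last column of a tab-separated OTU table, most frequent first.
list_taxonomies() {
    awk -F '\t' '
        !/^#/ && NF > 1 { count[$NF]++ }    # skip comment/header lines and blank lines
        END { for (t in count) printf "%d\t%s\n", count[t], t }
    ' "$1" | sort -rn
}
```

Running `list_taxonomies filtered.annotated.txt` prints one line per distinct taxonomy, so unusual entries stand out even in a long table.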

All suggestions are welcome!

1 answer:

Answer 0 (score: 0)

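I ran into a similar problem when I started using Tax4Fun. After searching the web, the solution I found (which worked for me) was to change the way the taxonomy is written: remove the "k__", "p__", etc. prefixes. A taxonomy entry should then look like "Bacteria, Cyanobacteria, Oxyphotobacteria". I also read that it works better with the TSV format than with the BIOM format.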

If you need more help, I have the code I used to infer functions with Tax4Fun.

Best!
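The prefix removal suggested in this answer could be sketched roughly as follows for the TSV export. This is only an illustration under assumptions: the taxonomy is taken to be the last tab-separated column, the function name `strip_rank_prefixes` is hypothetical, and the existing "; " separator is kept as-is because the thread does not confirm which separator Tax4Fun expects.

```shell
# Hypothetical helper: remove QIIME-style rank prefixes (k__, p__, c__, o__,
# f__, g__, s__) from the taxonomy column, leaving all other columns untouched.
strip_rank_prefixes() {
    awk '
        BEGIN { FS = OFS = "\t" }
        /^#/  { print; next }                   # keep comment and header lines unchanged
        {
            n = split($NF, rank, /; */)         # one array element per taxonomic rank
            out = ""
            for (i = 1; i <= n; i++) {
                sub(/^[kpcofgs]__/, "", rank[i])    # drop the leading rank prefix only
                out = out (i > 1 ? "; " : "") rank[i]
            }
            $NF = out
            print
        }
    ' "$1"
}
```

For example, `strip_rank_prefixes filtered.annotated.txt > filtered.stripped.txt` (the output file name is made up) writes a copy with the prefixes removed, which can then be tried with `importQIIMEData`.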