Some of my FASTQ reads are malformed:
@1V3F_10526394 M01994:35:000000000-BM49D:1:1106:17684:21227 1:N:0:1 orig_bc=GGAATCTCTATAGCCT new_bc=GGAATCTCTATAGCCT bc_diffs=0
+
CGTACACTCCTGCGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCGACGCCGCGTGCGGGATGACGGCCTTCGGGTTGTAAACCGCTTTTGATCGGGAGCAAGCCTTCGGGTGAGTGTACCTTTCGAATAAGCACCGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGGTGCAAGCGTTATCCGGAATTATTGGGCGTAAAGGGCTCGTAGGCGGTTCGTCGCGTCCGGTGTGAAAGTCCATCGCTTAACGGTGGATCCGCGCCGGGTACGGGCGGGCTTGAGTGCGGTAGGGGAGACTGGAATTCCCGGTGTAACGGTGGAATGTGTAGATATCGGGAAGAACACCAATGGCGAAGGCAGGTCTCTGGGCCGTTACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCCTGTAGTCCC
+
CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGDGGGDGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGDGGCFGGGGGGGGGDGGGFGGGGGGDGGGGGGGGGGGGGCFG@CFGFGCFFGGGFGGFDFGGDGGGEFCGGCFGGGFGGGGGGDGGGGGFGGGGGGGGGGGDGGGGGGGFGDFFGGGGGGGGGGGGGGGGDECGGF7EEGGGGGFGGGGGGGGGGGGGFCGGGGEEGGGEEGGGGGGGF@CEGGGGGGGGGGGGGGGGGGGGFBGDGGGGFDGGGGGCGDGGGGGFGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGEGGGGGGGGFCGGGGGGDGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGFGGGGGGGGGGGGGGGGGGGGGGCCCCC
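The first "+" (the one between the header and the sequence) is the problem. How can I remove it?
N.B. Not all reads have this problem, so I can't simply delete the line after every "@" header line. I already tried that.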
Answer 0 (score: 1)
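You could try deleting every line that consists only of "+", which leaves three lines per read (header, sequence, quality), and then inserting a "+" line after the second line of each group of three. That is probably much easier than trying to work out whether each "+" is in the right place. Note that this assumes each sequence and each quality string sits on a single line, and that no quality string is exactly "+" (which could only happen for a 1-base read).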
sed '/^+$/d' file.fastq | awk '{print; if (NR%3==2){print "+"}}' > fixed.fastq
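If you want to check the result before using it, here is a minimal sketch that wraps the same idea in a function and adds a sanity check. The function names `fix_fastq` and `check_fastq` are made up for illustration, and the check assumes the plain 4-line layout (header, sequence, bare "+", quality) that the fix produces, not FASTQ files where the "+" line repeats the header.

```shell
# Hypothetical helper: rebuild 4-line FASTQ records from a file with stray "+" lines.
# Assumes single-line sequences and no quality string that is exactly "+".
fix_fastq() {
    # Drop every line that is exactly "+", leaving header/sequence/quality triples,
    # then re-insert the separator after the 2nd line of each triple.
    sed '/^+$/d' "$1" | awk '{ print } NR % 3 == 2 { print "+" }'
}

# Hypothetical sanity check: returns 0 only if the file looks like 4-line FASTQ:
# line 1 of each record starts with "@", line 3 is exactly "+",
# and the quality string is as long as the sequence.
check_fastq() {
    awk '
        NR % 4 == 1 && substr($0, 1, 1) != "@"  { bad = 1 }
        NR % 4 == 2                             { seqlen = length($0) }
        NR % 4 == 3 && $0 != "+"                { bad = 1 }
        NR % 4 == 0 && length($0) != seqlen     { bad = 1 }
        END { if (bad || NR % 4 != 0) exit 1 }
    ' "$1"
}
```

Usage would look something like `fix_fastq file.fastq > fixed.fastq && check_fastq fixed.fastq && echo OK`. If the check fails, the assumptions above probably do not hold for your file and the reads need a closer look.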