I want to delete everything on each line except the leading string and the trailing sequence. My file looks like this:
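CAM_READ_0623233313 /library_id=CAM_LIB_002149 /sample_id=CAM_SMPL_003380 raw_id=G9ALM7U02GRHFF length=72 /IP_notice=?This genetic information downloaded from CAMERA may be considered to be part of the genetic patrimony of Denmark, the country from which the sample was obtained. Users of this information agree to: 1) acknowledge Denmark as the country of origin in any country where the genetic information is presented and 2) contact the CBD focal point identified on the CBD website (http://www.cbd.int/countries/) if they intend to use the genetic information for commercial purposes.? AGGTAGTTTCCTCTACAGACTCTGCTATTTTCATCCGTGCGTCTTCGCGGCCGGTCCAGAGCGCGCCCCACG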
My final output should be:
CAM_READ_0623233313 AGGTAGTTTCCTCTACAGACTCTGCTATTTTCATCCGTGCGTCTTCGCGGCCGGTCCAGAGCGCGCCCCACG
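How can I do this with sed? There are no line breaks within a record, and every record has the same format. Whenever I try, all the lines end up merged into a single line.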
Answer 0 (score: 0)
$ cat data
CAM_READ_0623233313 /library_id=CAM_LIB_002149 /sample_id=CAM_SMPL_003380 raw_id=G9ALM7U02GRHFF length=72 /IP_notice=?This genetic information downloaded from CAMERA may be considered to be part of the genetic patrimony of Denmark, the country from which the sample was obtained. Users of this information agree to: 1) acknowledge Denmark as the country of origin in any country where the genetic information is presented and 2) contact the CBD focal point identified on the CBD website (http://www.cbd.int/countries/) if they intend to use the genetic information for commercial purposes.? AGGTAGTTTCCTCTACAGACTCTGCTATTTTCATCCGTGCGTCTTCGCGGCCGGTCCAGAGCGCGCCCCACG
$ sed -r 's/^(\w+).*\?(\s*\w+)$/\1\2/g' data
CAM_READ_0623233313 AGGTAGTTTCCTCTACAGACTCTGCTATTTTCATCCGTGCGTCTTCGCGGCCGGTCCAGAGCGCGCCCCACG
$
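The expression above captures the leading word, skips greedily to the last literal `?`, and keeps the trailing whitespace plus the final word. `\w` and `\s` are GNU extensions, so here is a rough sketch of the same idea with POSIX character classes, assuming a sed that accepts `-E`. The shortened record and the temporary file are hypothetical stand-ins for the real data.

```shell
# Hypothetical, shortened record in the same shape as the sample data.
tmp_sed=$(mktemp)
printf '%s\n' 'CAM_READ_0623233313 /library_id=CAM_LIB_002149 length=72 /IP_notice=?notice text (http://www.cbd.int/countries/).? AGGTAGTTTCCTCTACAGACTCTGCTATTTTCATCCGTGCGTCTTCGCGGCCGGTCCAGAGCGCGCCCCACG' > "$tmp_sed"

# Group 1: leading identifier. Group 2: whitespace and final word after the last '?'.
sed_out=$(sed -E 's/^([[:alnum:]_]+).*\?([[:space:]]*[[:alnum:]_]+)$/\1\2/' "$tmp_sed")
echo "$sed_out"
rm -f "$tmp_sed"
```

Because the pattern is anchored at both ends, it can match at most once per line, so no `g` flag is needed.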
Answer 1 (score: 0)
Print the first and last fields.
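The command for this answer did not survive in the post, so the following is only one possible way to print the first and last fields, sketched with awk rather than sed. It assumes fields are separated by whitespace; the shortened record and the temporary file are hypothetical.

```shell
# Hypothetical, shortened record in the same shape as the sample data.
tmp_awk=$(mktemp)
printf '%s\n' 'CAM_READ_0623233313 /library_id=CAM_LIB_002149 length=72 /IP_notice=?notice text.? AGGTAGTTTCCTCTACAGACTCTGCTATTTTCATCCGTGCGTCTTCGCGGCCGGTCCAGAGCGCGCCCCACG' > "$tmp_awk"

# $1 is the first whitespace-separated field; $NF is the last field on the line.
awk_out=$(awk '{ print $1, $NF }' "$tmp_awk")
echo "$awk_out"
rm -f "$tmp_awk"
```

awk handles each input line separately, so records stay on their own lines in the output.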