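I'm new to this, and I realize that similar questions have already been posted, but I couldn't quite adapt them to my needs. I have two files.
File 1: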
1: Read 1
2: Sequence 1
3: +
4: Quality 1
5: Read 2
6: Sequence 2
7: +
8: Quality 2
...
File 2:
1: Sequence 1 edited
2: Sequence 2 edited
3: Sequence 3 edited
4: Sequence 4 edited
...
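Starting at line 2 of the first file, I need to replace every 4th line with the next line from the second file, so that the result looks like this: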
1: Read 1
2: Sequence 1 edited
3: +
4: Quality 1
5: Read 2
6: Sequence 2 edited
7: +
8: Quality 2
...
So far I have been using this code, which seems to work, but it is slow as a command and cumbersome as a shell script:
Counter=2
while IFS= read -r p; do echo "$Counter"; echo "$p";
sed -i~ "${Counter}s/^.*/$p/" file1;
Counter=$((Counter+4)); done < file2
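I think I should be able to do this with awk, but I'm not sure how. Any help or improvements would be greatly appreciated!
Answer 0 (score: 0)
Assuming the line numbers are for illustration only and are not part of the files, try the following: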
awk 'NR==FNR {line[NR]=$0; next} {if (FNR%4==2) $0=line[++count]; print}' file2 file1
Output:
Read 1
Sequence 1 edited
+
Quality 1
Read 2
Sequence 2 edited
+
Quality 2
...
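The one-liner above keeps all of file2 in memory. For large inputs, a streaming variant may be preferable. The following is only a sketch under stated assumptions: the function name replace_seq and the awk variables src and repl are made up for illustration, and the replacement file path is assumed to contain no backslashes (awk's -v option interprets escape sequences). It uses awk's getline to pull one replacement line from the second file each time a sequence line is reached.

```shell
#!/bin/sh
# replace_seq FILE1 FILE2 (hypothetical helper name, not from the original answer)
# Prints FILE1 with every 4th line, starting at line 2, replaced by the
# next line of FILE2. FILE2 is read one line at a time via getline.
replace_seq() {
    awk -v src="$2" '
        FNR % 4 == 2 {
            # getline returns 1 on success, 0 at end of file, -1 on error;
            # if no replacement is left, the original line is kept.
            if ((getline repl < src) > 0) $0 = repl
        }
        { print }
    ' "$1"
}
```

Usage would be along the lines of `replace_seq file1 file2 > merged`. Unlike the sed loop in the question, this reads each file once and does not modify file1 in place.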
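[Explanation]
The condition NR==FNR is true only while file2 is being read; that block stores the lines of file2, in order, in the array line.
The {if ...} block is executed only while file1 is being read.
If the line number within file1 (FNR) is congruent to 2 modulo 4, the line is replaced by the next entry of the array line.
Answer 1 (score: 0)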
Another solution, using awk and paste:
awk ' { print "\n" $0 "\n" "\n" } ' file2.txt |
paste - file1.txt | awk -F"\t" ' {x=NR%4==2 ? $1 : $2; print x } '
With the given input:
$ cat cmswen1.txt
Read 1
Sequence 1
+
Quality 1
Read 2
Sequence 2
+
Quality 2
$ cat cmswen2.txt
Sequence 1 edited
Sequence 2 edited
Sequence 3 edited
Sequence 4 edited
$ awk ' { print "\n" $0 "\n" "\n" } ' cmswen2.txt |
paste - cmswen1.txt | awk -F"\t" ' {x=NR%4==2 ? $1 : $2; print x } '
Read 1
Sequence 1 edited
+
Quality 1
Read 2
Sequence 2 edited
+
Quality 2
Sequence 3 edited
Sequence 4 edited
$
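In the run above, the second file has more lines than the first file has records, so the leftover replacement lines show up after the last record. If that is not wanted, the line counts can be compared before merging. This is a minimal sketch, assuming every record in the first file is exactly four lines and every line ends with a newline; the function name check_counts is hypothetical.

```shell
#!/bin/sh
# check_counts FILE1 FILE2 (hypothetical helper name)
# Succeeds (exit status 0) only when FILE2 has exactly one line for each
# 4-line record in FILE1.
check_counts() {
    # Arithmetic expansion also strips any padding some wc versions print.
    records=$(( $(wc -l < "$1") / 4 ))
    edits=$(( $(wc -l < "$2") ))
    [ "$records" -eq "$edits" ]
}
```

It could be used as a guard, for example `check_counts cmswen1.txt cmswen2.txt || echo "line counts do not match" >&2`, before running either merge command.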