I need to merge multiple files.
Here are two of them.
1.txt
Allele Sequence
B*07:02:01 ABCDE
B*07:33:01 ABCD
B*07:41 AB
2.txt
Allele Sequence
B*07:02:01 FGHIJ
B*07:33:01 EFGH
B*07:41 CD
The delimiter is a tab (\t).
I would like to get a result like this:
B*07:02:01 ABCDEFGHIJ
B*07:33:01 ABCDEFGH
B*07:41 ABCD
I tried the following.
awk -F"\t" '
{key = $1}
FNR==NR {line[key]=$0; next}
key in line {print line[$1], $2}
' $1 $2 > output_2.txt
The result looks like this:
Allele Sequence^M Sequence^M
B*07:02:01 ABCDE^M FGHIJ
B*07:33:01 ABCD^M EFGH
B*07:41 AB^M CD
How can I get exactly the result I want, in a cleaner way?
Thank you!
Answer 0: (score: 2)
This might work:
awk 'FNR==NR {a[$1]=$2;next} FNR>1{print $0 a[$1]} ' 2.txt 1.txt
B*07:02:01 ABCDEFGHIJ
B*07:33:01 ABCDEFGH
B*07:41 ABCD
How it works:
awk '
FNR==NR { # For first file only (2.txt)
a[$1]=$2 # Read data in to array a using $1 as key and $2 as value
next} # Skip to next record
FNR>1{ # Skip first record of second file (1.txt)
print $0 a[$1]} # Print complete record from 1.txt, and data from array using $1 as key
' 2.txt 1.txt # read the files
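The same array idea can probably be extended to more than two files, which the question mentions. The following is only a sketch under some assumptions: every file is tab-separated with one header line, lines may end in CRLF, and the fragments should be concatenated in the order the files are listed. The temporary directory and the names merged, seq, and order are illustrative, not part of the original answer.

```shell
# Build sample inputs (hypothetical temp files) that mirror the question,
# including the CRLF line endings that produced the ^M characters.
tmp=$(mktemp -d)
printf 'Allele\tSequence\r\nB*07:02:01\tABCDE\r\nB*07:33:01\tABCD\r\nB*07:41\tAB\r\n' > "$tmp/1.txt"
printf 'Allele\tSequence\r\nB*07:02:01\tFGHIJ\r\nB*07:33:01\tEFGH\r\nB*07:41\tCD\r\n' > "$tmp/2.txt"

# Concatenate the second column across all listed files, keyed on the first.
merged=$(awk -F'\t' -v OFS='\t' '
    { sub(/\r$/, "") }                # drop a trailing CR, if any
    FNR == 1 { next }                 # skip the header line of each file
    !($1 in seq) { order[++n] = $1 }  # remember keys in first-seen order
    { seq[$1] = seq[$1] $2 }          # append the fragment from this file
    END { for (i = 1; i <= n; i++) print order[i], seq[order[i]] }
' "$tmp/1.txt" "$tmp/2.txt")

printf '%s\n' "$merged"
rm -rf "$tmp"
```

Listing further files after 2.txt would append their fragments in the same way. Unlike the two-file version, this sketch keeps alleles that appear in only some of the files, which may or may not be what is wanted.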
Answer 1: (score: 0)
awk -F"\t" '
{gsub("\r",""); key = $1}
FNR==NR {line[key]=$0; next}
key in line {print line[$1]$2}
' 1d.txt 2d.txt > x
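gsub removes the "^M" that is causing your problem. It is chr(13) = CR: your input files come from the DOS/Windows world, which uses CRLF, and you are processing them on UN*X, which uses only LF as the line separator.
Removing the "," in print line[$1], $2 (so that it reads line[$1]$2) removes the space between the two sequences.
Answer 2: (score: 0)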
join 1.txt 2.txt | awk '{print $1, $2 $3}'
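With the CRLF, tab-separated files from the question, the join approach may need a little preparation, since join expects input sorted on the key and would carry the CR through into the output. A minimal sketch, assuming a single header line per file and no tabs inside the fields; the helper prep and the temporary file names are hypothetical:

```shell
# Sample inputs (hypothetical temp files) mirroring the question, with CRLF endings.
tmp=$(mktemp -d)
printf 'Allele\tSequence\r\nB*07:02:01\tABCDE\r\nB*07:33:01\tABCD\r\nB*07:41\tAB\r\n' > "$tmp/1.txt"
printf 'Allele\tSequence\r\nB*07:02:01\tFGHIJ\r\nB*07:33:01\tEFGH\r\nB*07:41\tCD\r\n' > "$tmp/2.txt"

tab=$(printf '\t')   # a literal tab character, portable across POSIX shells

# Strip CRs, drop the header, and sort on the first field for join.
prep() { tr -d '\r' < "$1" | tail -n +2 | LC_ALL=C sort -t "$tab" -k1,1; }
prep "$tmp/1.txt" > "$tmp/1.sorted"
prep "$tmp/2.txt" > "$tmp/2.sorted"

# Join on the allele name, then glue the two sequence columns together.
joined=$(LC_ALL=C join -t "$tab" "$tmp/1.sorted" "$tmp/2.sorted" |
    awk -F'\t' -v OFS='\t' '{ print $1, $2 $3 }')

printf '%s\n' "$joined"
rm -rf "$tmp"
```

Using the same LC_ALL=C setting for both sort and join keeps their idea of sort order consistent. By default join drops alleles that are missing from either file.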