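I have two files and am trying to compare them and print specific values, but I am missing something somewhere. Please help me correct my mistake.
File 1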
Saureus08BA02176_01020 NA
Saureus08BA02176_02495 COG1510 K
Saureus08BA02176_02020 COG1854 T
Saureus08BA02176_01302 COG3763 S
Saureus08BA02176_01834 COG0744 M
Saureus08BA02176_01131 NA
Saureus08BA02176_02481 COG0579 R
File 2
Saureus08BA02176_01381 1.00000
Saureus08BA02176_00001 1.00000
Saureus08BA02176_01020 324.08332
Saureus08BA02176_01131 999.00000
Saureus08BA02176_02481 4.07781
Required output
Saureus08BA02176_01020 NA 324.08332
Saureus08BA02176_02495 COG1510 K NA
Saureus08BA02176_02020 COG1854 T NA
Saureus08BA02176_01302 COG3763 S NA
Saureus08BA02176_01834 COG0744 M NA
Saureus08BA02176_01131 NA 999.000
Saureus08BA02176_02481 COG0579 R 4.07781
Command:
awk 'FNR==NR{a[$1]=$2;next}{print $0,a[$1]?a[$2]:"NA"}' file2 file1 > test1
It does not print the $2 value from file2. Where did I go wrong?
Answer 0 (score: 3)
a[$1]?a[$2]:"NA"
        ^
The array has no element whose index is the second field of file1.
If a[$1]?a[$2]:"NA" had been a[$1]?a[$1]:"NA", it would at least have worked, but that is bad practice. It is better to test membership with (index_key in array), i.e. (($1 in a)?a[$1]:"NA"), so the command becomes:
awk 'FNR==NR{a[$1]=$2;next}{print $0, (($1 in a)?a[$1]:"NA") }' file2 file1
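For anyone who wants to try the membership test end to end, here is a minimal sketch. It assumes an empty scratch directory and recreates the two sample files from the question under the names file1 and file2; the output name test1 is also taken from the question.

```shell
# Recreate the sample input from the question (run in an empty scratch directory).
cat > file1 <<'EOF'
Saureus08BA02176_01020 NA
Saureus08BA02176_02495 COG1510 K
Saureus08BA02176_02020 COG1854 T
Saureus08BA02176_01302 COG3763 S
Saureus08BA02176_01834 COG0744 M
Saureus08BA02176_01131 NA
Saureus08BA02176_02481 COG0579 R
EOF

cat > file2 <<'EOF'
Saureus08BA02176_01381 1.00000
Saureus08BA02176_00001 1.00000
Saureus08BA02176_01020 324.08332
Saureus08BA02176_01131 999.00000
Saureus08BA02176_02481 4.07781
EOF

# First pass (file2): remember the value for each ID.
# Second pass (file1): print the line plus the stored value, or "NA" if the ID is unknown.
awk 'FNR==NR { a[$1] = $2; next }
     { print $0, (($1 in a) ? a[$1] : "NA") }' file2 file1 > test1

cat test1
```

The reason to prefer ($1 in a) over a[$1] as the condition is that a[$1] is tested for truthiness: a stored value of 0 or an empty string would be reported as "NA" even though the ID is present in file2. Merely referencing a[$1] also creates an empty element for that key, which the in test avoids.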
Answer 1 (score: 0)
If the order of the records does not matter, use the join approach:
join -a1 -e "NA" <(sort file1) <(sort file2)