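How do I join two separate files on a matching column?

Time: 2017-12-22 14:29:24

Tags: xml linux bash sorting join

I am trying to join two files on a column, but the join and sort commands give me the following output: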

join: file 1 is not in sorted order

File 1:

TEST->Infrastructure->Global Windows Server, OI-QASDWDASDWQWD,
TEST->Infrastructure->Global Windows Server, OI-WASDWDASDWWWW,
TEST->Infrastructure->zSeries_MVS, REGAA638G0K,
TEST->Infrastructure->zSeries_MVS, REGAA55410K,

File 2:

SERVER1; Deployed; REGAA638G0K;
SERVER2; Deployed; OI-WASDWDASDWWWW;
SERVER3; Delete; OI-QASDWDASDWQWD;
SERVER4; Delete; REGAA55410K;

Expected file 3:

SERVER1; Deployed; TEST->Infrastructure->zSeries_MVS;
SERVER2; Deployed; TEST->Infrastructure->Global Windows Server;
SERVER3; Delete; TEST->Infrastructure->Global Windows Server;
SERVER4; Delete; TEST->Infrastructure->zSeries_MVS;

My command:

join -1 2 -2 3 -o 1.1,2.1,2.2 <(sort -t"," -k2 spmGroupsModifiedSCLine.out) <(sort -t";" -k3 spmCompStatJoined.out)

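The second column of the first file and the third column of the second file hold the same values, so I tried to sort both files on that column and then join them. Do you see another way to do the join? Thanks!

3 answers:

Answer 0: (score: 0)

Awk solution: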

awk 'NR==FNR{ a[$2]=$1; next }$3 in a{ print $1,$2,a[$3] }' FS=',' file1 FS=';' OFS='; ' file2

Output:

SERVER1;  Deployed; TEST->Infrastructure->zSeries_MVS
SERVER2;  Deployed; TEST->Infrastructure->Global Windows Server
SERVER3;  Delete; TEST->Infrastructure->Global Windows Server
SERVER4;  Delete; TEST->Infrastructure->zSeries_MVS
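
The output above keeps the space that follows each delimiter (hence the doubled space before the status) and drops the trailing semicolon of the expected file. A minimal sketch of a variant that reproduces the expected format exactly, assuming the delimiters are always a comma or semicolon followed by optional spaces; the temporary directory and the array name `group` are illustrative, not part of the original answer:

```shell
# Recreate the sample inputs from the question in a scratch directory.
workdir=$(mktemp -d)

cat > "$workdir/file1" <<'EOF'
TEST->Infrastructure->Global Windows Server, OI-QASDWDASDWQWD,
TEST->Infrastructure->Global Windows Server, OI-WASDWDASDWWWW,
TEST->Infrastructure->zSeries_MVS, REGAA638G0K,
TEST->Infrastructure->zSeries_MVS, REGAA55410K,
EOF

cat > "$workdir/file2" <<'EOF'
SERVER1; Deployed; REGAA638G0K;
SERVER2; Deployed; OI-WASDWDASDWWWW;
SERVER3; Delete; OI-QASDWDASDWQWD;
SERVER4; Delete; REGAA55410K;
EOF

awk '
    # First file: remember the group name under its key.
    NR == FNR { group[$2] = $1; next }
    # Second file: print server, status and the group found for the key,
    # then re-add the trailing semicolon.
    $3 in group { print $1, $2, group[$3] ";" }
' FS=', *' "$workdir/file1" FS='; *' OFS='; ' "$workdir/file2" > "$workdir/file3"

cat "$workdir/file3"
```

Because `FS` is a regular expression here (a delimiter plus any following spaces), the keys are compared without padding, so the lookup no longer depends on both files using the same spacing.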

Answer 1: (score: 0)

Not as elegant as the awk solution, but perhaps more intuitive:

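while IFS= read -r line; do
  # The third ';'-separated field of file2 is the lookup key
  key=$(cut -d';' -f3 <<< "$line")
  key=${key# }   # drop the space that follows the ';'
  # Keep fields 1-2 of file2 and append the group name found in file1
  echo "$(cut -d';' -f1-2 <<< "$line"); $(grep -F -- "$key" file1 | cut -d',' -f1);"
done < file2 > file3

Or as a one-liner:

while IFS= read -r line; do key=$(cut -d';' -f3 <<< "$line"); key=${key# }; echo "$(cut -d';' -f1-2 <<< "$line"); $(grep -F -- "$key" file1 | cut -d',' -f1);"; done < file2 > file3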

Answer 2: (score: 0)

If you want to use join:

join -t ';' -1 2 -2 3 -o 2.1,2.2,1.1 <(sort -t , -k 2 File\ 1 | tr ',' ';') <(sort -t ';' -k 3 File\ 2) | sort > File\ 3
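
The error in the question most likely comes from a delimiter mismatch: without `-t`, join splits fields on blanks, so its "field 2" of file 1 is `Windows` on some lines, while sort was ordering the comma-separated field. The command above avoids that by giving join and both sorts the same `;` delimiter. The following is a sketch of the same idea with a few extra precautions, under the assumption that neither file contains delimiters inside a value: it strips the padding around the delimiters, restricts each sort key to the join field (`-k 2,2`, `-k 3,3`), and pins the collation with `LC_ALL=C` so sort and join agree on ordering. The scratch directory and intermediate file names are placeholders.

```shell
# Recreate the sample inputs from the question in a scratch directory.
workdir=$(mktemp -d)
export LC_ALL=C   # same collation for sort and join

cat > "$workdir/file1" <<'EOF'
TEST->Infrastructure->Global Windows Server, OI-QASDWDASDWQWD,
TEST->Infrastructure->Global Windows Server, OI-WASDWDASDWWWW,
TEST->Infrastructure->zSeries_MVS, REGAA638G0K,
TEST->Infrastructure->zSeries_MVS, REGAA55410K,
EOF

cat > "$workdir/file2" <<'EOF'
SERVER1; Deployed; REGAA638G0K;
SERVER2; Deployed; OI-WASDWDASDWWWW;
SERVER3; Delete; OI-QASDWDASDWQWD;
SERVER4; Delete; REGAA55410K;
EOF

# Normalise both files to plain ';'-separated fields without padding or a
# trailing delimiter, then sort each one on its join field only.
sed -e 's/, */;/g' -e 's/;$//' "$workdir/file1" | sort -t ';' -k 2,2 > "$workdir/groups.sorted"
sed -e 's/; */;/g' -e 's/;$//' "$workdir/file2" | sort -t ';' -k 3,3 > "$workdir/status.sorted"

# Join on field 2 of the first input and field 3 of the second, emit
# server, status, group, restore server order, then re-add the "; " padding
# and the trailing semicolon.
join -t ';' -1 2 -2 3 -o 2.1,2.2,1.1 "$workdir/groups.sorted" "$workdir/status.sorted" |
    sort |
    sed -e 's/;/; /g' -e 's/$/;/' > "$workdir/file3"

cat "$workdir/file3"
```

Writing the sorted inputs to intermediate files instead of using process substitution is only a readability choice; `<(...)` works just as well in bash.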