I have a file with 74 columns per line. I have been trying to combine lines whose first and second columns match. The file looks like this:
CHECK_IN|2000000000|MS|XXXX|XXXX|N|34|N|N|N|N|N|Y|N|N|N|N|N|123456|aaaaaa|122333|||||||||||AAAAAA|BBBBBBB|CCCCCCC|||||||||||||||||||1000123|aaaa|N|qwerty||REGISTERED|REGISTERED|REGISTERED|UNREGISTERED|19-05-2015|Video|EDM||||||||||xxxxx
CHECK_IN|2000000000|MS|XXXX|XXXX|N|34|N|Y|N|N|N|N|N|N|N|N|N|345676|Abcgdwejj|aaaaaaa||||||||||||||||||||||||NNNNNNN||||||||1000001|cccccc|N|qyuirt||REGISTERED|REGISTERED|REGISTERED|UNREGISTERED|19-05-2015|Video|EDM||||||||||xxxxx
I used the following script:
cat sample_file4.txt | awk -F "|" '{line="";
for(i = 3; i <= NF ;i++)
line = line $i"|";
table[$1"|"$2]=table[$1"|"$2]"|"line;}
END { for (key in table) print key "==>" table[key];}' > output9.txt
The records are not appended to the first line. Instead, the same rows are repeated without the key, as shown below:
1.CHECK_IN|2000000000==>|MS|XXXX|XXXX|N|34|N|N|N|N|N|Y|N|N|N|N|N|123456|aaaaaa|122333|||||||||||AAAAAA|BBBBBBB|CCCCCCC|||||||||||||||||||1000123|aaaa|N|qwerty||REGISTERED|REGISTERED|REGISTERED|UNREGISTERED|19-05-2015|Video|EDM||||||||||xxxxx
2.||MS|XXXX|XXXX|N|34|N|Y|N|N|N|N|N|N|N|N|N|345676|Abcgdwejj|aaaaaaa||||||||||||||||||||||||NNNNNNN||||||||1000001|cccccc|N|qyuirt||REGISTERED|REGISTERED|REGISTERED|UNREGISTERED|19-05-2015|Video|EDM||||||||||xxxxx
Please help me get them onto a single line.
Answer 0: (score: 1)
I would write it like this:
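awk '
BEGIN {FS = OFS = "|"}                # split and rejoin fields on "|"
{ key = $1 SUBSEP $2 }                # group records by the first two columns
!(key in lines) {lines[key]=$0; next} # first record for a key: keep the whole line
# later records: blank the key fields, strip the leading "||", then append the rest
{$1=$2=""; line=$0; sub(/^../, "", line); lines[key] = lines[key] FS line}
END {for (key in lines) {print lines[key]}}   # key output order is unspecified
' file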
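
One thing to keep in mind: awk does not guarantee the order in which `for (key in lines)` visits keys, so with more than one distinct key the merged lines may come out in a different order than the input. If that matters, a minimal sketch along the following lines records each key the first time it is seen. The 5-column sample data, the temporary file name, and the `order`, `n`, and `rest` names are made up for illustration; the real file has 74 columns.

```shell
#!/bin/sh
# Hypothetical 5-column sample standing in for the real 74-column file.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
CHECK_IN|2000000000|MS|N|123456
CHECK_OUT|3000000000|MR|Y|999999
CHECK_IN|2000000000|MS|Y|345676
EOF

merged=$(awk '
BEGIN { FS = OFS = "|" }
{
    key = $1 SUBSEP $2
    if (!(key in lines)) {
        order[++n] = key        # remember keys in first-seen order
        lines[key] = $0         # first record for this key: keep the whole line
        next
    }
    $1 = $2 = ""                # blank the key fields; $0 is rebuilt as "||rest"
    rest = substr($0, 3)        # drop the two leading separators
    lines[key] = lines[key] FS rest
}
END { for (i = 1; i <= n; i++) print lines[order[i]] }
' "$tmpdir/sample.txt")

printf '%s\n' "$merged"
rm -rf "$tmpdir"
```

On the sample above this prints:

CHECK_IN|2000000000|MS|N|123456|MS|Y|345676
CHECK_OUT|3000000000|MR|Y|999999

The merging logic is the same as in the answer; only the explicit `order` array is added, and `substr($0, 3)` is used in place of `sub(/^../, "", line)` to drop the two leading separators.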