I have a text file containing the following lines:
/user$ cat ORIGFILE
se832p41iEC.200289_EDI832I140401232506.txt
pt832p41iEC.213631_EDI832I140401232501.txt
xe832p41iEC.201687_EDI832I140401232512.txt
pt832p41iEC.213632_EDI832I140401232502.txt
se832p41iEC.200289_EDI832I140401232508.txt
se832p41iEC.200289_EDI832I140401232507.txt
xe832p41iEC.201687_EDI832I140401232513.txt
xe832p41iEC.201687_EDI832I140401232511.txt
If a session number occurs more than once (for example 200289), each group of duplicates should be written to its own file, like this:
/user$ cat se832p41iEC.200289
se832p41iEC.200289_EDI832I140401232506.txt
se832p41iEC.200289_EDI832I140401232507.txt
se832p41iEC.200289_EDI832I140401232508.txt
/user$ cat xe832p41iEC.201687
xe832p41iEC.201687_EDI832I140401232511.txt
xe832p41iEC.201687_EDI832I140401232512.txt
xe832p41iEC.201687_EDI832I140401232513.txt
/user$ cat NEWFILE
pt832p41iEC.213631_EDI832I140401232501.txt
pt832p41iEC.213632_EDI832I140401232502.txt
Thanks in advance.
Update: figured it out after a hint from @Jaypal (thanks, man):
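First - sort ORIGFILE | uniq -w 18 -u > NEWFILE
Second - sort ORIGFILE | uniq -w 18 -D > AWKFILE
Last - awk -F_ '{print $0 > $1}' AWKFILE
(-w 18 is a GNU uniq option that limits the comparison to the first 18 characters, i.e. the prefix before the underscore; without it uniq compares whole lines, and every line in this file is unique.)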
Answer 0 (score: 1)
Now that you have added your attempt, here is one way to do it with awk:
$ ls
file
$ cat file
se832p41iEC.200289_EDI832I140401232506.txt
pt832p41iEC.213631_EDI832I140401232501.txt
xe832p41iEC.201687_EDI832I140401232512.txt
pt832p41iEC.213632_EDI832I140401232502.txt
se832p41iEC.200289_EDI832I140401232508.txt
se832p41iEC.200289_EDI832I140401232507.txt
xe832p41iEC.201687_EDI832I140401232513.txt
xe832p41iEC.201687_EDI832I140401232511.txt
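$ awk -F_ '{
    # Collect every line under its prefix (the text before the first "_")
    a[$1] = (a[$1] ? a[$1] RS $0 : $0)
    # Count how many lines share that prefix
    b[$1]++
}
END {
    # Repeated prefixes get their own file; singletons go to NEWFILE
    for(x in a) print a[x] > (b[x]>1 ? x : "NEWFILE")
}' file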
$ ls
NEWFILE file se832p41iEC.200289 xe832p41iEC.201687
$ head *
==> NEWFILE <==
pt832p41iEC.213631_EDI832I140401232501.txt
pt832p41iEC.213632_EDI832I140401232502.txt
==> file <==
se832p41iEC.200289_EDI832I140401232506.txt
pt832p41iEC.213631_EDI832I140401232501.txt
xe832p41iEC.201687_EDI832I140401232512.txt
pt832p41iEC.213632_EDI832I140401232502.txt
se832p41iEC.200289_EDI832I140401232508.txt
se832p41iEC.200289_EDI832I140401232507.txt
xe832p41iEC.201687_EDI832I140401232513.txt
xe832p41iEC.201687_EDI832I140401232511.txt
==> se832p41iEC.200289 <==
se832p41iEC.200289_EDI832I140401232506.txt
se832p41iEC.200289_EDI832I140401232508.txt
se832p41iEC.200289_EDI832I140401232507.txt
==> xe832p41iEC.201687 <==
xe832p41iEC.201687_EDI832I140401232512.txt
xe832p41iEC.201687_EDI832I140401232513.txt
xe832p41iEC.201687_EDI832I140401232511.txt
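Note that the run above keeps the input order inside each group (506, 508, 507), whereas the question lists the lines sorted, and awk does not guarantee the iteration order of for (x in a). If sorted output is wanted, a minimal two-pass sketch could look like the following. It assumes the input is named ORIGFILE as in the question; ORIGFILE.sorted is a hypothetical scratch file name, and the here-document only recreates the sample data so the snippet is self-contained.

```shell
# Recreate the sample input from the question.
cat > ORIGFILE <<'EOF'
se832p41iEC.200289_EDI832I140401232506.txt
pt832p41iEC.213631_EDI832I140401232501.txt
xe832p41iEC.201687_EDI832I140401232512.txt
pt832p41iEC.213632_EDI832I140401232502.txt
se832p41iEC.200289_EDI832I140401232508.txt
se832p41iEC.200289_EDI832I140401232507.txt
xe832p41iEC.201687_EDI832I140401232513.txt
xe832p41iEC.201687_EDI832I140401232511.txt
EOF

# Sort once so that every output file ends up in sorted order.
# ORIGFILE.sorted is a hypothetical scratch file, removed at the end.
sort ORIGFILE > ORIGFILE.sorted

# The same file is read twice:
#   pass 1 (NR == FNR) counts the lines per prefix,
#   pass 2 writes each line to the file named after its prefix,
#          or to NEWFILE when the prefix occurs only once.
awk -F_ '
NR == FNR { count[$1]++; next }
{ print > (count[$1] > 1 ? $1 : "NEWFILE") }
' ORIGFILE.sorted ORIGFILE.sorted

rm ORIGFILE.sorted
```

Because the lines are written in the order they are read, this does not depend on the for-in iteration order and needs no GNU-specific options.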