Question

I have a CSV file, and I change some of its columns after extracting them with the cut command.

123;bbb ;10.01.2010
456;ddd;11.01.2015
789;aaa;20.12.2010
222;ccc;15.10.2010

As an example, I take the second column, trim it, and sort it with the code below:

cut -f 2 -d ';' data.csv | sed 's/^[ \t]*//;s/[ \t]*$//' | sort
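For reference, a minimal sketch of what this pipeline yields on the sample rows. It assumes the rows are saved as data.csv in a scratch directory, and it swaps in `[[:space:]]` for `[ \t]`, because reading `\t` inside a bracket expression as a tab is a GNU sed extension rather than portable behavior.

```shell
# Recreate the sample rows in a scratch directory (data.csv is the name used in the question).
workdir=$(mktemp -d)
printf '%s\n' '123;bbb ;10.01.2010' '456;ddd;11.01.2015' \
              '789;aaa;20.12.2010' '222;ccc;15.10.2010' > "$workdir/data.csv"

# Extract column 2, strip leading and trailing whitespace, then sort.
sorted_col2=$(cut -d ';' -f 2 "$workdir/data.csv" |
              sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | sort)
printf '%s\n' "$sorted_col2"
```

This prints aaa, bbb, ccc and ddd on separate lines: the values that should end up in column 2, top to bottom.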

How can I overwrite the column in the file with the new values, so that the same file ends up looking like this?

123;aaa;10.01.2010
456;bbb;11.01.2015
789;ccc;20.12.2010
222;ddd;15.10.2010

Answer 1

Input

$ cat f
123;bbb ;10.01.2010
456;ddd;11.01.2015
789;aaa;20.12.2010
222;ccc;15.10.2010

Using cut, tr, sort and paste

$ paste -d ';' <(cut -f 1 -d ';' f) <(cut -f 2 -d ';' f | tr -d ' ' | sort) <(cut -f 3 -d ';' f)
123;aaa;10.01.2010
456;bbb;11.01.2015
789;ccc;20.12.2010
222;ddd;15.10.2010

Using cut, tr, sort and pr

$ pr -mtJs';' <(cut -f 1 -d ';' f) <(cut -f 2 -d ';' f | tr -d ' ' | sort) <(cut -f 3 -d ';' f)
123;aaa;10.01.2010
456;bbb;11.01.2015
789;ccc;20.12.2010
222;ddd;15.10.2010
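The cut-based pipelines only print to standard output, and redirecting them straight back into the input (`... f > f`) would not work, because the shell truncates f before anything reads it. One possible way to write the result back is sketched below. The helper name `sort_col2_inplace` is hypothetical, the sketch assumes every row has at least three `;`-separated columns, and it uses scratch files instead of process substitution so that it also runs in a plain POSIX sh.

```shell
# Hypothetical helper (not from the original answer): rewrite FILE so that
# column 2 holds its trimmed, sorted values while columns 1 and 3+ keep their order.
# The body runs in a subshell, so its variables do not leak into the caller.
sort_col2_inplace() (
    file=$1
    dir=$(mktemp -d) || exit 1
    cut -d ';' -f 1  "$file" > "$dir/c1"
    cut -d ';' -f 2  "$file" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | sort > "$dir/c2"
    cut -d ';' -f 3- "$file" > "$dir/c3"
    # Assemble the result in a scratch file and copy it over the original
    # only if paste succeeded.
    paste -d ';' "$dir/c1" "$dir/c2" "$dir/c3" > "$dir/out" && cat "$dir/out" > "$file"
    status=$?
    rm -rf "$dir"
    exit "$status"
)

# Demo on a scratch copy of the sample data.
demo=$(mktemp)
printf '%s\n' '123;bbb ;10.01.2010' '456;ddd;11.01.2015' \
              '789;aaa;20.12.2010' '222;ccc;15.10.2010' > "$demo"
sort_col2_inplace "$demo"
cat "$demo"
```

Copying with `cat ... > "$file"` rather than `mv` keeps the original file's permissions and inode; `mv` would be the choice if an atomic replacement matters more.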

Using gawk (recommended)

$ awk  'BEGIN{FS=OFS=";"}FNR==NR{sub(/ +/,"",$2);a[$2];next}FNR==1{asorti(a,b)}{$2=b[FNR]}1' f f
123;aaa;10.01.2010
456;bbb;11.01.2015
789;ccc;20.12.2010
222;ddd;15.10.2010

Explanation (the same file is read twice)

awk  '# START SCRIPT

      BEGIN{
            FS=OFS=";"          # Set the input and output field separators
      }

      # NR counts records across all files, FNR counts records in the
      # current file. The two are equal only while the first file is
      # being read, so this block runs only on the first pass.

      FNR==NR{
           # Remove the first run of spaces from field 2
           sub(/ +/,"",$2)

           # Use field 2 as an index of array "a" (only the key matters,
           # so duplicate values collapse into one entry)
           a[$2]

           # Move on to the next record so that the rules below, meant
           # for the second pass, are skipped.
           next
      }
      # On the first record of the second pass over the file
      FNR==1{
           # asorti() is a gawk extension: it sorts the indices of "a"
           # (hence the "i") and stores them in b[1], b[2], ...
           asorti(a,b)
      }
      {
         # Replace field 2 with the FNR-th sorted value
         $2=b[FNR]

      }1    # The trailing 1 is an always-true pattern; its default action prints $0

   ' f f    # pass the same input file twice
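The same FNR==NR idiom also works without asorti(), which only gawk provides. The sketch below is a swapped-in variant, not the method above: it sorts column 2 with sort(1), feeds that list to awk as the first file, and keeps the values in a numerically indexed array, so repeated values are kept. The scratch directory and the file name `sorted` are assumptions made for illustration.

```shell
# Recreate the sample input f in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '123;bbb ;10.01.2010' '456;ddd;11.01.2015' \
              '789;aaa;20.12.2010' '222;ccc;15.10.2010' > "$workdir/f"

# Build the replacement values for column 2 (trimmed and sorted), one per line.
cut -d ';' -f 2 "$workdir/f" |
    sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | sort > "$workdir/sorted"

# FNR==NR holds only while awk reads the first file (sorted): store line N as v[N].
# For the second file (f), replace field 2 of record N with v[N], then print.
result=$(awk 'BEGIN { FS = OFS = ";" }
              FNR == NR { v[FNR] = $0; next }
              { $2 = v[FNR] }
              1' "$workdir/sorted" "$workdir/f")
printf '%s\n' "$result"
```

On the sample data this prints the four rows asked for in the question. Because the values are indexed by line number rather than by content, two rows sharing the same column-2 value each keep their own slot.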
