Question

Even though I have used grep, uniq and sort before, I still can't figure out how to solve this. I would appreciate any hints on how to crack it :)

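I want to keep only the lines of an input file that are unique in their first 6 characters, and get output like the one shown below. I don't know whether I need uniq, grep, or awk; maybe someone can help me with this.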

My file looks like this:

Field1     Field2    Field3
value1   some_stuff  something
value2   another     fake  
value1   fake        value    
value3   blah        blah
value2   blah        fake 


Preferred output:

Field1    Field2    Field3
value1   some_stuff something
value2   another    fake
value3   blah       blah

Answer 1

Could you please try the following:

awk 'FNR==1{print;next} !a[substr($0,1,6)]++' Input_file

Explanation: a commented version of the code above.

awk '
FNR==1{                     ##If this is the first line of the file, do the following.
  print                     ##Print the current line, which is the header line.
  next                      ##Skip the remaining rules for this line and move on to the next line.
}                           ##End of the header block.
!a[substr($0,1,6)]++        ##Use the first 6 characters as an index into array a; the test is true only the first time a key is seen, then its count is incremented.
                            ##awk rules have the form pattern { action }; no action is given here, so the default action (print the line) runs.
'  Input_file               ##Mentioning Input_file name here.

If your first field is only 6 characters long, you can use the following instead.

awk '!a[$1]++' Input_file
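
The two commands only agree when the first field is exactly 6 characters long. Here is a minimal sketch of where they differ, using made-up data (the values value10 and value11 are not from the question, they are just an assumption for illustration):

```shell
# Hypothetical data: the first fields differ only after the sixth character.
data='value10 x y
value11 x y
value10 z z'

# Keyed on the first 6 characters: every line starts with "value1",
# so only the first line survives.
by_prefix=$(printf '%s\n' "$data" | awk '!a[substr($0,1,6)]++')

# Keyed on the whole first field: value10 and value11 are kept separately.
by_field=$(printf '%s\n' "$data" | awk '!a[$1]++')

printf 'by prefix:\n%s\n\nby field:\n%s\n' "$by_prefix" "$by_field"
```

So pick substr($0,1,6) if the key really is "the first 6 characters", and $1 if the key is the first whitespace-separated column.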

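About the !a[$1]++ part: basically, it checks whether the first-column value was already stored in the array (called a here) while processing an earlier line.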

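If it was (a[$1] != 0), the line is not printed. Otherwise the line is printed and the key is stored (a[$1]++, i.e. a[$1] = a[$1] + 1, so a[$1] becomes 1) for the lines that follow. See this Unix answer.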
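
If the idiom is still hard to read, it may help to see it written out the long way. This is only a sketch: the file name sample.txt is an assumption, and the data is the sample from the question with the spacing simplified.

```shell
# Create a small sample file (hypothetical name: sample.txt).
printf '%s\n' \
  'Field1 Field2 Field3' \
  'value1 some_stuff something' \
  'value2 another fake' \
  'value1 fake value' \
  'value3 blah blah' \
  'value2 blah fake' > sample.txt

# Long-form equivalent of: awk '!a[$1]++' sample.txt
result=$(awk '{
  if (a[$1] == 0) {   # key not seen yet (an unset element compares equal to 0)
    print             # print the whole line
  }
  a[$1]++             # remember that this key has now been seen
}' sample.txt)

printf '%s\n' "$result"
```

The header line goes through the same test as every other line here; it is printed simply because Field1 is itself a key that has not been seen before.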
