Question

I have a file with the following contents:

onelab2.warsaw.rd.tp.pl    5
onelab3.warsaw.rd.tp.pl    5
lefthand.eecs.harvard.edu  7
righthand.eecs.harvard.edu 7
planetlab2.netlab.uky.edu  8
planet1.scs.cs.nyu.edu     9
planetx.scs.cs.nyu.edu     9

So each line ends with a number. I want only the first line for each number, so for the contents above I would like to get:

onelab2.warsaw.rd.tp.pl    5
lefthand.eecs.harvard.edu  7
planetlab2.netlab.uky.edu  8
planet1.scs.cs.nyu.edu     9

How can I do this? I would prefer a shell-script solution using awk, sed, etc.

Answer 1

This might work for you (GNU sort):

sort -nsuk2 file

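-k2 sorts on the second field, -n compares it numerically, -s makes the sort stable so lines with equal keys keep their original order, and -u removes duplicates (lines whose keys compare equal).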
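The bundled options can also be written out one by one, which may be easier to read. The following is a minimal sketch, assuming the sample data from the question is saved as `file` (recreated here with a heredoc for illustration). Writing the key as `-k2,2` is my own variation: it limits the key to the second field only, which should give the same result here because the number is the last column.

```shell
# Recreate the sample input from the question (the name "file" follows the answer).
cat > file <<'EOF'
onelab2.warsaw.rd.tp.pl    5
onelab3.warsaw.rd.tp.pl    5
lefthand.eecs.harvard.edu  7
righthand.eecs.harvard.edu 7
planetlab2.netlab.uky.edu  8
planet1.scs.cs.nyu.edu     9
planetx.scs.cs.nyu.edu     9
EOF

# -n      compare the key numerically
# -s      stable sort: equal keys keep their input order
# -u      keep only the first line of each run of equal keys
# -k2,2   use the second field only as the sort key
sort -n -s -u -k2,2 file
```

Note that the output is ordered by the number, not by the original line order, so this fits best when the input is already grouped by number as in the question.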

Answer 2

Using awk:

awk '{if(!a[$2]){a[$2]=1; print}}' file.dat

Explanation:

{
  # 'a' is a lookup table (associative array) holding every number
  # that has been printed so far. awk creates it as an empty array
  # on first use, so it needs no explicit initialization.
  # $2 is the second 'column' of the line, i.e. the number
  if(!a[$2]) 
  {
    # set index in the lookup table. This way the if statement will 
    # fail for the next line with the same number at the end
    a[$2]=1;
    # print the whole current line
    print
  }
}
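The same lookup-table logic is commonly written as a one-line awk pattern. This is a minimal sketch, assuming the sample data is saved as `file.dat` (recreated here with a heredoc); the array name `seen` is only an illustrative choice.

```shell
# Recreate the sample input from the question (the name file.dat follows the answer).
cat > file.dat <<'EOF'
onelab2.warsaw.rd.tp.pl    5
onelab3.warsaw.rd.tp.pl    5
lefthand.eecs.harvard.edu  7
righthand.eecs.harvard.edu 7
planetlab2.netlab.uky.edu  8
planet1.scs.cs.nyu.edu     9
planetx.scs.cs.nyu.edu     9
EOF

# seen[$2]++ yields 0 the first time a number appears and a positive
# count afterwards, so the negated pattern is true only for the first
# line of each number. With no action given, awk prints the line.
awk '!seen[$2]++' file.dat
```

Unlike the sort-based approaches, this keeps the lines in their original order and does not require the input to be grouped by number.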

Answer 3

Using sort and uniq:

sort -n -k2 input | uniq -f1

Answer 4

perl -ane 'print unless $a{$F[1]}++' file

Here -n loops over the input lines, -a splits each line into the array @F, and $F[1] is the second field. The line is printed only the first time that value is seen.

Get one line per distinct field value from a file using a shell script
