Question

I have the following file content:

T12 19/11/19 2000 
T12 18/12/19 2040 

T15 19/11/19 2000 
T15 18/12/19 2080

How can I get the following output using awk, bash, etc.? I searched for similar examples but have not found one so far:

T12 
19/11/19 2000 
18/12/19 2040 

T15 
19/11/19 2000 
18/12/19 2080

Thanks, S

Answer 1

Could you please try the following. This code prints the output in the order in which the first field first appears in Input_file.

awk '
!a[$1]++ && NF{
  b[++count]=$1
}
NF{
  val=$1
  $1=""
  sub(/^ +/,"")
  c[val]=(c[val]?c[val] ORS:"")$0
}
END{
  for(i=1;i<=count;i++){
    print b[i] ORS c[b[i]]
  }
}
'  Input_file

The output will be as follows.

T12
19/11/19 2000
18/12/19 2040
T15
19/11/19 2000
18/12/19 2080

Explanation: a detailed explanation of the above code follows.

awk '                                  ##Start of the awk program.
!a[$1]++ && NF{                        ##If $1 has not been seen before (not yet in array a) and the line is not empty, do the following.
  b[++count]=$1                        ##Store $1 in array b under the increasing index count, recording keys in order of first appearance.
}                                      ##End of this block.
NF{                                    ##If the line is not empty, do the following.
  val=$1                               ##Save the first field of the current line in variable val.
  $1=""                                ##Empty the first field of the current line.
  sub(/^ +/,"")                        ##Remove the leading space(s) left at the start of the line.
  c[val]=(c[val]?c[val] ORS:"")$0      ##Append the rest of the line to c[val], joining entries with ORS.
}                                      ##End of this block.
END{                                   ##Start of the END block, which runs after all input has been read.
  for(i=1;i<=count;i++){               ##Loop from i=1 up to the value of count.
    print b[i] ORS c[b[i]]             ##Print the key b[i], then ORS, then the lines collected for that key.
  }
}                                      ##End of the END block.
'  Input_file                          ##Name of the input file.
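
Note that the output shown above has no empty line between the groups, while the question asks for one. A minimal sketch of one way to add it, assuming the same sample data (fed through a shell variable named sample, which is only an illustration) and an extra print "" before every group except the first:

```shell
# Sample data from the question; the variable name "sample" is arbitrary.
sample='T12 19/11/19 2000
T12 18/12/19 2040

T15 19/11/19 2000
T15 18/12/19 2080'

out=$(printf '%s\n' "$sample" | awk '
!a[$1]++ && NF{ b[++count]=$1 }        # remember keys in first-seen order
NF{
  val=$1; $1=""; sub(/^ +/,"")         # strip the key from the line
  c[val]=(c[val]?c[val] ORS:"")$0      # append the rest to its group
}
END{
  for(i=1;i<=count;i++){
    if(i>1) print ""                   # empty separator line between groups
    print b[i] ORS c[b[i]]
  }
}
')
printf '%s\n' "$out"
```

With the sample data this should print the T12 group, one empty line, and then the T15 group.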

Answer 2

Here is a short one-liner:

$ awk 'BEGIN{RS="";ORS="\n\n"}{printf "%s\n",$1; gsub($1" +",""); print}' file

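How does it work? Awk knows the concepts of records and fields.

Files are split into records, where consecutive records are separated by the record separator RS. Each record is split into fields, where consecutive fields are separated by the field separator FS.

By default, the record separator RS is set to the <newline> character (\n), so each record is a line. The record separator has the following definition: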

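RS: The first character of the string value of RS shall be the input record separator; a <newline> by default. If RS contains more than one character, the results are unspecified. If RS is null, then records are separated by sequences consisting of a <newline> plus one or more blank lines, leading or trailing blank lines shall not result in empty records at the beginning or end of the input, and a <newline> shall always be a field separator, in addition to whatever the value of FS is.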

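So, given the file format you provided, we can define the records with RS="".

By default, the field separator is set to any sequence of blanks. Hence $1 points to the particular word that we want on a separate line. So we print it with printf and then remove every occurrence of it with gsub.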
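
To see the effect of RS="" on its own, here is a small sketch, assuming the sample data from the question is held in a shell variable (the name sample is only an illustration). It prints the record number, the field count, and $1 for each record:

```shell
# Sample data from the question; the variable name "sample" is arbitrary.
sample='T12 19/11/19 2000
T12 18/12/19 2040

T15 19/11/19 2000
T15 18/12/19 2080'

# With RS="" each blank-line-separated block is one record, and
# newlines inside a block act as field separators as well.
records=$(printf '%s\n' "$sample" |
  awk 'BEGIN{RS=""} {print "record " NR ": " NF " fields, $1=" $1}')
printf '%s\n' "$records"
```

Each block of two lines becomes a single record with six fields, and $1 is the group key (T12 for the first record, T15 for the second).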

Answer 3

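awk is very flexible and provides a number of ways to solve the same problem. The answers you have already received are very good. Another way to approach the problem is to simply keep a single variable that holds the current field 1 as its value (unset by default). When the first field changes, output the first field as the current heading. For every non-empty record, output the 2nd and 3rd fields. If a blank line is encountered, simply output a newline.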

awk -v h= '
    NF < 3 {print ""; next}
    $1 != h {h=$1; print $1}
    {printf "%s %s\n", $2, $3}
' file

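The above consists of 3 rules. If the line is empty (checked as having fewer than three fields, NF < 3), a newline is output and processing skips to the next record. The second rule checks whether the first field differs from the current heading variable h; if it does, h is set to the new heading and the heading is output. All non-empty records output their 2nd and 3rd fields.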

Result

Simply paste the command above at the command line and you will get the desired result.
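
As a self-contained check, the same three rules can be run against the sample data from the question without creating a file. This is only a sketch: the shell variables sample and result are illustrative names, and the data is piped to awk on standard input instead of being read from file:

```shell
# Sample data from the question; the variable name "sample" is arbitrary.
sample='T12 19/11/19 2000
T12 18/12/19 2040

T15 19/11/19 2000
T15 18/12/19 2080'

result=$(printf '%s\n' "$sample" | awk -v h= '
    NF < 3 {print ""; next}             # blank (or short) line: emit an empty line
    $1 != h {h=$1; print $1}            # first field changed: print it as a heading
    {printf "%s %s\n", $2, $3}          # every data line: print fields 2 and 3
')
printf '%s\n' "$result"
```

This should print T12 followed by its two lines, an empty line, and then T15 followed by its two lines.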

Add the unique value from the first column before each group
