Question

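I have a tab-delimited file with n rows and m columns. I want to print the first three columns, then search for a pattern and print the column it matches, if one exists. I tried searching and printing with sed, but I could not print the first three columns and then search for the pattern.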

For example, I have a file like this:

col1    col2    col3    col4    col5    col6
test    23      2323    32      635     36354
test2   354     35b     345     345     555
test4   486     g4      435     0.43    34
test5   0.6     35      0.34    0.234   34563

The output I want (for example, if the pattern I search for is 'col6'):

col1    col2    col3    col6
test    23      2323    36354
test2   354     35b     555
test4   486     g4      34
test5   0.6     35      34563

Answer 1

While awk is reading the first line, you can loop over the fields to find out which one holds col6:

NR==1 {
        for (i=1; i<=NF; i++)
                if ($i == "col6")
                        column=i
}
{
        print $1, $2, $3, column ? $column : ""
}

What does it do?

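- NR==1: if the number of the record (line) currently being read is 1, iterate over all NF fields.
- if ($i == "col6"): if the current field equals the string we are searching for, save its index in the variable column.
- print $1, $2, $3, column ? $column : "": print the first three fields. The field at index column is printed only if column is set; otherwise an empty string "" is printed.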

Example

$ awk 'NR==1{ for (i=1; i<=NF; i++)if ($i == "col6") column=i}{print $1, $2, $3, column ? $column : ""}' file
col1 col2 col3 col6
test 23 2323 36354
test2 354 35b 555
test4 486 g4 34
test5 0.6 35 34563
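The same idea can be parameterized so that neither the column count nor the header name is hard-coded. The following is only a sketch: the function name print_cols and the awk variables n, name, col and line are made up for illustration, and it assumes n is at least 1 and that fields are split on whitespace as in the command above.

```shell
# Hypothetical helper (not from the original answer):
#   print_cols N HEADER FILE
# prints the first N columns plus the column whose header equals HEADER.
print_cols() {
    awk -v n="$1" -v name="$2" '
    NR == 1 {
        # Scan the header line for the requested column name.
        for (i = 1; i <= NF; i++)
            if ($i == name) col = i
    }
    {
        # Build the output line from the first n fields (assumes n >= 1).
        line = $1
        for (i = 2; i <= n; i++) line = line OFS $i
        # Append the matched column only if the header was found.
        if (col) line = line OFS $col
        print line
    }' "$3"
}
```

Calling print_cols 3 col6 file should behave like the one-liner above. If the matched column is already among the first n, it is printed twice; whether that is desirable depends on the use case.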

Answer 2

cat /your/file | awk 'NR<=3 {print $0}' | grep 'your-pattern'

Print the first three lines (NR counts rows, not columns):

awk 'NR<=3 {print $0}' # if file has headers , NR<=4

Then search for the pattern and print the matching lines, if any:

grep 'foo'

Answer 3

In awk:

$ awk -v p='col6' -F'\t' '
NR==1 {                                 # on the first record
    split($0,a,FS);                     # split header to array a
    for(i in a)                         # search for index of field with p
        if(a[i]==p)
            c=i                         # set it to c
} 
$0=$1 OFS $2 OFS $3 ( c ? OFS $c : "" ) # print $1-$3 and $c if it exists
' foo 
col1    col2    col3    col6
test    23      2323    36354
test2   354     35b     555
test4   486     g4      34
test5   0.6     35      34563

If you also want the output to be tab-delimited, add -v OFS='\t' to the command line.
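As a rough sketch of what that looks like, the command can be wrapped in a shell function with the tab output separator set. The function name tab_cols is hypothetical, and this variant uses an explicit print rather than the assignment-as-pattern trick above, so a line is printed even if it evaluates to 0 or is empty.

```shell
# Hypothetical wrapper (not from the original answer):
#   tab_cols HEADER FILE
# prints columns 1-3 plus the column named HEADER, tab-separated.
tab_cols() {
    awk -v p="$1" -F'\t' -v OFS='\t' '
    NR == 1 {
        split($0, a, FS)          # split the header into array a
        for (i in a)              # look for the field whose name is p
            if (a[i] == p)
                c = i             # remember its index
    }
    {
        # Print $1-$3, followed by $c when the header was found.
        print $1, $2, $3 (c ? OFS $c : "")
    }' "$2"
}
```

For instance, tab_cols col6 foo would be expected to produce the four columns shown above, separated by tabs.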

How to print the first n columns and the column matching a pattern?
