Question

I have data of the following form:

 1  abc    xyz   -    -    2   mno
 2  lnm    dse   -    -    3   pqr
 3  ebe    aaa   xhd  asw  4   pow
 4  abc    fww   wrw  ffp  3   ffw

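I want to delete the rows that meet both of the following conditions:

The 4th & 5th columns are blank
The row's number does not appear in the 6th column of any other row

In this case, row 1 should be deleted. How can I do this with sed / awk, or whichever scripting language is most suitable?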

Answer 1

Something like this might work:

awk 'NR==FNR{a[$6];next} 
($4 ~ /[- ]/ && $5 ~ /[- ]/) && !($1 in a){next}1' file file

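Condition:

If Column 4 and Column 5 are blank AND the index is not present in Column 6, we skip the line; everything else is printed.

Explanation:

We use the NR and FNR built-in variables and pass the same file twice. On the first pass we scan the file and store Column 6 as keys of an array. next keeps the second pattern{action} statement from running until the first copy of the file has been read completely. Once that pass is done, we test the same file against your conditions. If columns 4 and 5 are blank, we look up the index (column 1); if it is not in the array we skip the line with next, otherwise we print it.

Test: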

[jaypal:~/Temp] cat file
 1  abc    xyz   -    -    2   mno
 2  lnm    dse   -    -    3   pqr
 3  ebe    aaa   xhd  asw  4   pow
 4  abc    fww   wrw  ffp  3   ffw

[jaypal:~/Temp] awk 'NR==FNR{a[$6];next} ($4 ~ /[- ]/ && $5 ~ /[- ]/) && !($1 in a){next}1' file file
 2  lnm    dse   -    -    3   pqr
 3  ebe    aaa   xhd  asw  4   pow
 4  abc    fww   wrw  ffp  3   ffw
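
For readers who find the one-liner dense, here is a commented sketch of the same two-pass idea. It is a variant, not the exact command above: it compares columns 4 and 5 against "-" exactly instead of using the regex /[- ]/, which would also match any field that merely contains a hyphen. The file name sample.txt and the function name filter_rows are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample file reproducing the data from the question.
cat > sample.txt <<'EOF'
 1  abc    xyz   -    -    2   mno
 2  lnm    dse   -    -    3   pqr
 3  ebe    aaa   xhd  asw  4   pow
 4  abc    fww   wrw  ffp  3   ffw
EOF

# filter_rows is a hypothetical helper name, not part of the original answer.
filter_rows() {
    awk '
        # Pass 1 (NR == FNR holds only while the first copy is being read):
        # remember every value seen in column 6.
        NR == FNR { seen[$6]; next }

        # Pass 2: skip rows whose columns 4 and 5 are exactly "-"
        # and whose row number (column 1) was never seen in column 6.
        $4 == "-" && $5 == "-" && !($1 in seen) { next }

        # Print everything else unchanged.
        { print }
    ' "$1" "$1"
}

filter_rows sample.txt
```

On the sample data this should drop row 1 and keep rows 2 to 4, matching the test output above. The file is passed twice on purpose, since the first read only collects column 6.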

Answer 2

A possible solution using perl:

Contents of script.pl:

use warnings;
use strict;

## Accept one argument, the input file.
@ARGV == 1 or die qq[Usage: perl $0 input-file\n];

my ($lines, %hash);

## Process file.
while ( <> ) {
        ## Remove leading and trailing spaces for each line.
        s/^\s*//;
        s/\s*$//;

        ## Get both indexes.
        my ($idx1, $idx2) = (split)[0,5];

        ## Save line and index1.
        push @{$lines}, [$_, $idx1];

        ## Save index2.
        $hash{ $idx2 } = 1;
}

## Process file for second time.
for ( @{$lines} ) {

        ## Get fields of the line.
        my @f = split /\s+/, $_->[0];

        ## If fourth and fifth fields are empty (-) and first index does not exist
        ## as a second index, go to next line without printing.
        if ( $f[3] eq qq[-] && $f[4] eq qq[-] && ! exists $hash{ $_->[1] } ) {
                next;
        }

        ## Print line.
        printf qq[%s\n], $_->[0];
}

Run the script (infile contains the data to process):

perl script.pl infile

Result:

2  lnm    dse   -    -    3   pqr
3  ebe    aaa   xhd  asw  4   pow
4  abc    fww   wrw  ffp  3   ffw

Answer 3

This might work for you:

sed -rn 's/^.*(\S+)\s+\S+$/\1/;H;${x;s/^|\n/:/gp}' file | 
sed -r '1{h;d};/^(\s*\S*){3}\s*-\s*-/{G;/^\s*(\S*).*:\1:/!d;s/\n.*//}' - file
 2  lnm    dse   -    -    3   pqr
 3  ebe    aaa   xhd  asw  4   pow
 4  abc    fww   wrw  ffp  3   ffw

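Explanation:

Read the file and build a lookup table of the column 6 values, delimited by :
Read the table (the first line of input) into the hold space (HS), then read the file again.
When columns 4 and 5 contain only -:
- Append the lookup table to the pattern space (PS)
- Look up the first column as the key; if the lookup fails, delete the line.
- For all remaining lines, remove the lookup table.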
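
To see what the first sed command hands to the second, it can be run on its own. This is only a sketch, assuming GNU sed (for -r and \S); sample.txt and build_table are hypothetical names used for illustration. One thing worth knowing: because the leading .* is greedy, the capture group keeps only the last character of column 6, so the table is only reliable when the indexes are single digits.

```shell
#!/bin/sh
# Hypothetical sample file reproducing the data from the question.
cat > sample.txt <<'EOF'
 1  abc    xyz   -    -    2   mno
 2  lnm    dse   -    -    3   pqr
 3  ebe    aaa   xhd  asw  4   pow
 4  abc    fww   wrw  ffp  3   ffw
EOF

# build_table is a hypothetical helper name wrapping stage 1 of the pipeline.
# s/.../\1/  reduces each line to the captured part of column 6
# H          appends that value to the hold space
# ${x;...p}  on the last line, swaps the hold space in, turns the
#            newlines into colons, and prints the resulting table
build_table() {
    sed -rn 's/^.*(\S+)\s+\S+$/\1/;H;${x;s/^|\n/:/gp}' "$1"
}

build_table sample.txt
```

The output should be a single line of colon-separated column 6 values. The second sed command receives that line as its first input line (via -), stores it in the hold space, and then matches :key: against it for each row whose columns 4 and 5 are -.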
