Question

I am trying to filter 'duplicate' results out of a file.

I have a file that looks like this:

7 14 35 35 4 23
23 53 85 27 49 1
35 4 23 27 49 1
....

Mentally, I split each line into item 1 and item 2. Item 1 is the first 3 numbers on each line, and item 2 is the last 3 numbers on each line.

I also have a list of 'items':

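At some point in that list, let's say line 3 (the number is arbitrary, it is just an example), the 'items' can be separated into two groups. Let's say lines 1 and 2 are red, and lines 3 and 4 are blue.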

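In my original file I want to make sure there is no red red or blue blue, only red blue or blue red, while keeping the original numbers. Ideally, the file would go from: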

7 14 35 35 4 23 (red blue)
23 53 85 27 49 1 (red blue)
35 4 23 27 49 1 (blue blue)
....

to

7 14 35 35 4 23 (red blue)
23 53 85 27 49 1 (red blue)
....

I am having a hard time coming up with a good (or any) way to do this. Any help is appreciated.

EDIT:

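I have a filtering script that grabs the lines that contain a blue item and, of those, the lines that also contain a red item: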

#!/bin/bash

while read name; do
  grep "$name" Twoitems
done < Itemblue > filtered

while read name2; do
  grep "$name2" filtered
done < Itemred > double_filtered

EDIT2:

Example of the input items file:

Answer 1

Let's say file1 contains

7 14 35 35 4 23
23 53 85 27 49 1
35 4 23 27 49 1

and file2 contains the list of items, one item (three numbers) per line.

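Then you can use a hash that maps each item to a color based on the line-number cutoff, and use that hash while reading the first file: split each line after the third space and keep the line only if its two halves have different colors.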

I think you need something like the script below. Feel free to modify it to fit your requirements.

#!/usr/bin/perl
use strict;
use warnings;

#declare a global hash to keep track of line and colors
my %color;

#open both the files     
open my $fh1, '<', 'file1' or die "unable to open file1: $! \n";
open my $fh2, '<', 'file2' or die "unable to open file2: $! \n";

#iterate over the second file and store each line (item) as
#red or blue in the hash, based on its line number
while (<$fh2>) {
    chomp;
    if ($. <= 2) {
        $color{$_} = "red";
    }
    else {
        $color{$_} = "blue";
    }
}
#close second file
close($fh2);

#iterate over first file
while (<$fh1>) {
    chomp;
    #split the line after the 3rd number and its trailing space
    my ($part1, $part2) = split /(?:\d+\s){3}\K/;
    #remove the trailing space left on the first part
    $part1 =~ s/\s+$//;
    #print if $part1 and $part2 do not belong to the same color
    print "$_\n" if ($color{$part1} ne $color{$part2});
}
#close first file
close($fh1);
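
For comparison, the same hash idea can be sketched in awk from the shell. This is only a sketch under assumptions, not a replacement for the script above: the contents of file2 are hypothetical (the items list is not shown in the question), lines 1 and 2 of file2 are treated as red and the rest as blue, and the temporary directory and the out.txt name are made up for the demo.

```shell
#!/bin/bash
# Sketch only: the sample data below is assumed, not taken from the post.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file1: the lines to filter (as shown in the question).
printf '%s\n' '7 14 35 35 4 23' '23 53 85 27 49 1' '35 4 23 27 49 1' > file1
# file2: hypothetical items list; lines 1-2 are "red", lines 3+ are "blue".
printf '%s\n' '7 14 35' '23 53 85' '35 4 23' '27 49 1' > file2

awk '
    # First file on the command line (file2): map each item to a color.
    NR == FNR { color[$0] = (FNR <= 2) ? "red" : "blue"; next }
    # Second file (file1): rebuild both halves and compare their colors.
    {
        first  = $1 " " $2 " " $3
        second = $4 " " $5 " " $6
        if (color[first] != color[second]) print
    }
' file2 file1 > out.txt

cat out.txt
```

With this sample data, out.txt keeps the two red blue lines and drops the blue blue line. One difference in this sketch: a line whose two halves are both missing from file2 is dropped silently, because both lookups yield an empty string.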

Answer 2

This is quite easy using grep with the -f option.

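First, generate four 'pattern' files from your items file. I am using AWK here, but you could just as well use Perl or something else. Following your example, I put the 'split' between lines 2 and 3; adjust as necessary.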

awk 'NR <= 2 {print "^" $0 " "}' items.txt > starts_red.txt
awk 'NR <= 2 {print " " $0 "$"}' items.txt > ends_red.txt

awk 'NR >= 3 {print "^" $0 " "}' items.txt > starts_blue.txt
awk 'NR >= 3 {print " " $0 "$"}' items.txt > ends_blue.txt
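
To make the effect of these commands visible, here is a small sketch that prints two of the generated pattern files. The items.txt contents are an assumption (the items list is not shown in the question), and the temporary directory is only there for the demo. Each pattern is wrapped in square brackets by sed so the leading and trailing spaces can be seen.

```shell
#!/bin/bash
# Sketch only: the items.txt contents are assumed for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical items list: lines 1-2 are red, lines 3-4 are blue.
printf '%s\n' '7 14 35' '23 53 85' '35 4 23' '27 49 1' > items.txt

awk 'NR <= 2 {print "^" $0 " "}' items.txt > starts_red.txt
awk 'NR >= 3 {print " " $0 "$"}' items.txt > ends_blue.txt

# Wrap each pattern in [ ] so the surrounding spaces are visible.
sed 's/.*/[&]/' starts_red.txt
# prints: [^7 14 35 ]
#         [^23 53 85 ]
sed 's/.*/[&]/' ends_blue.txt
# prints: [ 35 4 23$]
#         [ 27 49 1$]
```

The anchor plus the extra space is what keeps an item such as 7 14 35 from matching inside 17 14 35 or 7 14 350: the pattern has to start at the beginning of the line (or end at the end of it) and be bounded by a space on the other side.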

Next, use the pattern files (grep option -f) in a pipeline to filter the matching lines out of the input file.

grep -f starts_red.txt  input.txt | grep -f ends_blue.txt > red_blue.txt
grep -f starts_blue.txt input.txt | grep -f ends_red.txt  > blue_red.txt

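Finally, concatenate the two output files. Of course, you could also use >> to have the second grep pipeline append its output to the first output file.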
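
Putting the steps together, an end-to-end run might look like the sketch below. The sample data is assumed: items.txt is a hypothetical items list, the fourth line of input.txt (a blue red line) is made up to exercise the second pipeline, and filtered.txt is just a name chosen for the combined output.

```shell
#!/bin/bash
# Sketch only: sample files are assumptions, created in a temp directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical items list: lines 1-2 are red, lines 3-4 are blue.
printf '%s\n' '7 14 35' '23 53 85' '35 4 23' '27 49 1' > items.txt
# Input lines from the question, plus one made-up blue red line.
printf '%s\n' '7 14 35 35 4 23' '23 53 85 27 49 1' \
              '35 4 23 27 49 1' '27 49 1 7 14 35' > input.txt

# Step 1: build the four pattern files (split between lines 2 and 3).
awk 'NR <= 2 {print "^" $0 " "}' items.txt > starts_red.txt
awk 'NR <= 2 {print " " $0 "$"}' items.txt > ends_red.txt
awk 'NR >= 3 {print "^" $0 " "}' items.txt > starts_blue.txt
awk 'NR >= 3 {print " " $0 "$"}' items.txt > ends_blue.txt

# Step 2 and 3: red blue lines first, then append the blue red lines.
grep -f starts_red.txt  input.txt | grep -f ends_blue.txt >  filtered.txt
grep -f starts_blue.txt input.txt | grep -f ends_red.txt  >> filtered.txt

cat filtered.txt
```

With this data, filtered.txt holds the two red blue lines followed by the blue red line; the blue blue line is gone. Note that the result is grouped by color combination, so the lines are not necessarily in their original order.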

Filtering an input file
