Question

我有一个文件，其中包含单词之间的映射。我必须引用该文件，并将这些单词替换为某些文件中的映射单词。例如，下面的文件具有映射的单词表，如

1.12.2.4               1
1.12.2.7               12
1.12.2.2               5
1.12.2.4               4
1.12.2.6               67
1.12.2.12              5

我将有许多具有这些关键词的文件（1.12.2。*）。我想搜索这些关键词，并将这些词替换为从该文件中获取的相应映射。如何在shell中执行此操作。假设一个文件包含以下行说

The Id of the customer is 1.12.2.12. He is from Grg. 
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.

执行脚本后，数字“1.12.2.12”和“1.12.2.4”应替换为5和4（从主文件中引用）。任何人都可以帮助我吗？

Answer 1

使用GNU awk的一种方式：

awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub(i, array[i]) }1' master.txt file.txt

结果：

The Id of the customer is 5. He is from Grg.
The Name of the machine is ASB
The id is 4. He is from Psg.

要将输出保存到文件：

awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub(i, array[i]) }1' master.txt file.txt > name_of_your_output_file.txt

<强>解释

FNR==NR { ... }   # FNR is the current record number, NR is the record number
                  # so FNR==NR simply means: "while we process the first file listed
                  # in this case it's "master.txt"
array[$1]=$2      # add column 1 to an array with a value of column 2
next              # go onto the next record

{                 # this could be written as: FNR!=NR
                  # so this means "while we process the second file listed..."
for (i in array)  # means "for every element/key in the array..."
gsub(i, array[i]) # perform a global substitution on each line replacing the key
                  # with it's value if found
}1                # this is shorthand for 'print'

添加字边界会使匹配更加严格：

awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub("\\<"i"\\>", array[i]) }1' master.txt file.txt

Answer 2

您可以sed为您编写sed脚本：

映射：

cat << EOF > mappings
1.12.2.4               1
1.12.2.7               12
1.12.2.2               5
1.12.2.4               4
1.12.2.6               67
1.12.2.12              5
EOF

输入文件：

cat << EOF > infile
The Id of the customer is 1.12.2.12. He is from Grg. 
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.
EOF

根据映射生成脚本（GNU sed）：

sed -r -e 's:([^ ]*) +(.*):s/\\b\1\\b/\2/g:' mappings

输出：

s/\b1.12.2.4\b/1/g
s/\b1.12.2.7\b/12/g
s/\b1.12.2.2\b/5/g
s/\b1.12.2.4\b/4/g
s/\b1.12.2.6\b/67/g
s/\b1.12.2.12\b/5/g

使用其他sed（GNU sed）进行评估：

sed -r -e 's:([^ ]*) +(.*):s/\\b\1\\b/\2/g:' mappings | sed -f - infile

输出：

The Id of the customer is 5. He is from Grg. 
The Name of the machine is ASB
The id is 1. He is from Psg.

请注意，映射被视为正则表达式，例如点（.）可以表示任何字符，可能需要在映射文件中或在生成sed脚本时转义。

Answer 3

由于你没有提供任何例子，我想这就是你想要的：

输入文件

> cat temp
1.12.2.4  1
1.12.2.7  12
1.12.2.2  5
1.12.2.4  4
1.12.2.6  67
1.12.2.12  5

要关联的文件

> cat temp2
The Id of the customer is 1.12.2.12. He is from Grg. 
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.

<强>输出

> temp.pl
The Id of the customer is 5. He is from Grg. 
The Name of the machine is ASB
The id is 4. He is from Psg

>

以下是perl脚本。

#!/usr/bin/perl

use strict;
use warnings;

my %hsh=();

open (MYFILE, 'temp');
open (MYFILE2, 'temp2');

while (<MYFILE>) {
my@arr = split/\s+/;
$hsh{$arr[0]} = $arr[1];
}
my $flag;
while(<MYFILE2>)
{
$flag=0;
my $line=$_;
foreach my $key (keys %hsh)
{
   if($line=~/$key/)
   {
    $flag=1; 
    $line=~s/$key/$hsh{$key}/g;
    print $line;
   }
}
  if($flag!=1)
  {
  print $line;
  $flag=0;
  }
}
close(MYFILE);
close(MYFILE2);

用另一个文件中指定的值替换字段

3 个答案: