用另一个文件中指定的值替换字段

时间:2012-09-13 05:56:13

标签: shell sed awk

我有一个文件,其中包含单词之间的映射。我必须引用该文件,并将这些单词替换为某些文件中的映射单词。例如,下面的文件具有映射的单词表,如

1.12.2.4               1
1.12.2.7               12
1.12.2.2               5
1.12.2.4               4
1.12.2.6               67
1.12.2.12              5

我将有许多具有这些关键词的文件(1.12.2。*)。我想搜索这些关键词,并将这些词替换为从该文件中获取的相应映射。如何在shell中执行此操作。假设一个文件包含以下行说

The Id of the customer is 1.12.2.12. He is from Grg. 
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.

执行脚本后,数字“1.12.2.12”和“1.12.2.4”应替换为5和4(从主文件中引用)。任何人都可以帮助我吗?

3 个答案:

答案 0 :(得分:5)

使用GNU awk的一种方式:

awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub(i, array[i]) }1' master.txt file.txt

结果:

The Id of the customer is 5. He is from Grg.
The Name of the machine is ASB
The id is 4. He is from Psg.

要将输出保存到文件:

awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub(i, array[i]) }1' master.txt file.txt > name_of_your_output_file.txt

<强>解释

FNR==NR { ... }   # FNR is the current record number, NR is the record number
                  # so FNR==NR simply means: "while we process the first file listed
                  # in this case it's "master.txt"
array[$1]=$2      # add column 1 to an array with a value of column 2
next              # go onto the next record

{                 # this could be written as: FNR!=NR
                  # so this means "while we process the second file listed..."
for (i in array)  # means "for every element/key in the array..."
gsub(i, array[i]) # perform a global substitution on each line replacing the key
                  # with it's value if found
}1                # this is shorthand for 'print'

添加字边界会使匹配更加严格:

awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub("\\<"i"\\>", array[i]) }1' master.txt file.txt

答案 1 :(得分:5)

您可以sed为您编写sed脚本:

映射:

cat << EOF > mappings
1.12.2.4               1
1.12.2.7               12
1.12.2.2               5
1.12.2.4               4
1.12.2.6               67
1.12.2.12              5
EOF

输入文件:

cat << EOF > infile
The Id of the customer is 1.12.2.12. He is from Grg. 
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.
EOF

根据映射生成脚本(GNU sed):

sed -r -e 's:([^ ]*) +(.*):s/\\b\1\\b/\2/g:' mappings

输出:

s/\b1.12.2.4\b/1/g
s/\b1.12.2.7\b/12/g
s/\b1.12.2.2\b/5/g
s/\b1.12.2.4\b/4/g
s/\b1.12.2.6\b/67/g
s/\b1.12.2.12\b/5/g

使用其他sed(GNU sed)进行评估:

sed -r -e 's:([^ ]*) +(.*):s/\\b\1\\b/\2/g:' mappings | sed -f - infile

输出:

The Id of the customer is 5. He is from Grg. 
The Name of the machine is ASB
The id is 1. He is from Psg.

请注意,映射被视为正则表达式,例如点(.)可以表示任何字符,可能需要在映射文件中或在生成sed脚本时转义。

答案 2 :(得分:0)

由于你没有提供任何例子,我想这就是你想要的:

输入文件

> cat temp
1.12.2.4  1
1.12.2.7  12
1.12.2.2  5
1.12.2.4  4
1.12.2.6  67
1.12.2.12  5

要关联的文件

> cat temp2
The Id of the customer is 1.12.2.12. He is from Grg. 
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.

<强>输出

> temp.pl
The Id of the customer is 5. He is from Grg. 
The Name of the machine is ASB
The id is 4. He is from Psg

>

以下是perl脚本。

#!/usr/bin/perl

use strict;
use warnings;

my %hsh=();

open (MYFILE, 'temp');
open (MYFILE2, 'temp2');

while (<MYFILE>) {
my@arr = split/\s+/;
$hsh{$arr[0]} = $arr[1];
}
my $flag;
while(<MYFILE2>)
{
$flag=0;
my $line=$_;
foreach my $key (keys %hsh)
{
   if($line=~/$key/)
   {
    $flag=1; 
    $line=~s/$key/$hsh{$key}/g;
    print $line;
   }
}
  if($flag!=1)
  {
  print $line;
  $flag=0;
  }
}
close(MYFILE);
close(MYFILE2);