Question

I have a sorted text file made up of key/value pairs in the format

"String" = int,

sorted with the UNIX sort utility. For example:

"'Nessy's Trophy Pincers" = 81859,
"1 Handed Alliance Sword" = 119204,
"1 Handed Horde Axe" = 119206,
"10 Pound Mud Snapper" = 6292,
"100 Year Soy Sauce" = 74853,
"103 Pound Mightfish" = 13917,
"113 Pound Swordfish" = 39147,
"12 Pound Lobster" = 13909,
"12 Pound Mud Snapper" = 6294,
...

However, some of the strings are duplicated, with different numbers:

"Battleplate of the Prehistoric Marauder" = 99047,
"Battleplate of the Prehistoric Marauder" = 99197,
"Battleplate of the Prehistoric Marauder" = 99411,
"Battleplate of the Prehistoric Marauder" = 99603,
"Battlescar Boots" = 28747,
...

I would like to append a number to the duplicates so that the segment above looks like this:

"Battleplate of the Prehistoric Marauder" = 99047,
"Battleplate of the Prehistoric Marauder 1" = 99197,
"Battleplate of the Prehistoric Marauder 2" = 99411,
"Battleplate of the Prehistoric Marauder 3" = 99603,
"Battlescar Boots" = 28747,
...

Using sed, awk, or any other command-line utility, what do I need to type to do this for me?

Answer 1

Here is how it can be done in bash. The script reads from stdin and writes to stdout.

#!/bin/bash

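declare -A known  # associative array: string -> duplicates seen so far

while IFS= read -r line
do

   # Caution: eval re-parses the line as shell code, so only use this on trusted input.
   eval set -- $line

   string="$1"   # the key, with the surrounding quotes removed by eval
   number="$3"   # the value, including its trailing comma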

   i="${known["$string"]}"

   if test -z "$i"
   then
      known["$string"]=0
   else
      let ++i
      known["$string"]=$i
      string="$string $i"
   fi

   echo '"'"$string"'"' = $number

done

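The version above does not require the input to be sorted. If your input is very large, you may prefer the following version, which takes advantage of the fact that the input is sorted and therefore needs no associative array: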

#!/bin/bash

saved=

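while IFS= read -r line
do

   # Caution: eval re-parses the line as shell code, so only use this on trusted input.
   eval set -- $line

   string="$1"   # the key, with the surrounding quotes removed by eval
   number="$3"   # the value, including its trailing comma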

   if [ "$string" != "$saved" ]
   then
      i=0
   else
      let ++i
      string="$string $i"
   fi

   saved="$1"

   echo '"'"$string"'"' = $number

done
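
Both scripts above pass each input line through eval, which would run any shell syntax that happens to be embedded in a string. As a rough sketch of an alternative for sorted input, the line can be split at its last double quote with plain parameter expansion instead. The function name number_sorted_duplicates is made up for this illustration, and the sketch assumes that every data line starts with a quoted key and that equal keys are adjacent.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): numbers adjacent
# duplicate keys on stdin without using eval. Assumes sorted input.
number_sorted_duplicates() {
   saved=
   i=0
   while IFS= read -r line
   do
      case $line in
      \"*\"*)
         key=${line%\"*}     # text before the last double quote, e.g. "Battlescar Boots
         rest=${line##*\"}   # text after the last double quote, e.g.  = 28747,
         if [ "$key" = "$saved" ]
         then
            i=$((i + 1))
            printf '%s %s"%s\n' "$key" "$i" "$rest"
         else
            i=0
            saved=$key
            printf '%s"%s\n' "$key" "$rest"
         fi
         ;;
      *)
         printf '%s\n' "$line"   # pass any other line through unchanged
         ;;
      esac
   done
}
```

It could be used as number_sorted_duplicates < file. Because the counter resets whenever the key changes, unsorted input would restart the numbering for a key that reappears later.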

Answer 2

Using perl from the command line:

perl -pe 's/"(.+)\K(?=")/( map $_ ? " $_" : "", $h{$1}++ )[0]/e' file

Answer 3

$ awk -F'" *= *' 'c[$1]++{sub(FS," "c[$1]"&")}1' file
"Battleplate of the Prehistoric Marauder" = 99047,
"Battleplate of the Prehistoric Marauder 2" = 99197,
"Battleplate of the Prehistoric Marauder 3" = 99411,
"Battleplate of the Prehistoric Marauder 4" = 99603,
"Battlescar Boots" = 28747,
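
Note that the output above numbers the duplicates from 2, whereas the question asked for 1, 2, 3. One possible adjustment, shown here only as a sketch, is to subtract one from the counter before appending it. The wrapper name number_from_one is hypothetical and exists only so the command can be reused.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk command above, changed so that the
# first duplicate gets " 1" instead of " 2". Reads files or stdin.
number_from_one() {
   # c[$1]++ is false for the first occurrence of a key, so that line is
   # printed unchanged; for later ones the counter minus one is inserted
   # just before the closing quote (the text matched by FS, kept via "&").
   awk -F'" *= *' 'c[$1]++ { sub(FS, " " (c[$1] - 1) "&") } 1' "$@"
}
```

The parentheses around c[$1] - 1 matter, because awk would otherwise apply the subtraction to a neighbouring operand rather than to the counter alone.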

Answer 4

Here is a Perl solution:

use strict;
use warnings;

use Text::Balanced qw(extract_delimited);

my $fn = 'File';
my %h;
open (my $fh, "<", $fn) or die "Could not open file '$fn': $!\n";
while (<$fh>) {
    my ($title, $remainder) = extract_delimited($_, '"', '[^"]*');
    if ($h{$title}++) {
        $title = modify_title($title, $h{$title});
    }
    print "${title}$remainder";
}
close($fh);

sub modify_title {
    my ($title, $n) = @_;

    $n--;
    $title =~ s/"$/ $n"/;
    return $title;
}

Answer 5

Here is an awk version:

awk -F\" '{a[$2]++} {if (a[$2]-1) $0=FS$2" "a[$2]FS$3}1' file
"Battleplate of the Prehistoric Marauder" = 99047,
"Battleplate of the Prehistoric Marauder 2" = 99197,
"Battleplate of the Prehistoric Marauder 3" = 99411,
"Battleplate of the Prehistoric Marauder 4" = 99603,
"Battlescar Boots" = 28747,

The duplicates do not need to be consecutive.
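
To illustrate that claim, here is a small check with made-up, unsorted input. The wrapper name number_any_order and the sample keys are assumptions for this example; the awk program itself is the one shown above.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk command above. Counts are kept per
# key in the array a, so repeats are numbered even when other keys
# appear in between.
number_any_order() {
   awk -F'"' '{a[$2]++} {if (a[$2]-1) $0=FS$2" "a[$2]FS$3}1' "$@"
}

# Made-up sample: the second "B" is separated from the first by "A".
printf '%s\n' '"B" = 1,' '"A" = 2,' '"B" = 3,' | number_any_order
```

With this input, only the third line should change: its key becomes "B 2", while the first two lines pass through as they are.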

Another version:

awk -F= '{a[$1]++} {if (a[$1]-1) sub(/" *=/," "a[$1]"&")}1' file

Append a number to part of similar lines

5 answers: