Find keys missing from a file, given a list of keys in another file

Date: 2013-07-02 07:38:44

Tags: bash shell

I have a keys.txt file that lists every key, one per line, e.g.

VIEW_ACCOUNT_NAME_LABEL  
VIEW_ACCOUNT_NAME_DESCR  
VIEW_ACCOUNT_STREET_LABEL  
VIEW_ACCOUNT_CITY_SUBURB_LABEL  
VIEW_ACCOUNT_ZIP_POSTCODE_LABEL  
VIEW_ACCOUNT_COUNTRY_LABEL

There are also various matching language files that supply values for the keys, e.g. en-GB.view.acccount.ini, with one entry per line like this:

VIEW_ACCOUNT_NAME_LABEL="Name:"
VIEW_ACCOUNT_NAME_DESCR="Name of the account holder."
VIEW_ACCOUNT_STREET_LABEL="Street:"
VIEW_ACCOUNT_CITY_SUBURB_LABEL="City/Suburb:"
VIEW_ACCOUNT_ZIP="Zip Code"
VIEW_ACCOUNT_COUNTRY_LABEL="Country"

N.B. There are many key files and language files, and the real files have many more entries, typically more than 1000 per language.

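I need to be able to find

  1. which keys are missing from the language file (e.g. VIEW_ACCOUNT_ZIP_POSTCODE_LABEL)
  2. which keys are in the language file but not in the keys file (usually obsolete keys, e.g. VIEW_ACCOUNT_ZIP)

For the first requirement I tried grep with its -v (invert match) option, but the result isn't what I expected: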

    cppl ~ grep -v --file=keys.txt en-GB.view.acccount.ini
    VIEW_ACCOUNT_NAME_LABEL="Name:"
    VIEW_ACCOUNT_NAME_DESCR="Name of the account holder."
    VIEW_ACCOUNT_STREET_LABEL="Street:"
    VIEW_ACCOUNT_CITY_SUBURB_LABEL="City/Suburb:"
    VIEW_ACCOUNT_ZIP="Zip Code"
    cppl ~ 
    

3 answers:

Answer 0 (score: 4)

Use comm.

To find which keys are missing from the language file:

$ comm -23 <(sort keys.txt) <(cut -d= -f1 en-GB.view.acccount.ini | sort) 
VIEW_ACCOUNT_ZIP_POSTCODE_LABEL

To find keys that are in the language file but not in the keys file:

$ comm -13 <(sort keys.txt) <(cut -d= -f1 en-GB.view.acccount.ini | sort)
VIEW_ACCOUNT_ZIP
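
Since there are many language files, the two comm calls above could be wrapped in a small helper. The following is only a sketch under some assumptions: compare_keys is a hypothetical function name, the trailing-whitespace stripping is a guess at what real key files may need (for example CRLF line endings), and temporary files are used instead of process substitution so the script also runs under plain sh.

```sh
#!/bin/sh
# Sketch only: compare one keys file against one language file with comm.
# compare_keys is a hypothetical helper name, not taken from the answer.
compare_keys() {
    keys_file=$1
    lang_file=$2
    work=$(mktemp -d) || return 1

    # Assumption: strip trailing whitespace (including a CR from CRLF files)
    # and drop empty lines before sorting.
    # LC_ALL=C gives sort and comm the same byte-wise ordering.
    sed -e 's/[[:space:]]*$//' -e '/^$/d' "$keys_file" |
        LC_ALL=C sort -u > "$work/keys"
    cut -d= -f1 "$lang_file" | sed -e 's/[[:space:]]*$//' -e '/^$/d' |
        LC_ALL=C sort -u > "$work/lang"

    # Lines unique to the keys file: keys the language file lacks.
    LC_ALL=C comm -23 "$work/keys" "$work/lang" | sed 's/^/missing: /'
    # Lines unique to the language file: keys that keys.txt does not list.
    LC_ALL=C comm -13 "$work/keys" "$work/lang" | sed 's/^/obsolete: /'

    rm -rf "$work"
}

# Demo with the sample data from the question (keys given trailing spaces on purpose).
demo=$(mktemp -d)
printf '%s\n' 'VIEW_ACCOUNT_NAME_LABEL  ' 'VIEW_ACCOUNT_NAME_DESCR  ' \
    'VIEW_ACCOUNT_STREET_LABEL  ' 'VIEW_ACCOUNT_CITY_SUBURB_LABEL  ' \
    'VIEW_ACCOUNT_ZIP_POSTCODE_LABEL  ' 'VIEW_ACCOUNT_COUNTRY_LABEL' > "$demo/keys.txt"
cat > "$demo/en-GB.view.acccount.ini" <<'EOF'
VIEW_ACCOUNT_NAME_LABEL="Name:"
VIEW_ACCOUNT_NAME_DESCR="Name of the account holder."
VIEW_ACCOUNT_STREET_LABEL="Street:"
VIEW_ACCOUNT_CITY_SUBURB_LABEL="City/Suburb:"
VIEW_ACCOUNT_ZIP="Zip Code"
VIEW_ACCOUNT_COUNTRY_LABEL="Country"
EOF

report=$(compare_keys "$demo/keys.txt" "$demo/en-GB.view.acccount.ini")
printf '%s\n' "$report"
rm -rf "$demo"
```

To cover every language file, the helper could be called in a loop such as `for f in *.ini; do compare_keys keys.txt "$f"; done`, assuming all files share one keys file.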

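Answer 1 (score: 0)

You can do this with the standard Unix utilities join and uniq. Here is one way.

I assume your keys file is named file1 in the following example.

Generate a file that contains only the keys, not the values.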

sed 's/=.*//' en-GB.view.acccount.ini > file2

You now have file1 and file2, each containing only keys. For this example:

$ cat file1
A
B
C
D

$ cat file2
C
D
E

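Now you can use a combination of join, sort and uniq to get the desired output. Note that join expects both input files to be sorted, as they are in this example.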

# Keys which are common to both files.
$ join file1 file2 | cat - file1 | sort | uniq -d
C
D

# Keys in file1 but not in file2
$ join file1 file2 | cat - file1 | sort | uniq -u
A
B

# Keys in file2 but not in file1
$ join file1 file2 | cat - file2 | sort | uniq -u
E
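
As a possible shortcut, join also has a -v option that prints only the unpairable lines from one of the two files, which avoids the cat/sort/uniq round trip. This is a sketch using the same file1 and file2 sample data; the temporary directory and variable names are assumptions made for illustration.

```sh
#!/bin/sh
# Sketch: the same three questions answered with join alone.
work=$(mktemp -d) || exit 1
printf '%s\n' A B C D > "$work/file1"
printf '%s\n' C D E > "$work/file2"

# join requires both inputs to be sorted on the join field; these already are.
common=$(join "$work/file1" "$work/file2")              # keys in both files
only_in_file1=$(join -v 1 "$work/file1" "$work/file2")  # keys in file1 only
only_in_file2=$(join -v 2 "$work/file1" "$work/file2")  # keys in file2 only

printf 'Common:\n%s\n' "$common"
printf 'Only in file1:\n%s\n' "$only_in_file1"
printf 'Only in file2:\n%s\n' "$only_in_file2"
rm -rf "$work"
```

With real key files, both inputs would need to go through sort first, using the same locale for sort and join.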

Answer 2 (score: 0)

Can you use perl for this? If so, perl makes this very easy. Here is a quick and dirty script I whipped up. Modify it to your liking.

#!/usr/bin/perl -w

# usage:  validate keys.txt file1.ini [file2.ini [file3.ini [...]]]

open my $keys_file, "<", $ARGV[0] or die "cannot open $ARGV[0] for reading: $!";

my %keys = ( map { chomp; s/\s//g; $_ => 0 } <$keys_file> );

close $keys_file;

sub validate_file
{
    my $filename = shift @_;
    my (@missing, @unexpected, @repeated);
    my %seen = %keys;

    open my $f, "<", $filename or die "cannot open $filename for reading: $!";

    foreach my $line (<$f>)
    {
        chomp $line;

        if ($line =~ /\s*([^=]+)="[^"]*"/)
        {
            if (!defined $seen{$1})
            {
                push @unexpected, $1;
                $seen{$1} = 0;
            }
            $seen{$1}++;
        }
    }

    close $f;

    @missing  = grep { $seen{$_} == 0 } sort keys %keys;
    @repeated = grep { $seen{$_} >  1 } sort keys %keys;

    return \@missing, \@unexpected, \@repeated;
}


shift @ARGV;

foreach my $file (@ARGV)
{
    my ($missing, $unexpected, $repeated) = validate_file($file);

    print "\nFile $file:\n";
    print "Missing keys:\n", join("\n", @$missing), "\n";
    print "Unexpected keys:\n", join("\n", @$unexpected), "\n";
    print "Repeated keys:\n", join("\n", @$repeated), "\n";
}
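
For comparison, the repeated-keys check can be approximated in shell with uniq -d. This is a rough sketch, not an equivalent of the script: it reports every key that occurs more than once in the language file, whereas the Perl code above only reports repeats of keys listed in the keys file. The sample.ini content is made up for the demo.

```sh
#!/bin/sh
# Sketch: list keys that appear more than once in a language file.
work=$(mktemp -d) || exit 1
cat > "$work/sample.ini" <<'EOF'
VIEW_ACCOUNT_NAME_LABEL="Name:"
VIEW_ACCOUNT_ZIP="Zip Code"
VIEW_ACCOUNT_NAME_LABEL="Full name:"
EOF

# Keep the part before the first '=', sort it, and print only duplicated lines.
repeated=$(cut -d= -f1 "$work/sample.ini" | LC_ALL=C sort | uniq -d)
printf 'Repeated keys:\n%s\n' "$repeated"
rm -rf "$work"
```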