How to replace letters with specific numbers

Time: 2019-02-14 20:26:05

Tags: perl awk sed

I have a large amount of data like the following

NDDDDTSVCLGTRQCSWFAGCTNRTWNSSA 0
VCLGTRQCSWFAGCTNRTWNSSAVPLIGLP 0
LTWSGNDTCLYSCQNQTKGLLYQLFRNLFC 0
CQNQTKGLLYQLFRNLFCSYGLTEAHGKWR 0
ITNDKGHDGHRTPTWWLTGSNLTLSVNNSG 0
GHRTPTWWLTGSNLTLSVNNSGLFFLCGNG 0
FLCGNGVYKGFPPKWSGRCGLGYLVPSLTR 0
KGFPPKWSGRCGLGYLVPSLTRYLTLNASQ 0
QSVCMECQGHGERISPKDRCKSCNGRKIVR 1 

I want to replace the letters with numbers using the following key

A   1
R   2
N   3
D   4
B   5
C   6
E   7
Q   8
Z   9
G   10
H   11
I   12
L   13
K   14
M   15
F   16
P   17
S   18
T   19
W   20
Y   21
V   22

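First I want to remove the number that follows the letters on each line, and then replace the letters. So, for example, the first line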

NDDDDTSVCLGTRQCSWFAGCTNRTWNSSA 

would become

3 4 4 4 4 19 18 22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1

and the same for the rest of the lines, giving as many output rows as I have input rows.

4 answers:

Answer 0 (score: 5)

perl -e'
    use autodie;
    my %charmap = (
        A =>  1, R =>  2, N =>  3, D =>  4, B =>  5, C =>  6, E =>  7, Q =>  8,
        Z =>  9, G => 10, H => 11, I => 12, L => 13, K => 14, M => 15, F => 16,
        P => 17, S => 18, T => 19, W => 20, Y => 21, V => 22,
    );
    while (<>) {
        s{(.)}{ ($charmap{$1} // $1) . " " }ge;
        print;
    }
' file

Or simply

perl -pe'
    BEGIN { @charmap{ split //, "ARNDBCEQZGHILKMFPSTWYV" } = 1..22 }
    s{(.)}{ ($charmap{$1} // $1) . " " }ge;
' file
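
Note that the substitution in both one-liners runs over every character on the line, and characters without an entry in %charmap (the space and the trailing 0 or 1) are passed through unchanged, so the trailing number still appears in the output. One possible way to drop it beforehand is a small sed filter. This is only a sketch: strip_label is a hypothetical helper name that does not come from the answer, and it assumes the number is the last whitespace-separated item on the line.

```shell
# Hypothetical helper: remove a trailing whitespace-separated number
# (and any trailing blanks) from each input line.
strip_label() {
    sed 's/[[:space:]][[:space:]]*[0-9][0-9]*[[:space:]]*$//'
}

# Two lines taken from the question's data; the second has a trailing space.
stripped=$(printf '%s\n' \
    'NDDDDTSVCLGTRQCSWFAGCTNRTWNSSA 0' \
    'QSVCMECQGHGERISPKDRCKSCNGRKIVR 1 ' | strip_label)

printf '%s\n' "$stripped"
```

The filter could then sit in front of either one-liner, for example strip_label < file | perl -pe '...'.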

Answer 1 (score: 3)

With any awk in any shell on every UNIX box:

$ cat tst.awk
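BEGIN {
    # Key order from the question: the position of each letter is its number.
    chars = "ARNDBCEQZGHILKMFPSTWYV"
    for (i=1; i<=length(chars); i++) {
        char = substr(chars,i,1)
        map[char] = i
    }
}
{
    out = ""
    chars = $1    # only the first field, so the trailing number is dropped
    for (i=1; i<=length(chars); i++) {
        char = substr(chars,i,1)
        # unmapped characters are kept as-is
        out = (out == "" ? "" : out " ") (char in map ? map[char] : char)
    }
    print out
}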

$ awk -f tst.awk file
3 4 4 4 4 19 18 22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1
22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1 22 17 13 12 10 13 17
13 19 20 18 10 3 4 19 6 13 21 18 6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6
6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6 18 21 10 13 19 7 1 11 10 14 20 2
12 19 3 4 14 10 11 4 10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10
10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10 13 16 16 13 6 10 3 10
16 13 6 10 3 10 22 21 14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2
14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2 21 13 19 13 3 1 18 8
8 18 22 6 15 7 6 8 10 11 10 7 2 12 18 17 14 4 2 6 14 18 6 3 10 2 14 12 22 2
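
As a quick sanity check of output like this, the mapping can be reversed and compared with the input. The sketch below assumes the same key order as tst.awk. decode is a hypothetical helper name, and the result is only meaningful for lines whose characters were all present in the key.

```shell
# Hypothetical helper: turn each number in 1..22 back into its letter.
decode() {
    awk 'BEGIN { chars = "ARNDBCEQZGHILKMFPSTWYV" }
    {
        out = ""
        for (i = 1; i <= NF; i++) {
            # A number within the key range maps to the letter at that
            # position; any other field is copied through unchanged.
            if ($i ~ /^[0-9]+$/ && $i >= 1 && $i <= length(chars))
                out = out substr(chars, $i, 1)
            else
                out = out $i
        }
        print out
    }'
}

# First output line from the answer above.
decoded=$(echo '3 4 4 4 4 19 18 22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1' | decode)

printf '%s\n' "$decoded"
```

If the decoded text equals the first field of the corresponding input line, the encoding step used the key consistently.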

Answer 2 (score: 2)

An alternative Perl solution:

#!/usr/bin/perl
use strict;
use warnings;

my %key = (
    A =>  1, R =>  2, N =>  3, D =>  4, B =>  5,
    C =>  6, E =>  7, Q =>  8, Z =>  9, G => 10,
    H => 11, I => 12, L => 13, K => 14, M => 15,
    F => 16, P => 17, S => 18, T => 19, W => 20,
    Y => 21, V => 22,
);

while (<STDIN>) {
    my($text) = /^(\w+)/;
    print join(' ',
               map { $key{$_} }
               split(//, $text)
          ), "\n";
}

exit 0;

Output with your given text:

$ perl dummy.pl <dummy.txt
3 4 4 4 4 19 18 22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1
22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1 22 17 13 12 10 13 17
13 19 20 18 10 3 4 19 6 13 21 18 6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6
6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6 18 21 10 13 19 7 1 11 10 14 20 2
12 19 3 4 14 10 11 4 10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10
10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10 13 16 16 13 6 10 3 10
16 13 6 10 3 10 22 21 14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2
14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2 21 13 19 13 3 1 18 8
8 18 22 6 15 7 6 8 10 11 10 7 2 12 18 17 14 4 2 6 14 18 6 3 10 2 14 12 22 2

On second thought...

As the OP wants to obfuscate plain text, a more appropriate solution IMHO would be something like this:

$ bash <dummy.txt -c "$(echo /Td6WFoAAATm1rRGBMCtAbgBIQEWAAAAAAAAACsG0SbgALcApV0AOBlKq3igoJRmX9TqJifIRDIcDLdDtNRSv+tJBsifrrsdnlllNt2qqnlz0/uBmSnlO0FTKjKH/HXplJm9LaV7kXiNp/ZWDsyVqoV8EPjIEHHkXXd6jKahyq7tcCA4NGTHp/pwmk8jith6j/dcX67QCKmL0UtZUz9BqVWefD41lbrTNazbD8IP6zMLmAVxJav51SSTHzsUqhUfqhVmLsUg8sJkgloAAAAAAOMYtQXt21WNAAHJAbgBAABTvtYRscRn+wIAAAAABFla | base64 -d | xzcat)"
3 4 4 4 4 19 18 22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1
22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1 22 17 13 12 10 13 17
13 19 20 18 10 3 4 19 6 13 21 18 6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6
6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6 18 21 10 13 19 7 1 11 10 14 20 2
12 19 3 4 14 10 11 4 10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10
10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10 13 16 16 13 6 10 3 10
16 13 6 10 3 10 22 21 14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2
14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2 21 13 19 13 3 1 18 8
8 18 22 6 15 7 6 8 10 11 10 7 2 12 18 17 14 4 2 6 14 18 6 3 10 2 14 12 22 2

Answer 3 (score: 1)

Another awk

$ awk 'NR==FNR {a[$1]=$2; next} 
               {n=length($1); 
                for(i=1;i<=n;i++) 
                   printf "%s", a[substr($1,i,1)] (i==n?ORS:OFS)}' mapfile datafile

3 4 4 4 4 19 18 22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1
22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1 22 17 13 12 10 13 17
13 19 20 18 10 3 4 19 6 13 21 18 6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6
6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6 18 21 10 13 19 7 1 11 10 14 20 2
12 19 3 4 14 10 11 4 10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10
10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10 13 16 16 13 6 10 3 10
16 13 6 10 3 10 22 21 14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2
14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2 21 13 19 13 3 1 18 8
8 18 22 6 15 7 6 8 10 11 10 7 2 12 18 17 14 4 2 6 14 18 6 3 10 2 14 12 22 2

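Note, however, that missing mappings are not handled: a character that is not listed in the map file has no replacement, so an empty field is printed in its place.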
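
The answer does not show the map file itself. Judging from a[$1]=$2 it holds one letter and one number per line, so the sketch below builds it from the question's key and adds a fallback that keeps unmapped characters. The temporary directory and the sample line AXC 0 (X is not in the key) are made up for illustration.

```shell
tmp=$(mktemp -d)

# mapfile: one "letter number" pair per line, in the key order of the question.
awk 'BEGIN {
    s = "ARNDBCEQZGHILKMFPSTWYV"
    for (i = 1; i <= length(s); i++) print substr(s, i, 1), i
}' > "$tmp/mapfile"

# datafile: one line from the question plus a made-up line containing X.
printf '%s\n' 'NDDDDTSVCLGTRQCSWFAGCTNRTWNSSA 0' 'AXC 0' > "$tmp/datafile"

mapped=$(awk 'NR==FNR { a[$1] = $2; next }
{
    n = length($1)
    for (i = 1; i <= n; i++) {
        c = substr($1, i, 1)
        # Fall back to the character itself when the key has no entry for it.
        printf "%s%s", (c in a ? a[c] : c), (i == n ? ORS : OFS)
    }
}' "$tmp/mapfile" "$tmp/datafile")

printf '%s\n' "$mapped"
rm -rf "$tmp"
```

Keeping the original character makes gaps in the key visible in the output instead of leaving an empty field.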

If the goal is encryption, I would suggest a different approach:

First, let's generate a mapping (or encryption key)

$ key=$(printf "%s\n" {A..Z} | shuf | paste -sd' ' | tr -d ' ')

$ echo "$key"
CNYSGFRDKQTOXJVLEWBAHZPMUI

Now you can simply encrypt/decrypt the file contents

$ tr 'A-Z' "$key" < datafile > file.encrypted

and to reverse

$ tr "$key" 'A-Z' < file.encrypted > file.decrypted

Obviously, you need to save the key.
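
A round trip with the example key printed above can be checked as follows. The sketch writes the range as a quoted A-Z without brackets, because POSIX tr treats [ and ] inside a set as literal characters, which would shift the mapping by one position. The sample line is taken from the question's data, and the variable names are placeholders.

```shell
# Example key from the answer; any permutation of A..Z would work the same way.
key=CNYSGFRDKQTOXJVLEWBAHZPMUI

plain='QSVCMECQGHGERISPKDRCKSCNGRKIVR 1'

# Encrypt: each letter A..Z is replaced by the key letter at the same position.
encrypted=$(printf '%s\n' "$plain" | tr 'A-Z' "$key")

# Decrypt: swap the two sets to invert the substitution.
decrypted=$(printf '%s\n' "$encrypted" | tr "$key" 'A-Z')

if [ "$decrypted" = "$plain" ]; then
    echo "round trip OK"
else
    echo "round trip FAILED"
fi
```

Digits and spaces are not part of either set, so they pass through tr unchanged in both directions.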