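I have already implemented part of this with regex substitutions, but it only covers a few characters and is definitely not efficient :)
Anyway, if there is no module or function that handles this, is there a way to make it more efficient with regular expressions?
I thought of tr/[\+,\-,...]/[PLUS,MINUS,...]/cds;
but it seems tr only replaces a char with a char, not a char with a sequence of chars :(
Any ideas?
Achim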
Answer 0 (score: 4)
To answer the tr question:
my %subs = ( '+' => 'PLUS' );
my $pat = join '|', map quotemeta, keys %subs;
s/($pat)/$subs{$1}/g;
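As a minimal sketch of how the hash-driven substitution above could be wrapped up, assuming a slightly larger mapping: the entries 'MINUS' and 'DOT' and the helper name encode_symbols are illustrative and not from the original answer. Sorting the keys longest-first is a precaution for mappings that contain multi-character keys.

```perl
use strict;
use warnings;

# Hypothetical mapping; extend it with whatever symbols you need.
my %subs = ( '+' => 'PLUS', '-' => 'MINUS', '.' => 'DOT' );

# One alternation for all keys; quotemeta escapes regex metacharacters like '+'.
my $pat = join '|', map quotemeta,
          sort { length($b) <=> length($a) } keys %subs;

# Replace every mapped symbol with its spelled-out name (hypothetical helper).
sub encode_symbols {
    my ($text) = @_;
    $text =~ s/($pat)/$subs{$1}/g;
    return $text;
}

print encode_symbols('1+2-3.5'), "\n";   # prints 1PLUS2MINUS3DOT5
```

Note that this direction alone is not reversible if the input can already contain words such as PLUS, and digits are left untouched here.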
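Base 26 can be done, but it is somewhat harder and less efficient to implement because 26 is not a power of 2. It is definitely what you want, though. I'll look into coding it.
In the meantime, here is a base 16 solution: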
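sub bytes_to_base16 {
    # Hex-encode the argument, then map the hex digits onto the letters A-P.
    my $e = unpack('H*', $_[0]);
    $e =~ tr/0123456789ABCDEFabcdef/ABCDEFGHIJKLMNOPKLMNOP/;
    return $e;
}
sub base16_to_bytes {
    # Map the letters A-P back to hex digits, then decode the hex string.
    my $e = $_[0];
    $e =~ tr/ABCDEFGHIJKLMNOP/0123456789ABCDEF/;
    return pack('H*', $e);
}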
Let's see how base 26 compares with base 16 in terms of output size:
$ perl -MMath::BigInt -MMath::BigFloat -E'
my $n = Math::BigInt->new(1);
my $bs = 0;
for (1..10) {
$n <<= 8;
++$bs;
my $bd16 = 2*$bs;
my $bd26 = Math::BigFloat->new($n)->blog(26, 5)->bceil->numify;
say sprintf "%2d bytes takes %2d base16 digits or %2d base26 digits.".
" base26 is %3.0f%% of the size of base16.",
$bs, $bd16, $bd26, $bd26/$bd16*100;
}
'
1 bytes takes 2 base16 digits or 2 base26 digits. base26 is 100% of the size of base16.
2 bytes takes 4 base16 digits or 4 base26 digits. base26 is 100% of the size of base16.
3 bytes takes 6 base16 digits or 6 base26 digits. base26 is 100% of the size of base16.
4 bytes takes 8 base16 digits or 7 base26 digits. base26 is 88% of the size of base16.
5 bytes takes 10 base16 digits or 9 base26 digits. base26 is 90% of the size of base16.
6 bytes takes 12 base16 digits or 11 base26 digits. base26 is 92% of the size of base16.
7 bytes takes 14 base16 digits or 12 base26 digits. base26 is 86% of the size of base16.
8 bytes takes 16 base16 digits or 14 base26 digits. base26 is 88% of the size of base16.
9 bytes takes 18 base16 digits or 16 base26 digits. base26 is 89% of the size of base16.
10 bytes takes 20 base16 digits or 18 base26 digits. base26 is 90% of the size of base16.
An efficient implementation produces slightly less compact output.
$ perl -MMath::BigInt -MMath::BigFloat -E'
my $bs = 0;
for (1..10) {
++$bs;
my $bd16 = 2*$bs;
my $bd26 = int($bs/4)*7 + ($bs%4)*2;
say sprintf "%2d bytes takes %2d base16 digits or %2d base26 digits.".
" base26 is %3.0f%% of the size of base16.",
$bs, $bd16, $bd26, $bd26/$bd16*100;
}
'
1 bytes takes 2 base16 digits or 2 base26 digits. base26 is 100% of the size of base16.
2 bytes takes 4 base16 digits or 4 base26 digits. base26 is 100% of the size of base16.
3 bytes takes 6 base16 digits or 6 base26 digits. base26 is 100% of the size of base16.
4 bytes takes 8 base16 digits or 7 base26 digits. base26 is 88% of the size of base16.
5 bytes takes 10 base16 digits or 9 base26 digits. base26 is 90% of the size of base16.
6 bytes takes 12 base16 digits or 11 base26 digits. base26 is 92% of the size of base16.
7 bytes takes 14 base16 digits or 13 base26 digits. base26 is 93% of the size of base16.
8 bytes takes 16 base16 digits or 14 base26 digits. base26 is 88% of the size of base16.
9 bytes takes 18 base16 digits or 16 base26 digits. base26 is 89% of the size of base16.
10 bytes takes 20 base16 digits or 18 base26 digits. base26 is 90% of the size of base16.
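Note that the efficient implementation uses one extra digit for inputs that are 7 bytes long.
So is using base26 instead of base16 worth the effort? Probably not, unless every byte is truly precious.
Finally, here is a base 26 implementation.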
my @syms = ('A'..'Z');
my %syms = map { $syms[$_] => $_ } 0..$#syms;
sub bytes_to_base26 {
    my $e = '';
    # Each full 4-byte block becomes 7 letters.
    my $full_blocks = int(length($_[0]) / 4);
    for (0..$full_blocks-1) {
        my $block = unpack('N', substr($_[0], $_*4, 4));
        $e .= join '', @syms[
            $block / 26**6 % 26,
            $block / 26**5 % 26,
            $block / 26**4 % 26,
            $block / 26**3 % 26,
            $block / 26**2 % 26,
            $block / 26**1 % 26,
            $block / 26**0 % 26,
        ];
    }
    # Each leftover byte becomes 2 letters.
    my $extra = substr($_[0], $full_blocks*4);
    for my $block (unpack('C*', $extra)) {
        $e .= join '', @syms[
            $block / 26**1 % 26,
            $block / 26**0 % 26,
        ];
    }
    return $e;
}
sub base26_to_bytes {
    my $d = '';
    # Each full 7-letter block decodes to 4 bytes.
    my $full_blocks = int(length($_[0]) / 7);
    for (0..$full_blocks-1) {
        my $block = 0;
        $block = $block*26 + $syms{$_} for unpack '(a)*', substr($_[0], $_*7, 7);
        $d .= pack('N', $block);
    }
    # Each leftover pair of letters decodes to 1 byte.
    my $extra = substr($_[0], $full_blocks*7);
    my @extra = unpack('(a)*', $extra);
    while (@extra) {
        my $block = 0;
        $block = $block*26 + $syms{ shift(@extra) };
        $block = $block*26 + $syms{ shift(@extra) };
        $d .= pack('C', $block);
    }
    return $d;
}
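Answer 1 (score: 3)
The simplest approach is to use base16 encoding, as others have suggested, and remap the digits to letters. But then you are using only 16 of the 26 available characters, which is wasteful.
The most efficient encoding is probably base26, but that would be considerably harder: in effect, you treat the entire input as one large binary number and convert it from base 2 to base 26.
log2(26) is just over 4.7, so at best (without compression) you can encode 4.7 bits per letter. A less wasteful encoding might pack 4 bytes (32 bits) into 7 letters. 7 letters give you about 32.9 bits of information, so you are not losing much. It can all be done with 32-bit arithmetic. You then have to decide what to do if the input is not a multiple of 4 bytes.
(The actual implementation is left as an exercise, at least for now.)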
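The arithmetic behind the 4-bytes-in-7-letters figure can be checked with a few lines of Perl. This is only a sketch; the variable names are illustrative.

```perl
use strict;
use warnings;

# Bits carried by one letter of a 26-symbol alphabet: log2(26).
my $bits_per_letter = log(26) / log(2);

# Information capacity of a 7-letter block.
my $bits_per_block = 7 * $bits_per_letter;

printf "bits per letter: %.2f\n", $bits_per_letter;          # 4.70
printf "bits per 7-letter block: %.1f\n", $bits_per_block;   # 32.9

# 7 letters can hold any 32-bit value; 6 letters cannot.
print "26**7 >= 2**32\n" if 26**7 >= 2**32;
print "26**6 <  2**32\n" if 26**6 <  2**32;
```

Since 26**7 is 8031810176 and 2**32 is 4294967296, a 7-letter block has room to spare, which is where the small loss in efficiency comes from.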
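Answer 2 (score: 0)
You can use Base32 encoding, which uses the 26 uppercase letters and 6 digits:
Just change the $code array to whatever character set you want to use.
Edit: Oops, I just noticed you are using Perl rather than PHP, sorry. You should be able to find a Base32 module on CPAN that does the same thing.
Edit 2: FWIW, I see Convert::Base32, Encode::Base32 and MIME::Base32 on CPAN.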
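Answer 3 (score: 0)
For a bit of fun, here is my Enigma simulator. There is no simple way to do what you are trying to do, because the wheels have no escape character, and any sequence you invent to represent an escape sequence will significantly reduce the strength of the cipher.
However, 8-bit Latin input can be mapped from 0-255 to [A-P][A-P] using 65+($Char&15) . 65+($Char>>4) (taken as character codes) and reversed on output. Q-Z would be wasted, though, and the input would have a lot of holes, although gzipping it first could address that.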
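The nibble mapping described above could look roughly like this in Perl. The function names nibble_encode and nibble_decode are made up for this sketch, and the decoder assumes an even number of letters in the range A-P.

```perl
use strict;
use warnings;

# Encode each byte as two letters A..P: low nibble first, then high nibble.
sub nibble_encode {
    my ($bytes) = @_;
    return join '',
        map { chr(65 + ($_ & 15)) . chr(65 + ($_ >> 4)) } unpack 'C*', $bytes;
}

# Reverse the mapping: every pair of letters yields one byte.
sub nibble_decode {
    my ($letters) = @_;
    my @n = map { ord($_) - 65 } split //, $letters;
    my $out = '';
    while (@n) {
        my ($lo, $hi) = splice @n, 0, 2;
        $out .= chr($lo | ($hi << 4));
    }
    return $out;
}

my $enc = nibble_encode('Hi!');
print "$enc\n";                      # prints IEJGBC
print nibble_decode($enc), "\n";     # prints Hi!
```

The output is always twice as long as the input, which is the price of using only 16 of the 26 letters.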
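The Germans generally used X for spaces and spelled out punctuation only when really necessary, trying to avoid spelling the same thing the same way twice. I know it's annoying, but that's how it was. If we increase the number of letters on the wheels, it is no longer an Enigma machine!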
#!/usr/bin/perl
#Tinigma 2010 Usage:tinigma.pl 123 rng ini "GHWVYYDVPQGEWQWVT"
($n,$o,$p)=map(ord()-65,split//,uc$ARGV[1]);($z,$y,$x)=map(ord
()-65,split//,uc$ARGV[2]);($l,$m,$r)=map$_-1,split//,$ARGV[0];
$t=uc$ARGV[3];$t=~s/[^A-Z]//g;$b=26;$j=0;@N=qw(7 25 11 6 1);@R
=('EKMFLGDQVZNTOWYHXUSPAIBRCJ'x3,'AJDKSIRUXBLHWTMCQGZNPYFVOE'x
3,'BDFHJLCPRTXVZNYEIWGAKMUSQO'x3,'ESOVPZJAYQUIRHXLNFTGKDCMWB'x
3,'VZBRGITYUPSDNHLXAWMJQOFECK'x3,'YRUHQSLDPXNGOKMIEBFZCWVJAT'x
3);@t=split//,$t;for$v(@R){$i=0;for(split//,$v){$c=ord($_)-65;
$F[$j][$i]=$c;$R[$j][$c+$b*(int($i/$b))]=$i;$i++}$j++}@S=@{$F[
5]};$f=$y==$F[$m][$N[$m]]?1:0;$i=0;for(@t){if($f){$y++;$y%=$b;
$z++;$z%=$b;$f=0}if($x==$F[$r][$N[$r]]){$y++;$y%=$b;if($y==$F[
$m][$N[$m]]){$f=1}}$x++;$x%=$b;$e.=chr(($R[$r][$R[$m][$R[$l][$
S[$F[$l][$F[$m][$F[$r][ord($_)-39+$x-$n]-$x+$n+$y-$o]-$y+$o+$z
-$p]-$z+$p]+$z-$p]-$z+$p+$y-$o]-$y+$o+$x-$n]-$x+$n)%$b+65)}
print"$e\n"
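Answer 4 (score: 0)
Keith Thompson and jrockway have already briefly mentioned this solution.
Here we examine it and implement it.
The problem is simple once you know that digits are usually written 0, 1, 2, …, but that A, B, C, … or even other symbols can be used just as well. So one way to encode a (text) file using only A-Z is to read the file as a number F (its bytes serve as the base-256 digits, see $plaintextDigits in the code) and print F in base 26 using the digits A-Z. Here is an implementation:
#! /usr/bin/env perl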
use strict;
use warnings;
use Math::BigInt try => 'GMP';
our $plaintextDigits = join('', map(chr, 0..255));
our $codeDigits = join('', 'A'..'Z');
sub baseConversion {
my ($str, $inDigits, $outDigits) = @_;
return Math::BigInt
->from_base($str, length $inDigits, $inDigits)
->to_base(length $outDigits, $outDigits);
}
sub encode {
return baseConversion shift, $main::plaintextDigits, $main::codeDigits;
}
sub decode {
return baseConversion shift, $main::codeDigits, $main::plaintextDigits;
}
my $input = 'String to be encoded. Or use `shift` to read an CLI argument.';
print "input:\n$input\n";
my $encoded = encode $input;
print "\nencoded:\n$encoded\n";
my $decoded = decode $encoded;
print "\ndecoded:\n$decoded\n";
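This prints the encoded string ESQEKWWQLSBQHVKBCAQYKLXMVQRUFOOMPJGFTADLYTDQLFGTRTLWJBYTJICKUOFUVPHSHZHCRZKFMVSHRHCACZFUWTXVXUDRVKMIAIKK, which is then decoded correctly.
Benefits:
Drawbacks:
Implementation notes:
Use Perl's utf8::encode / decode on the text. The digit sets are the our $...Digits variables. from_base and to_base could probably be implemented more efficiently, because the default implementation may not know that the digits are consecutive.
Answer 5 (score: -1)
Hex (base 16) encoding produces output that uses 16 possible characters. Since the alphabet has 26 letters, you can swap the digits for letters. You will then be using only 16 letters of the alphabet, but you get a string that contains nothing but letters and is easy to encode, decode and turn back into the original string. It's an odd question (it looks like homework), but this will do the trick.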
Answer 6 (score: -2)
You have specified a very lossy translation... which may not be satisfactory.
But:
#!/usr/bin/perl
use strict;
use warnings;
use 5.012;

# 7095527/perl-how-to-encode-and-decode-characters-in-uppercase-alpha-letters-only

my $string = "abcDEFghijklMNO1234567890pqr+_)!@#}{?";
my @arr = split //, uc($string);
my (@intermediate, $char);

for my $char(@arr) {
    if ($char =~ /[A-Z]/) {
        say "ENIGMA char found (possibly uc'ed): $char";
    } else {
        say "WTF? \$char at line17 is !~ /[A-Z]/: $char";
        next;
    }
}

=head OUTPUT:

> SO7095527.pl
ENIGMA char found (possibly uc'ed): A
ENIGMA char found (possibly uc'ed): B
ENIGMA char found (possibly uc'ed): C
ENIGMA char found (possibly uc'ed): D
ENIGMA char found (possibly uc'ed): E
ENIGMA char found (possibly uc'ed): F
ENIGMA char found (possibly uc'ed): G
ENIGMA char found (possibly uc'ed): H
ENIGMA char found (possibly uc'ed): I
ENIGMA char found (possibly uc'ed): J
ENIGMA char found (possibly uc'ed): K
ENIGMA char found (possibly uc'ed): L
ENIGMA char found (possibly uc'ed): M
ENIGMA char found (possibly uc'ed): N
ENIGMA char found (possibly uc'ed): O
WTF? $char at line17 is !~ /[A-Z]/: 1
WTF? $char at line17 is !~ /[A-Z]/: 2
WTF? $char at line17 is !~ /[A-Z]/: 3
WTF? $char at line17 is !~ /[A-Z]/: 4
WTF? $char at line17 is !~ /[A-Z]/: 5
WTF? $char at line17 is !~ /[A-Z]/: 6
WTF? $char at line17 is !~ /[A-Z]/: 7
WTF? $char at line17 is !~ /[A-Z]/: 8
WTF? $char at line17 is !~ /[A-Z]/: 9
WTF? $char at line17 is !~ /[A-Z]/: 0
ENIGMA char found (possibly uc'ed): P
ENIGMA char found (possibly uc'ed): Q
ENIGMA char found (possibly uc'ed): R
WTF? $char at line17 is !~ /[A-Z]/: +
WTF? $char at line17 is !~ /[A-Z]/: _
WTF? $char at line17 is !~ /[A-Z]/: )
...

=cut
Note that a submarine captain would be left helpless if a message gave the refueling position as "73N 39W"...