This is a follow-up to my previous question on showing unicode string differences. It turns out that the strings appear to be identical, but one of them has the UTF8 flag turned on.
SV = PVMG(0x4cca750) at 0x4b3fc90
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
IV = 0
NV = 0
PV = 0x1eda410 "flurbe"\0 [UTF8 "flurbe"]
CUR = 6
LEN = 16
VS
SV = PV(0xf28090) at 0xf4b6a0
REFCNT = 1
FLAGS = (PADMY,POK,pPOK)
PV = 0xf37b90 "flurbe"\0
CUR = 6
LEN = 16
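This seems to cause a difference between the sha512 hashes produced when I hash the strings. As far as I can tell, Dancer is the one producing the UTF8-flagged result; my other script is a plain command-line script that does not use Dancer. How can I force it to behave the same way?
Answer 0 (score: 5)
(This is more of a comment than an answer, but it is too big for a comment.)
I just ran this program: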
#!/usr/bin/perl -w
use warnings;
use strict;
use Devel::Peek ();
use Digest::SHA ();
my $x = 'flurbe';
Devel::Peek::Dump $x;
print Digest::SHA::sha512_hex($x), "\n\n";
utf8::upgrade $x;
Devel::Peek::Dump $x;
print Digest::SHA::sha512_hex($x), "\n";
__END__
It gives this output:
SV = PV(0x10441040) at 0x10491638
REFCNT = 1
FLAGS = (PADMY,POK,pPOK)
PV = 0x10449ca0 "flurbe"\0
CUR = 6
LEN = 8
1cd2e71e55653caeb6c9bffa47a66ff1c9b526bbb732dcff28412090601e9b5e34d36be6a0267527347cd94039b383d4bc45653d786d1041debe7faa0716bdf1
SV = PV(0x10441040) at 0x10491638
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
PV = 0x10449ca0 "flurbe"\0 [UTF8 "flurbe"]
CUR = 6
LEN = 8
1cd2e71e55653caeb6c9bffa47a66ff1c9b526bbb732dcff28412090601e9b5e34d36be6a0267527347cd94039b383d4bc45653d786d1041debe7faa0716bdf1
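As you can see, Devel::Peek::Dump correctly recognizes that the string has been upgraded to UTF-8, but this does not affect the SHA-512 hash computed by Digest::SHA.
Edited to add: In a comment above, you mention that your "hashes are randomly salted". Can these salts contain bytes outside the ASCII range? If so, concatenating them with a UTF-8-upgraded string may affect their contents. I just ran this modified program: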
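The reason the digest is unchanged can be sketched as follows: for a pure-ASCII string, upgrading only sets the flag and leaves the stored bytes alone. This is a minimal illustration, and the variable names are made up for the example.

```perl
use strict;
use warnings;

my $plain    = 'flurbe';
my $upgraded = 'flurbe';
utf8::upgrade($upgraded);    # changes the internal representation only

# Both strings hold the same characters, so they compare equal.
my $same_text = ($plain eq $upgraded) ? 1 : 0;

# Only the internal UTF8 flag differs.
my $flag_plain    = utf8::is_utf8($plain)    ? 1 : 0;
my $flag_upgraded = utf8::is_utf8($upgraded) ? 1 : 0;

print "equal: $same_text, flags: $flag_plain vs $flag_upgraded\n";
```

Because every character here is below 0x80, the UTF-8 form and the byte form are byte-for-byte identical, which matches the two identical PV values in the dumps.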
#!/usr/bin/perl -w
use warnings;
use strict;
use Devel::Peek ();
use Digest::SHA ();
my $x = 'flurbe';
my $y = "\xA0"; # a single byte, hex A0
my $z = "\xC2\xA0"; # UTF-8 representation of U+00A0, as a byte-string
Devel::Peek::Dump "$x$y";
print Digest::SHA::sha512_hex("$x$y"), "\n\n";
Devel::Peek::Dump "$x$z";
print Digest::SHA::sha512_hex("$x$z"), "\n\n";
utf8::upgrade $x;
Devel::Peek::Dump "$x$y";
print Digest::SHA::sha512_hex("$x$y"), "\n";
__END__
It gives this output:
SV = PV(0x104410e8) at 0x104d68d8
REFCNT = 1
FLAGS = (PADTMP,POK,pPOK)
PV = 0x10449ca0 "flurbe\240"\0
CUR = 7
LEN = 8
1901f989ed76143697ecc6683fd03ec793bc126d51cdbee0a72241933136c144f2e602828abddc7e4843df5542a099be92313fa5874d1d2dc54ecdd1ff308c5e
SV = PV(0x104d80b8) at 0x104ec098
REFCNT = 1
FLAGS = (PADTMP,POK,pPOK)
PV = 0x10489170 "flurbe\302\240"\0
CUR = 8
LEN = 12
072f7b54c80fa8062ca1d17727a88c9ff4815f83c1166471331c6398b9140a06812eff341c98453f4c51356926dbe9694cbcbebfe4cda7e77cf68008ab838c6d
SV = PV(0x104d80a8) at 0x104f0f98
REFCNT = 1
FLAGS = (PADTMP,POK,pPOK,UTF8)
PV = 0x104896c8 "flurbe\302\240"\0 [UTF8 "flurbe\x{a0}"]
CUR = 8
LEN = 12
072f7b54c80fa8062ca1d17727a88c9ff4815f83c1166471331c6398b9140a06812eff341c98453f4c51356926dbe9694cbcbebfe4cda7e77cf68008ab838c6d
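As you can see, the SHA-512 hash of "$x$y" depends on whether $x has been UTF-8-upgraded. Using "$x$y" with a UTF-8-upgraded $x gives the same SHA-512 hash as "$x$z" with a non-UTF-8-upgraded $x. This is because SHA-512 operates on bytes rather than characters, and concatenating a UTF-8-upgraded string with a byte string causes the byte string to be UTF-8-upgraded as well.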
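The effect of the concatenation can be checked without Digest::SHA at all. The sketch below, which reuses the values of $x and $y from the program above and introduces $joined and $as_octets purely for illustration, shows that the joined string still has seven characters while its UTF-8 form has eight bytes.

```perl
use strict;
use warnings;

my $x = 'flurbe';
my $y = "\xA0";              # one byte outside the ASCII range
utf8::upgrade($x);

my $joined  = "$x$y";        # the result of the concatenation is upgraded too
my $flagged = utf8::is_utf8($joined) ? 1 : 0;
my $chars   = length $joined;

# utf8::encode turns a copy into its UTF-8 octets, which is the same
# byte sequence that the upgraded string stores internally.
my $as_octets = $joined;
utf8::encode($as_octets);
my $octet_count = length $as_octets;

print "flagged: $flagged, characters: $chars, octets: $octet_count\n";
```

The octets in $as_octets are exactly those of "$x$z" in the program above, which is consistent with the last two dumps showing the same PV contents.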
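Answer 1 (score: 1)
You have an encoding problem, namely a lack of encoding. Digest functions operate on octets. You are giving it characters, which is wrong.
Course of action: encode your characters into octets. UTF-8 is a suitable encoding.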
use Encode ();
my $octets = Encode::encode('UTF-8', $characters, Encode::FB_CROAK);
# add salt to octets
# produce digest
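Put together, the steps above might look like the following sketch. The helper name salted_digest, the salt bytes, and the choice to prepend the salt are all assumptions made for illustration; they are not taken from the question.

```perl
use strict;
use warnings;
use Encode ();
use Digest::SHA ();

# Hypothetical helper: encode the characters to UTF-8 octets first,
# then prepend the salt octets and digest the result.
sub salted_digest {
    my ($characters, $salt_octets) = @_;
    # LEAVE_SRC keeps encode() from modifying $characters in place.
    my $octets = Encode::encode('UTF-8', $characters,
        Encode::FB_CROAK() | Encode::LEAVE_SRC());
    return Digest::SHA::sha512_hex($salt_octets . $octets);
}

my $plain    = 'flurbe';
my $upgraded = 'flurbe';
utf8::upgrade($upgraded);

my $salt = "\xA0\x01";       # example salt with a non-ASCII byte

my $digest_plain    = salted_digest($plain,    $salt);
my $digest_upgraded = salted_digest($upgraded, $salt);

print $digest_plain eq $digest_upgraded
    ? "digests match\n"
    : "digests differ\n";
```

Since the salt is only ever concatenated with octets that are already encoded, the UTF8 flag on the input string no longer influences the bytes that reach the digest function. Encoding explicitly should also avoid depending on how a given Digest::SHA version treats UTF8-flagged input.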