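I'm trying to verify the integrity of a PDF file using bash commands.
Using dd, I extracted the PDF's signedContent and the detached pkcs7 object.
Then I decoded the pkcs7 with
xxd -r -p pkcs7_extracted > pkcs7_extracted.bin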
openssl asn1parse -inform DER <pkcs7_extracted.bin >pkcs7_extracted_decoded
From the decoded pkcs7 I got some useful information:
0:d=0 hl=4 l=5498 cons: SEQUENCE
4:d=1 hl=2 l= 9 prim: OBJECT :pkcs7-signedData
15:d=1 hl=4 l=5483 cons: cont [ 0 ]
19:d=2 hl=4 l=5479 cons: SEQUENCE
23:d=3 hl=2 l= 1 prim: INTEGER :01
26:d=3 hl=2 l= 15 cons: SET
28:d=4 hl=2 l= 13 cons: SEQUENCE
30:d=5 hl=2 l= 9 prim: OBJECT :sha256
41:d=5 hl=2 l= 0 prim: NULL
43:d=3 hl=2 l= 11 cons: SEQUENCE
...
5154:d=7 hl=2 l= 9 prim: OBJECT :contentType
5165:d=7 hl=2 l= 11 cons: SET
5167:d=8 hl=2 l= 9 prim: OBJECT :pkcs7-data
5178:d=6 hl=2 l= 47 cons: SEQUENCE
5180:d=7 hl=2 l= 9 prim: OBJECT :messageDigest
5191:d=7 hl=2 l= 34 cons: SET
5193:d=8 hl=2 l= 32 prim: OCTET STRING [HEX DUMP]:18B399D208A08815DDF23C93B1B63B13757A6AA24B1932569D7A69D0DB3A34C2
5227:d=5 hl=2 l= 13 cons: SEQUENCE
5229:d=6 hl=2 l= 9 prim: OBJECT :sha256WithRSAEncryption
5240:d=6 hl=2 l= 0 prim: NULL
5242:d=5 hl=4 l= 256 prim: OCTET STRING [HEX DUMP]:8F4B21914173EC57E6B0533BB5E04FB7054F23AC299C1BDBF589ED164A3EABB611727BE9117AAC3161D9C18DCA08BD113DD3AA90E5922009FA12BA59E7F6587E81CD79BDED09F862C2C76F35D950926F1A31A3DCCE999A52DCE0C7F67D081E81A44397E8AF96A1051B8E51F2E2271221B06D05C9895E1846B1DBE02B558F5B9EF97C7EB0FF9A7C71A9764D5E205900818F07E82027D79D3F9A5AA72B3A0CF131F1B890D0BCBF3C4DD8A0229FABE15F6C2CA0CE079EB925B3998A1A6190596A88D8F07C1C12B8750636E69108E30E643A653B285A400080C9C5590C112451F6D69BAFC2686D6F1107B37A5DB36B9F797C49E61D4B44E62E17DD541778DE763AC5
5502:d=0 hl=2 l= 0 prim: EOC
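In particular, I noticed that the messageDigest field equals the digest I computed over the signedContent obtained via ByteRange.
I then extracted the encrypted hash, decrypted it with my public key, and decoded the result again with the asn1parse command.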
dd if=pkcs7_extracted.bin of=extracted.sign.bin bs=1 skip=$(( 5242 + 4 )) count=256
# recover the signed DigestInfo with the public key
openssl rsautl -verify -pubin -inkey publickey.pem < extracted.sign.bin > verified.bin
# decode the result
openssl asn1parse -inform der -in verified.bin
The result is this object:
0:d=0 hl=2 l= 49 cons: SEQUENCE
2:d=1 hl=2 l= 13 cons: SEQUENCE
4:d=2 hl=2 l= 9 prim: OBJECT :sha256
15:d=2 hl=2 l= 0 prim: NULL
17:d=1 hl=2 l= 32 prim: OCTET STRING [HEX DUMP]:EBAA31519CD0CCA793FEC34AA6BDD8DFA5E4D5F63BA4711F6C8ECE5D20FEF393
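I'm fairly sure the decryption worked, because the object decodes correctly and, as I expected, contains a sha256 object. But as you can see, the digest value is different...
Am I looking in the wrong place? I don't know how to verify the integrity. Also, Acrobat of course validates the integrity of this signed document.
Thanks in advance!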
Answer 0 (score: 0)
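Note that in a SignedData object there are several hash values to consider, and these are generally not equal.
See the definition of Cryptographic Message Syntax (CMS) objects in RFC 3852.
(RFC 3852 is the RFC referenced by the current PDF specification, ISO 32000-1; so even though it has been obsoleted by RFC 5652, the changes in the newer RFC may not apply in this context.)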
SignedData ::= SEQUENCE {
version CMSVersion,
digestAlgorithms DigestAlgorithmIdentifiers,
encapContentInfo EncapsulatedContentInfo,
certificates [0] IMPLICIT CertificateSet OPTIONAL,
crls [1] IMPLICIT RevocationInfoChoices OPTIONAL,
signerInfos SignerInfos }
...
SignerInfo ::= SEQUENCE {
version CMSVersion,
sid SignerIdentifier,
digestAlgorithm DigestAlgorithmIdentifier,
signedAttrs [0] IMPLICIT SignedAttributes OPTIONAL,
signatureAlgorithm SignatureAlgorithmIdentifier,
signature SignatureValue,
unsignedAttrs [1] IMPLICIT UnsignedAttributes OPTIONAL }
...
SignedAttributes ::= SET SIZE (1..MAX) OF Attribute
...
signedAttrs is a collection of attributes that are signed. The
field is optional, but it MUST be present if the content type of
the EncapsulatedContentInfo value being signed is not id-data.
SignedAttributes MUST be DER encoded, even if the rest of the
structure is BER encoded. Useful attribute types, such as signing
time, are defined in Section 11. If the field is present, it MUST
contain, at a minimum, the following two attributes:
A content-type attribute having as its value the content type
of the EncapsulatedContentInfo value being signed. Section
11.1 defines the content-type attribute. However, the
content-type attribute MUST NOT be used as part of a
countersignature unsigned attribute as defined in section 11.4.
A message-digest attribute, having as its value the message
digest of the content. Section 11.2 defines the message-digest
attribute.
...
The result of the message digest calculation process depends on
whether the signedAttrs field is present. When the field is absent,
the result is just the message digest of the content as described
above. When the field is present, however, the result is the message
digest of the complete DER encoding of the SignedAttrs value
contained in the signedAttrs field. Since the SignedAttrs value,
when present, must contain the content-type and the message-digest
attributes, those values are indirectly included in the result.
So your observation that
the messageDigest field equals the digest computed over the signedContent obtained via ByteRange
5178:d=6 hl=2 l= 47 cons: SEQUENCE
5180:d=7 hl=2 l= 9 prim: OBJECT :messageDigest
5191:d=7 hl=2 l= 34 cons: SET
5193:d=8 hl=2 l= 32 prim: OCTET STRING [HEX DUMP]:18B399D208A08815DDF23C93B1B63B13757A6AA24B1932569D7A69D0DB3A34C2
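indicates that the correct data has been signed, because the value of the message-digest attribute is supposed to be the message digest of the content.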
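For reference, that comparison could be reproduced along these lines. This is only a minimal sketch: the helper name signed_content_digest, the demo file, and the ByteRange numbers are made up for illustration. With a real PDF you would pass the four numbers from the /ByteRange entry of the signature dictionary.

```shell
#!/usr/bin/env bash
# Hash the two byte ranges named by /ByteRange [a b c d]:
# b bytes starting at offset a, followed by d bytes starting at offset c.
signed_content_digest() {
  # usage: signed_content_digest FILE a b c d   (hypothetical helper)
  {
    dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null
    dd if="$1" bs=1 skip="$4" count="$5" 2>/dev/null
  } | openssl dgst -sha256 -r | cut -d' ' -f1
}

# Demo on a made-up stand-in file, not a real PDF: 4 signed bytes,
# a 15-byte gap standing in for the /Contents value, 4 more signed bytes.
demo=$(mktemp)
printf 'AAAA<signature-hex>BBBB' > "$demo"
digest=$(signed_content_digest "$demo" 0 4 19 4)
echo "$digest"
```

The printed value is lowercase hex, while asn1parse prints the messageDigest OCTET STRING in uppercase, so compare the two case-insensitively.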
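But as you can also read in the RFC excerpt above, what the inner signature bytes (the ones you decrypted) actually sign is not the message digest of the content but the signedAttrs attribute collection!
Therefore, you must not verify these signature bytes against the content hash but against the hash of the signed attributes, as described in the RFC.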
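A rough sketch of how that hash could be computed with the same tools follows. One detail from section 5.4 of the RFC matters here: for digest calculation the signedAttrs are encoded with an explicit SET OF tag (0x31) instead of the IMPLICIT [0] tag (0xA0) that appears inside the SignerInfo. The helper name signed_attrs_digest and the demo bytes are hypothetical. For the real file, the offset is the one asn1parse shows for the cont [ 0 ] element inside the SignerInfo (elided in the listing above), and the total length is its hl plus l.

```shell
#!/usr/bin/env bash
# Digest of the signedAttrs as they are actually signed:
# same bytes as in the SignerInfo, but with the leading tag 0xA0 replaced by 0x31.
signed_attrs_digest() {
  # usage: signed_attrs_digest DER_FILE OFFSET TOTAL_LEN   (hypothetical helper)
  {
    printf '\061'   # 0x31, the SET OF tag
    dd if="$1" bs=1 skip="$(( $2 + 1 ))" count="$(( $3 - 1 ))" 2>/dev/null
  } | openssl dgst -sha256 -r | cut -d' ' -f1
}

# Demo on made-up bytes, not the OP's file: two padding bytes,
# then A0 03 02 01 05 (a [0] element wrapping INTEGER 5).
demo=$(mktemp)
printf '\000\000\240\003\002\001\005' > "$demo"
attrs_digest=$(signed_attrs_digest "$demo" 2 5)
echo "$attrs_digest"
```

If everything is in order, the value computed this way for the real signedAttrs is the one that should match the digest inside the decrypted DigestInfo, not the messageDigest attribute value.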
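PS: The OP has meanwhile found this other answer on the topic of CMS signed-data verification, which also shows more graphically how to identify which attributes are signed and which are not.
PPS: The OP verifies by decrypting the signature bytes, extracting the hash value they contain, and comparing it with the actual hash value. This works for RSA-based signatures. DSA- or ECDSA-based signatures, however, cannot be decrypted, so the hash value cannot be extracted; they have to be verified with a dedicated verification routine.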
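One such routine that is available from the command line is openssl dgst -verify, which takes the signed data, the signature bytes, and the public key and does the comparison internally. The sketch below is only an illustration under assumptions: it generates a throwaway EC key and signs made-up bytes so that it runs on its own, and all file names are hypothetical. In the real case the data file would hold the re-tagged signedAttrs, the signature file the extracted signature bytes, and the key would be the signer's public key.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)

# Throwaway ECDSA key pair, standing in for the signer's key (demo only).
openssl ecparam -name prime256v1 -genkey -noout -out "$work/key.pem"
openssl pkey -in "$work/key.pem" -pubout -out "$work/pub.pem"

# Made-up bytes standing in for the re-tagged signedAttrs.
printf '\061\003\002\001\005' > "$work/signedattrs.der"

# Produce a demo signature, then verify it without "decrypting" anything.
openssl dgst -sha256 -sign "$work/key.pem" -out "$work/sig.bin" "$work/signedattrs.der"
result=$(openssl dgst -sha256 -verify "$work/pub.pem" \
  -signature "$work/sig.bin" "$work/signedattrs.der")
echo "$result"
```

The same verify call also works for RSA signatures, so it can replace the decrypt-and-compare approach in both cases.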
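PPPS: There are different styles of integrated PDF signatures. Although the style used here (PKCS7 / CAdES detached) is the most common and the recommended one, a general solution must first check which style is in use and then verify accordingly.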
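The style is named by the /SubFilter entry of the signature dictionary, for example adbe.pkcs7.detached or ETSI.CAdES.detached for the detached styles mentioned above. A quick way to look at it from bash might be the following sketch. It assumes the signature dictionary is stored as plain, uncompressed text in the file; the helper name subfilters and the demo file are made up for illustration.

```shell
#!/usr/bin/env bash
# List the /SubFilter names found in a file, one per line.
subfilters() {
  # usage: subfilters FILE   (hypothetical helper)
  grep -a -o '/SubFilter */[A-Za-z0-9._]*' "$1" | sed 's|.*/||'
}

# Demo on a made-up dictionary fragment, not a real PDF.
demo=$(mktemp)
printf '<< /Type /Sig /Filter /Adobe.PPKLite /SubFilter /adbe.pkcs7.detached >>\n' > "$demo"
style=$(subfilters "$demo")
echo "$style"
```

A file with several signatures yields several lines, and each signature would have to be checked according to its own style.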