Question

I recently came across the following code sample for encrypting a file with AES-256 in CBC mode, with a SHA-256 HMAC for authentication and validation:

aes_key, hmac_key = self.keys
# create a PKCS#7 pad to get us to `len(data) % 16 == 0`
pad_length = 16 - len(data) % 16
data = data + (pad_length * chr(pad_length))
# get IV
iv = os.urandom(16)
# create cipher
cipher = AES.new(aes_key, AES.MODE_CBC, iv)
data = iv + cipher.encrypt(data)
sig = hmac.new(hmac_key, data, hashlib.sha256).digest()
# return the encrypted data (iv, followed by encrypted data, followed by hmac sig):
return data + sig

Since in my case I am encrypting much more than a string, namely a fairly large file, I modified the code to do the following:

aes_key, hmac_key = self.keys
iv = os.urandom(16)
cipher = AES.new(aes_key, AES.MODE_CBC, iv)

with open('input.file', 'rb') as infile:
    with open('output.file', 'wb') as outfile:
        # write the iv to the file:
        outfile.write(iv)

        # start the loop
        end_of_line = False

        while True:
            input_chunk = infile.read(64 * 1024)

            if len(input_chunk) == 0:
                # we have reached the end of the input file and it matches `% 16 == 0`
                # so pad it with 16 bytes of PKCS#7 padding:
                end_of_line = True
                input_chunk += 16 * chr(16)
            elif len(input_chunk) % 16 > 0:
                # we have reached the end of the input file and it doesn't match `% 16 == 0`
                # pad it by the remainder of bytes in PKCS#7:
                end_of_line = True
                input_chunk_remainder = 16 - (len(input_chunk) % 16)
                input_chunk += input_chunk_remainder * chr(input_chunk_remainder)

            # write out encrypted data and an HMAC of the block
            outfile.write(cipher.encrypt(input_chunk) + hmac.new(hmac_key, data, 
                    hashlib.sha256).digest())

            if end_of_line:
                break

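Simply put, this reads the input file in 64KB blocks, encrypts each block, generates an HMAC-SHA-256 of the encrypted data, and appends that HMAC after each block. Decryption will read 64KB + 32B chunks, compute the HMAC of the first 64KB, and compare it against the SHA-256 sum occupying the last 32 bytes of the chunk.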

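Is this the right way to use an HMAC? Does it provide security and authentication, ensuring that the data was not modified and was decrypted with the right key?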

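FYI, the AES and HMAC keys are both derived from the same passphrase, which is generated by running the input text through SHA-512, then through bcrypt, then through SHA-512 again. The output of the final SHA-512 is then split into two chunks, one used for the AES key and the other for the HMAC.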

Answer 1

Yes, there are 2 security problems.

But first, I assume that with this statement at the end:

# write out encrypted data and an HMAC of the block
outfile.write(cipher.encrypt(input_chunk) + hmac.new(hmac_key, data, hashlib.sha256).digest())

you actually meant:

# write out encrypted data and an HMAC of the block
data = cipher.encrypt(input_chunk)
outfile.write(data + hmac.new(hmac_key, data, hashlib.sha256).digest())

since data is not defined anywhere.

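The first security problem is that you authenticate each piece independently of the others, but not the composition. In other words, an attacker can reshuffle, duplicate, or remove any of the chunks and the receiver will not notice.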
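To make the reordering problem concrete, here is a minimal sketch using only the standard library. The helper names `tag_chunks` and `verify_chunks`, the placeholder key, and the byte strings standing in for encrypted chunks are all made up for illustration; they are not part of the question's code.

```python
import hashlib
import hmac

def tag_chunks(hmac_key, chunks):
    """Attach an independent HMAC-SHA256 tag to each chunk (the scheme in the question)."""
    return [(c, hmac.new(hmac_key, c, hashlib.sha256).digest()) for c in chunks]

def verify_chunks(hmac_key, tagged):
    """Verify each (chunk, tag) pair on its own, as the question's receiver would."""
    return all(
        hmac.compare_digest(hmac.new(hmac_key, c, hashlib.sha256).digest(), t)
        for c, t in tagged
    )

key = b"k" * 32  # placeholder key, for illustration only
tagged = tag_chunks(key, [b"chunk-0", b"chunk-1", b"chunk-2"])

reordered = [tagged[2], tagged[0], tagged[1]]  # attacker shuffles the chunks
truncated = tagged[:1]                         # attacker drops chunks
duplicated = tagged + [tagged[0]]              # attacker repeats a chunk

# Each tampered sequence still verifies: no tag covers the order or the count.
for tampered in (reordered, truncated, duplicated):
    print(verify_chunks(key, tampered))
```

Modifying the bytes of a chunk is still detected; what goes unnoticed is any rearrangement of otherwise valid (chunk, tag) pairs.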

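A more secure approach is to have only one HMAC instance, pass all the encrypted data to it via the update method, and output a single digest at the very end.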
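A rough sketch of that single-HMAC approach, using the standard `hmac` module. The function name `mac_over_chunks` and the sample byte strings are hypothetical; in the real loop each chunk would be the IV or the output of `cipher.encrypt(...)`, written to the output file at the same point where it is fed to the MAC.

```python
import hashlib
import hmac

def mac_over_chunks(hmac_key, encrypted_chunks):
    """Return one HMAC-SHA256 tag covering every chunk, in order."""
    mac = hmac.new(hmac_key, digestmod=hashlib.sha256)
    for chunk in encrypted_chunks:
        mac.update(chunk)  # in the real loop, also write the chunk to outfile here
    return mac.digest()    # 32 bytes, written once at the very end of the file

key = b"k" * 32  # placeholder key, for illustration only
chunks = [b"iv-goes-first", b"encrypted-chunk-1", b"encrypted-chunk-2"]
tag = mac_over_chunks(key, chunks)
```

Feeding the IV to the MAC as well, as the first code sample in the question does, means a modified IV is caught too. The receiver recomputes the tag the same way and compares it with `hmac.compare_digest` before trusting any plaintext.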

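Alternatively, if you want the receiver to be able to detect tampering before receiving the whole file, you can output the intermediate MAC for each piece. In fact, a call to digest does not change the state of the HMAC; you can keep calling update afterwards.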
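The intermediate-MAC variant could look roughly like the following. The names `running_tags` and `verify_running` are invented for this sketch, and the byte strings in the usage lines stand in for encrypted chunks.

```python
import hashlib
import hmac

def running_tags(hmac_key, encrypted_chunks):
    """Yield (chunk, tag) pairs where each tag covers all chunks seen so far."""
    mac = hmac.new(hmac_key, digestmod=hashlib.sha256)
    for chunk in encrypted_chunks:
        mac.update(chunk)
        # digest() leaves the object usable, so update() can be called again
        yield chunk, mac.digest()

def verify_running(hmac_key, tagged):
    """Check each intermediate tag as it arrives; stop at the first mismatch."""
    mac = hmac.new(hmac_key, digestmod=hashlib.sha256)
    for chunk, tag in tagged:
        mac.update(chunk)
        if not hmac.compare_digest(mac.digest(), tag):
            return False
    return True

key = b"k" * 32  # placeholder key, for illustration only
tagged = list(running_tags(key, [b"a", b"b", b"c"]))
print(verify_running(key, tagged))
```

Because every tag depends on everything before it, reordered or duplicated chunks fail at the first affected tag. Dropping trailing chunks is not caught by this check alone; the receiver still needs some way to recognize the final chunk, which this sketch leaves out.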

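The second security problem is that you do not use a salt for your key derivation (I say that because you do not send one). Apart from making password cracking easier, this means that if you encrypt two or more files with the same password, an attacker can also freely mix chunks taken from any of the encrypted files, because the HMAC key is the same. Solution: use a salt.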

One last minor thing: infile.read(64 * 1024) may return fewer than 64*1024 bytes, but that does not mean you reached the end of the file.
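One way to guard against short reads is a small helper that keeps reading until it has the requested amount or the stream is truly exhausted. This is only a sketch: `read_exact` and the `TrickleStream` test class are hypothetical names, and it assumes a blocking stream where an empty result means end of file.

```python
import io

def read_exact(stream, size):
    """Read until `size` bytes are collected or the stream is exhausted."""
    parts = []
    remaining = size
    while remaining > 0:
        piece = stream.read(remaining)
        if not piece:  # empty result: genuine end of file
            break
        parts.append(piece)
        remaining -= len(piece)
    return b"".join(parts)

class TrickleStream:
    """Stand-in stream that never returns more than 3 bytes per read() call."""
    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size):
        return self._buf.read(min(size, 3))

stream = TrickleStream(b"0123456789")
first = read_exact(stream, 8)   # full chunk despite the short reads
second = read_exact(stream, 8)  # shorter than requested: this is the last chunk
```

With such a helper in the loop, a chunk shorter than the requested size (including an empty one) reliably marks the end of the input, so the padding branches fire only once.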

Answer 2

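I don't think there is a security problem with what you are doing with the HMACs (which does not mean there is no problem with the security), but I don't know what actual value HMACing sub-elements of the ciphertext gets you. Unless you want to support partial recovery of the plaintext in the event of tampering, there is not much reason to incur the overhead of HMACing 64 KB blocks rather than the full ciphertext.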

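From a key generation perspective, it might make more sense to use a key generated from your passphrase to encrypt two randomly generated keys, and then use those random keys to perform the HMAC and AES operations. I know that using the same key for both your block cipher and your HMAC is bad news, but I don't know whether using keys generated in the same manner is similarly bad.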

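At the very least, you should adjust your key derivation mechanism. bcrypt is a password hashing mechanism, not a key derivation function. You should use PBKDF2 to do the key derivation.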
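A minimal sketch of PBKDF2-based derivation with the standard library's `hashlib.pbkdf2_hmac`. The function name `derive_keys`, the sample passphrase, the 16-byte salt length, and the iteration count are assumptions chosen for illustration, not recommendations from this answer; the iteration count in particular should be tuned to the target hardware.

```python
import hashlib
import os

def derive_keys(passphrase, salt, iterations=200_000):
    """Derive a 32-byte AES key and a 32-byte HMAC key with PBKDF2-HMAC-SHA512."""
    material = hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt, iterations, dklen=64
    )
    return material[:32], material[32:]  # first half for AES, second half for HMAC

salt = os.urandom(16)  # fresh random salt per file; not secret
aes_key, hmac_key = derive_keys("example passphrase", salt)
```

PBKDF2 takes a salt as an input, so this also covers the missing-salt concern: the salt would be written in the clear at the start of the output file, next to the IV, so that the receiver can rerun the same derivation.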

HMAC-SHA256 with AES-256 in CBC mode
