Question

I have run into a problem on Windows when reading and decoding an encoded file with EncodingGroovyMethods#decodeBase64:

getClass().getResourceAsStream('/endoded_file').text.decodeBase64()

This gives me:

bad character in base64 value

The file itself has CRLF line endings, and the Groovy decodeBase64 implementation contains the following snippet, with a comment:

            } else if (sixBit == 66) {
                // RFC 2045 says that I'm allowed to take the presence of
                // these characters as evidence of data corruption
                // So I will
                throw new RuntimeException("bad character in base64 value"); // TODO: change this exception type
            }

I checked RFC 2045, and a CRLF pair is considered legal there. I tried the same input with org.apache.commons.codec.binary.Base64#decodeBase64, and it works. Is this a bug in Groovy, or is it intentional?

I am using Groovy 2.4.7.

Answer 1

This is not a bug, but a different way of handling corrupted data. If you look at the Base64 source code in Apache Commons, you can see this documentation:

 * Ignores all non-base64 characters. This is how chunked (e.g. 76 character) data is handled, since CR and LF are
 * silently ignored, but has implications for other bytes, too. This method subscribes to the garbage-in,
 * garbage-out philosophy: it will not check the provided data for validity.

So, while the Apache Base64 decoder silently ignores corrupted data, Groovy complains about it. The RFC is somewhat vague on this point:

   In base64 data, characters other than those in Table 1, line breaks, and other
   white space probably indicate a transmission error, about which a
   warning message or even a message rejection might be appropriate
   under some circumstances.

Since a warning message is almost useless (who checks warnings?), the Groovy authors decided to go down the "message rejection" path.
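The two philosophies can be seen side by side without either library. The following is only an illustrative sketch using the JDK's own `java.util.Base64` (not the Groovy or Apache Commons code discussed above): its basic decoder rejects characters outside the Base64 alphabet, while its MIME decoder ignores them. The class name `StrictVsLenient` and the sample string are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class StrictVsLenient {

    // Returns true if the strict (basic) JDK decoder rejects the input.
    static boolean strictRejects(String encoded) {
        try {
            Base64.getDecoder().decode(encoded);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown for any character outside the Base64 alphabet, including CR and LF.
            return true;
        }
    }

    // Decodes with the MIME decoder, which skips line separators
    // and other non-alphabet characters.
    static String lenientDecode(String encoded) {
        byte[] bytes = Base64.getMimeDecoder().decode(encoded);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "hello world" in Base64, split across two CRLF-terminated lines (hypothetical sample).
        String chunked = "aGVsbG8g\r\nd29ybGQ=\r\n";

        System.out.println(strictRejects(chunked)); // prints "true"
        System.out.println(lenientDecode(chunked)); // prints "hello world"
    }
}
```

Neither behavior is wrong; the strict decoder takes the "message rejection" route, and the lenient one follows the garbage-in, garbage-out approach.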

TL;DR: both are fine; they are just different ways of handling corrupted data. If you can, try to fix or reject incorrect data.
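One way to "fix" the data when the only irregularity is line wrapping is to remove whitespace before handing the text to a strict decoder. Below is a minimal sketch of that idea, again using the JDK's `java.util.Base64` as a stand-in for a strict decoder; the helper name `decodeIgnoringWhitespace` is hypothetical. The same pre-cleaning step should carry over to the Groovy call in the question, but that is an assumption and has not been verified against Groovy 2.4.7.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class StripThenDecode {

    // Removes all whitespace (CR, LF, spaces, tabs) and then decodes strictly,
    // so any other stray character still triggers an IllegalArgumentException.
    static byte[] decodeIgnoringWhitespace(String encoded) {
        String compact = encoded.replaceAll("\\s", "");
        return Base64.getDecoder().decode(compact);
    }

    public static void main(String[] args) {
        // Hypothetical CRLF-wrapped input, as it might be read from a file on Windows.
        String fromFile = "aGVsbG8g\r\nd29ybGQ=\r\n";
        byte[] decoded = decodeIgnoringWhitespace(fromFile);
        System.out.println(new String(decoded, StandardCharsets.UTF_8)); // prints "hello world"
    }
}
```

This keeps the rejection of genuinely corrupted input while tolerating the line breaks that RFC 2045 allows in encoded bodies.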
