Serializing base64-encoded data in JSON

Date: 2016-05-14 09:35:16

Tags: json python-3.x serialization base64

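I'm writing a script to automate the generation of demo data, and I need to serialize some of that data as JSON. Part of the data is an image, so I base64-encoded it, but when I run my script I get: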

Traceback (most recent call last):
  File "lazyAutomationScript.py", line 113, in <module>
    json.dump(out_dict, outfile)
  File "/usr/lib/python3.4/json/__init__.py", line 178, in dump
    for chunk in iterable:
  File "/usr/lib/python3.4/json/encoder.py", line 422, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.4/json/encoder.py", line 396, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.4/json/encoder.py", line 396, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.4/json/encoder.py", line 429, in _iterencode
    o = _default(o)
  File "/usr/lib/python3.4/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
  TypeError: b'iVBORw0KGgoAAAANSUhEUgAADWcAABRACAYAAABf7ZytAAAABGdB...
     ...
   BF2jhLaJNmRwAAAAAElFTkSuQmCC' is not JSON serializable

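As far as I know, anything that is base64-encoded (a PNG image in this case) is just a string, so it should not cause any problem for serialization. What am I missing?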

3 answers:

Answer 0 (score: 47)

You have to pay attention to the data types.

If you read a binary image, you get bytes. If you encode those bytes in base64, you get... bytes again! (See the documentation on b64encode.)

json cannot handle raw bytes, which is why you get the error.
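A minimal sketch of that difference, assuming a few placeholder bytes in place of a real image (the name `fake_image` is made up for illustration): `json.dumps` rejects the bytes returned by `b64encode`, but accepts the same value once it has been decoded to `str`.

```python
import json
from base64 import b64encode

# Placeholder bytes standing in for real image data (an assumption for this sketch).
fake_image = b'\x89PNG\r\n\x1a\n'

encoded = b64encode(fake_image)
print(type(encoded))  # <class 'bytes'>

# Passing the bytes object straight to json raises TypeError.
try:
    json.dumps({'image': encoded})
    raised = False
except TypeError:
    raised = True

# Base64 output only contains ASCII characters, so decoding it to str is safe.
as_text = encoded.decode('ascii')
payload = json.dumps({'image': as_text})
```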

I have just written a small example, with comments. I hope it helps:

from base64 import b64encode
from json import dumps

ENCODING = 'utf-8'
IMAGE_NAME = 'spam.jpg'
JSON_NAME = 'output.json'

# first: reading the binary stuff
# note the 'rb' flag
# result: bytes
with open(IMAGE_NAME, 'rb') as open_file:
    byte_content = open_file.read()

# second: base64 encode read data
# result: bytes (again)
base64_bytes = b64encode(byte_content)

# third: decode these bytes to text
# result: string (in utf-8)
base64_string = base64_bytes.decode(ENCODING)

# optional: doing stuff with the data
# result here: some dict
raw_data = {IMAGE_NAME: base64_string}

# now: encoding the data to json
# result: string
json_data = dumps(raw_data, indent=2)

# finally: writing the json string to disk
# note the 'w' flag, no 'b' needed as we deal with text here
with open(JSON_NAME, 'w') as another_open_file:
    another_open_file.write(json_data)
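For completeness, the reverse direction could look roughly like the sketch below. It is not part of the original answer: it uses a temporary directory and placeholder bytes instead of `spam.jpg`, and the key name `'image'` is an assumption chosen for illustration.

```python
import json
import os
import tempfile
from base64 import b64decode, b64encode

# Placeholder bytes standing in for the image content (an assumption for this sketch).
original = bytes(range(256))

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, 'output.json')

    # write: bytes -> base64 bytes -> str -> JSON text
    with open(json_path, 'w', encoding='utf-8') as out_file:
        json.dump({'image': b64encode(original).decode('ascii')}, out_file)

    # read: JSON text -> dict holding the base64 string
    with open(json_path, 'r', encoding='utf-8') as in_file:
        loaded = json.load(in_file)

# b64decode accepts an ASCII str and returns the original bytes
restored = b64decode(loaded['image'])
```

The restored bytes can then be written back to disk with the `'wb'` flag if the image file itself is needed.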

Answer 1 (score: 2)

An alternative solution is to encode the content on the fly with a custom encoder:

import json
from base64 import b64encode

class Base64Encoder(json.JSONEncoder):
    # pylint: disable=method-hidden
    def default(self, o):
        if isinstance(o, bytes):
            return b64encode(o).decode()
        return json.JSONEncoder.default(self, o)

With that defined, you can do:

m = {'key': b'\x9c\x13\xff\x00'}
json.dumps(m, cls=Base64Encoder)

It will produce:

'{"key": "nBP/AA=="}'
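One thing to keep in mind with this approach is that JSON has no bytes type, so loading the result gives back a plain string, and the caller has to know which fields to decode. A rough sketch of the round trip, repeating the encoder so the snippet runs on its own (the `restored` dict is only an illustration, not part of the original answer):

```python
import json
from base64 import b64decode, b64encode

class Base64Encoder(json.JSONEncoder):
    def default(self, o):
        # default() is only called for objects json cannot serialize itself
        if isinstance(o, bytes):
            return b64encode(o).decode()
        return super().default(o)

m = {'key': b'\x9c\x13\xff\x00'}
text = json.dumps(m, cls=Base64Encoder)

# Loading yields a str for 'key'; nothing marks it as former bytes,
# so the field has to be decoded explicitly.
loaded = json.loads(text)
restored = {'key': b64decode(loaded['key'])}
```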

Answer 2 (score: 0)

  

"What am I missing?"

The error is shouting that binary data (bytes) is not JSON serializable.

from base64 import b64encode

# *binary representation* of the base64 string
assert b64encode(b"binary content") == b'YmluYXJ5IGNvbnRlbnQ='

# base64 string
assert b64encode(b"binary content").decode('utf-8') == 'YmluYXJ5IGNvbnRlbnQ='

The latter is definitely "JSON serializable", because it is the base64 string representation of the binary b"binary content".