Trying to handle Unicode problems with a JSON file

Time: 2017-12-27 21:03:41

Tags: python json unicode

I have a Python script that writes JSON to a file with contents like this:

{
    "album": "Night Hawk",
    "album_artist": "Coleman Hawkins with Eddie \u201cLockjaw\u201d Davis",
    "artist": "Coleman Hawkins with Eddie \u201cLockjaw\u201d Davis",
    "bitrate": 744,
    ...
}

The file is uploaded to a server and processed with the following code:

with open(settings.JSON_UPLOAD_DIRECTORY + f.name, 'wb+') as destination:
    for chunk in f.chunks():
        destination.write(chunk)

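There are no errors on my macOS development server, and so far this has processed thousands of files on my deployment server as well. Suddenly, I am getting this error: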

22.     with open(settings.JSON_UPLOAD_DIRECTORY + f.name, 'wb+') as destination:

Exception Value: 'ascii' codec can't encode character '\u201c' in position 81: ordinal not in range(128)

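I have read other posts here about this, but I don't understand what I am doing wrong. I am running Python 3.6. My question is whether I need to adjust the statement that opens the in-memory file for writing, or whether there is a problem with the encoding of the JSON file itself.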

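1 answer:

Answer 0: (score: 0)

I found inspiration in another answer. Maybe it will help? I believe the idea is to use the same encoding both when writing and when reading the file.

As the code below shows, I took the liberty of adding several more problematic characters to the JSON string (the "album" attribute).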

This seems to output a correctly parsed dictionary:

json_str = """{
    "album": "Ñíght Håwk 你好",
    "album_artist": "Coleman Hawkins with Eddie \u201cLockjaw\u201d Davis",
    "artist": "Coleman Hawkins with Eddie \u201cLockjaw\u201d Davis",
    "bitrate": 744
}"""


import json
import tempfile
import os

print(json.loads(json_str))  # Just double checking
path = os.path.join(tempfile.gettempdir(), 'foo.txt')

with open(path, 'w+', encoding='utf-8') as destination:
    # The encoding= is the important part
    destination.write(json_str)

with open(path, 'r+', encoding='utf-8') as source:
    # The encoding= is the important part
    print(json.loads(source.read()))

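However, this output also depends on how your terminal is configured, so I can't be 100% sure it will work in your case. I am using Python 3.5.1. For Python 2.6.x, you should use the io module (io.open accepts an encoding argument).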
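Since the result depends on the environment, here is a rough sketch for checking which encodings Python has picked up, followed by the same round trip written with io.open so it also runs on Python 2.6+. The helper names describe_encodings and roundtrip_json and the file name encoding_check.json are made up for illustration. One thing that may be worth checking: the traceback points at the open() line, and the file is opened in binary mode, so a filesystem encoding of 'ascii' on the deployment server could explain the error if the uploaded file's name contains the curly quote. That is a guess, not something I have verified on your setup.

```python
import io
import json
import locale
import os
import sys
import tempfile


def describe_encodings():
    """Return the encodings Python picked up from the environment."""
    return {
        # Used to encode str file names passed to open().
        "filesystem": sys.getfilesystemencoding(),
        # Default for text-mode open() when no encoding= is given.
        "preferred": locale.getpreferredencoding(False),
        # What print() uses; may be None if stdout has been replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


def roundtrip_json(text, path):
    """Write text as UTF-8 with io.open, read it back, and parse it."""
    with io.open(path, "w", encoding="utf-8") as destination:
        destination.write(text)
    with io.open(path, "r", encoding="utf-8") as source:
        return json.loads(source.read())


if __name__ == "__main__":
    for name, value in sorted(describe_encodings().items()):
        print("%s: %s" % (name, value))
    # The \\u201c sequences are JSON escapes, so this source stays ASCII.
    sample = u'{"artist": "Eddie \\u201cLockjaw\\u201d Davis"}'
    path = os.path.join(tempfile.gettempdir(), "encoding_check.json")
    print(roundtrip_json(sample, path) == json.loads(sample))
```

If "filesystem" reports ascii on the server but utf-8 on the development machine, the difference is likely in the locale environment variables of the server process rather than in the JSON content.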