Question

Possible duplicate:
How to get string Objects instead Unicode ones from JSON in Python?

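I have a lot of input in the form of multi-level dictionaries parsed from JSON API calls. The strings are all unicode, which means there is a lot of u'stuff like this'. I am using jq to work with the results and need to convert them to ASCII.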

I know I can write a function to convert it, like this:

def convert(input):
    if isinstance(input, dict):
        ret = {}
        for stuff in input:
            ret = convert(stuff)
    elif isinstance(input, list):
        ret = []
        for i in range(len(input))
            ret = convert(input[i])
    elif isinstance(input, str):
        ret = input.encode('ascii')
    elif :
        ret = input
    return ret

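Is this even correct? I'm not sure. That's not what I want to ask, though.

What I'm asking is this: the function above is a typical brute-force approach to the problem. There must be a better option, a more Pythonic one. I'm no expert on algorithms, but this one doesn't look particularly fast either.

So is there a better way? Or, if not, can this function be improved?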

Post-answer edit

Mark Amery's answer is correct, but I would like to post a modified version of it. His function works on Python 2.7+, and I'm on 2.6, so I had to convert it:

def convert(input):
    if isinstance(input, dict):
        return dict((convert(key), convert(value)) for key, value in input.iteritems())
    elif isinstance(input, list):
        return [convert(element) for element in input]
    elif isinstance(input, unicode):
        return input.encode('utf-8')
    else:
        return input

Answer 1

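Recursion seems like the way to go here, but if you're on Python 2.x you want to be checking for unicode, not str. (The str type represents a string of bytes, and the unicode type a string of unicode characters; neither inherits from the other, and it is the unicode-type strings that the interpreter displays with a u in front of them.)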

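There's also a small syntax error in your posted code (the trailing elif: should be an else), and you're not returning the same structure in the case where the input is a dictionary or a list. (For a dictionary, you return the converted version of the final key; for a list, you return the converted version of the final element. Neither is right!)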
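
To make that second point concrete, here is a rough sketch of the question's logic with only the syntax errors repaired. It is written for Python 3.7+ as an assumption (so str is the text type and dicts keep insertion order), and convert_buggy is a hypothetical name used purely for illustration:

```python
# Python 3 sketch of the question's logic, syntax errors fixed, bug left in.
# `convert_buggy` is a hypothetical name, not part of the original post.
def convert_buggy(data):
    if isinstance(data, dict):
        ret = {}
        for key in data:
            ret = convert_buggy(key)   # overwrites ret instead of filling the dict
    elif isinstance(data, list):
        ret = []
        for item in data:
            ret = convert_buggy(item)  # overwrites ret instead of appending
    elif isinstance(data, str):        # Python 3: str is the text type
        ret = data.encode('ascii')
    else:
        ret = data
    return ret


dict_result = convert_buggy({"a": "x", "b": "y"})
list_result = convert_buggy(["first", "last"])
print(dict_result, list_result)
```

Both calls collapse the container into a single byte string (the last key and the last element respectively), so the original structure is lost.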

You can also make your code pretty and Pythonic by using comprehensions.

Here, then, is what I'd recommend:

def convert(input):
    if isinstance(input, dict):
        return {convert(key): convert(value) for key, value in input.iteritems()}
    elif isinstance(input, list):
        return [convert(element) for element in input]
    elif isinstance(input, unicode):
        return input.encode('utf-8')
    else:
        return input
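
For readers who want to try the same idea on Python 3, the following is a loose port rather than the code above. It assumes that str stands in for Python 2's unicode, that items() replaces iteritems(), and that byte strings are what you want back. The name convert_py3 and the sample JSON are made up for this sketch:

```python
import json


# Python 3 sketch: `str` plays the role of Python 2's `unicode`.
# `convert_py3` is a hypothetical name used only for this example.
def convert_py3(data):
    if isinstance(data, dict):
        # Rebuild the dict, converting both keys and values recursively.
        return {convert_py3(k): convert_py3(v) for k, v in data.items()}
    elif isinstance(data, list):
        return [convert_py3(element) for element in data]
    elif isinstance(data, str):
        return data.encode('utf-8')
    else:
        # Numbers, booleans and None pass through unchanged.
        return data


# Made-up sample input standing in for a parsed JSON API response.
parsed = json.loads('{"name": "stuff like this", "tags": ["a", "b"], "count": 3}')
converted = convert_py3(parsed)
print(converted)
```

The nesting of the result mirrors the input: dictionaries stay dictionaries, lists stay lists, and only the text leaves are encoded.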

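One final thing. I changed encode('ascii') to encode('utf-8'). My reasoning is as follows: any unicode string that contains only characters from the ASCII character set is represented by the same byte string when encoded as ASCII as when encoded as UTF-8. Using UTF-8 instead of ASCII therefore cannot break anything, and the change will be invisible as long as the unicode strings you're dealing with use only ASCII characters. However, the change extends the scope of the function so that it can handle strings drawn from the entire unicode character set rather than just ASCII ones, should that ever be necessary.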
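
A quick way to check that reasoning for yourself is the sketch below. The sample strings are arbitrary, and it assumes Python 3, where encode() returns bytes:

```python
# Sketch only: sample strings are arbitrary illustrations.
ascii_only = u"stuff like this"

# For ASCII-only text, both codecs produce identical bytes.
same_bytes = ascii_only.encode("ascii") == ascii_only.encode("utf-8")

non_ascii = u"caf\u00e9"  # contains one non-ASCII character (e with acute accent)

# UTF-8 can represent it; the accented letter becomes two bytes.
utf8_bytes = non_ascii.encode("utf-8")

# The ASCII codec cannot represent it and raises UnicodeEncodeError.
try:
    non_ascii.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(same_bytes, utf8_bytes, ascii_failed)
```

So the switch costs nothing for ASCII-only data and avoids an exception as soon as a single non-ASCII character shows up.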

Python: converting a complex dictionary of strings from Unicode to ASCII
