How do I decode unicode characters with Python?

Asked: 2015-08-09 09:44:04

Tags: python json python-2.7 unicode

I am trying to import the following JSON file using Python:

The file is named new_json.json

{ "nextForwardToken": "f/3208873243596875673623625618474139659", "events": [ { "ingestionTime": 1045619, "timestamp": 1909000, "message": "2 32823453119 eni-889995t1 54.25.64.23 156.43.12.120 3389 23 6 342 24908 143234809 983246 ACCEPT OK" }] }

I have the following code to read the JSON file and remove the unicode characters:

import json

JSON_FILE = "new_json.json"
with open(JSON_FILE) as infile:
    print infile
    print '\n type of infile is \n', infile
    data = json.load(infile)
    str_data = str(data)  # convert to string to remove unicode characters
    wo_unicode = str_data.decode('unicode_escape').encode('ascii','ignore')
    print 'unicode characters have been removed \n'
    print wo_unicode

print wo_unicode still prints the unicode markers (i.e. the u prefix).

When I try to treat the JSON as a dictionary, the unicode characters cause problems:

for item in data:
    iden = item.get['nextForwardToken']

...which results in the error:

AttributeError: 'unicode' object has no attribute 'get'

This has to run in Python 2.7. Is there a simple way to do this?

2 Answers:

Answer 0: (score: 1)

The error has nothing to do with unicode. You are trying to treat the keys as dicts. Just call get on data itself to fetch 'nextForwardToken':

print data.get('nextForwardToken')

When you iterate over data you are iterating over its keys, so even with the correct syntax, 'nextForwardToken'.get('nextForwardToken'), "events".get('nextForwardToken') and so on obviously cannot work.
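To make the point about iteration concrete, here is a minimal sketch that parses the JSON from the question out of a string (the RAW constant is just a stand-in for reading new_json.json). It assumes nothing beyond the standard json module and runs on both Python 2.7 and Python 3.

```python
import json

# Stand-in for the contents of new_json.json from the question.
RAW = '''{ "nextForwardToken": "f/3208873243596875673623625618474139659",
  "events": [ { "ingestionTime": 1045619, "timestamp": 1909000,
  "message": "2 32823453119 eni-889995t1 54.25.64.23 156.43.12.120 3389 23 6 342 24908 143234809 983246 ACCEPT OK" }] }'''

data = json.loads(RAW)

# Iterating over a dict yields its keys, which are strings, not nested dicts.
keys = sorted(data)

# The token is a top-level value, so ask the dict itself for it.
token = data.get('nextForwardToken')

# The nested dicts live inside the "events" list, so .get() works on each of them.
messages = [event.get('message') for event in data['events']]
```

If you need to loop, loop over data['events'] and not over data.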

Whether you use data.get(u'nextForwardToken') or data.get('nextForwardToken'), you get back the value for that key:

In [9]: 'nextForwardToken' == u'nextForwardToken'
Out[9]: True
In [10]: data[u'nextForwardToken']
Out[10]: u'f/3208873243596875673623625618474139659'   
In [11]: data['nextForwardToken']
Out[11]: u'f/3208873243596875673623625618474139659'
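The u in the Out lines above is only part of how Python 2 displays a unicode object (its repr). It is not a character inside the string. A small sketch of the difference, using a cut-down version of the question's data; the variable names are just for illustration:

```python
import json

data = json.loads('{"nextForwardToken": "f/3208873243596875673623625618474139659"}')
token = data['nextForwardToken']

# repr() is what str(dict) and the interactive prompt use for each element.
# On Python 2 it starts with u'...'; on Python 3 it is just '...'.
shown_repr = repr(token)

# Formatting the value as text shows only its characters.
shown_text = '%s' % token

# To print the whole structure without repr noise, serialize it back to JSON.
serialized = json.dumps(data, sort_keys=True)
```

This is why str(data) in the question still shows the u markers: converting the dict to a string embeds the repr of every key and value.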

Answer 1: (score: 0)

This code will give you str values without unicode:
{{1}}
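One possible way to get plain str values under Python 2.7, offered only as a sketch and not necessarily what this answer had in mind, is to walk the parsed structure and encode every unicode string to UTF-8. The helper name to_native_str and the PY2 flag are hypothetical names introduced for this example. On Python 3 the helper leaves strings untouched, since str is already the text type there.

```python
import json
import sys

PY2 = sys.version_info[0] == 2


def to_native_str(value):
    """Recursively convert parsed JSON so that all strings are plain str."""
    if isinstance(value, dict):
        return dict((to_native_str(k), to_native_str(v))
                    for k, v in value.items())
    if isinstance(value, list):
        return [to_native_str(v) for v in value]
    # Only Python 2 has a separate unicode type; encode it to a byte str.
    if PY2 and isinstance(value, unicode):  # noqa: F821 (Python 2 only)
        return value.encode('utf-8')
    return value


raw = '{"nextForwardToken": "f/3208873243596875673623625618474139659", "events": [{"timestamp": 1909000}]}'
plain = to_native_str(json.loads(raw))
```

Keep in mind that, as the other answer shows, lookups work the same with or without this conversion, so it mostly matters when another library insists on str.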