Question

I have a JSON file containing non-ASCII characters, mostly Spanish accented vowels. I have set the file's encoding to UTF-8 (in vim, with :set fileencoding=utf8).

Here is an excerpt of the file, for reference:

[
  {
    "location": "SRID=4326;POINT(-1.7944444440000000 43.3798499999999976)",
    "description": "",
    "name": "Fuenterrabía",
    "_id": 162
  },
...
]

Note the 'í' in the name field.

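That said, I need to iterate over these elements in my code and eventually pass each one to the create method of a FactoryBoy Factory class. The problem is that somewhere in this process the characters get messed up and I get the infamous: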

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 143: ordinal not in range(128)

Here is the code that parses the file:

    def _get_spot_data(self, filename):
        data_file = str(settings.ROOT_DIR('fixtures/meteo/'+filename))
        # Open file with utf-8 encoding
        f = codecs.open(data_file, 'r', 'utf8')
        data = simplejson.loads(f.read(), 'utf-8')
        return data

Looking at the first item of the list (the one shown in the excerpt), I get this:

{u'_id': 162, u'location': u'SRID=4326;POINT(-1.7944444440000000 43.3798499999999976)', u'name': u'Fuenterrab\xeda', u'description': u''}

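As you can see, it looks like I am getting a byte representation of the 'í' character.

I have tried plain open instead of codecs.open, and json.loads instead of simplejson, but nothing worked. What am I doing wrong here?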

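EDIT: I tried creating a dummy entry by hand to see whether I would get the same error, and I did, so I guess something is going wrong inside FactoryBoy: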

dummy_one = {u'surfspot_id': 162, u'location': u'SRID=4326;POINT(-1.7944444440000000 43.3798499999999976)', u'name': u'Fuenterrabía', u'description': u''}
SpotFactory(**dummy_one)

The error is thrown again:

Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib/python2.7/logging/handlers.py", line 156, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 732, in format
    return fmt.format(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 474, in format
    s = self._fmt % record.__dict__
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 143: ordinal not in range(128)
Logged from file base.py, line 397

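I should say this is not a blocking issue, just an exception reported by the logging module, but I don't like seeing it and would like clean output.

Thanks!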

Answer 1

This works for me:

import json

with open(r'your/data.json') as f:
    data = json.load(f, encoding='utf-8')
    name = data[0]['name']

    print name
    print len(name)
    print name[-2:-1]

It prints:

Fuenterrabía
12
í

However, printing the whole dictionary instead of just the name still gives:

{
    u'_id': 162, 
    u'location': u'SRID=4326;POINT(-1.7944444440000000 43.3798499999999976)',
    u'name': u'Fuenterrab\xeda', 
    u'description': u''
}

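Don't let this output fool you. It is the repr of the dictionary, and in Python 2 the repr of a Unicode string escapes non-ASCII characters: \xed is the code point U+00ED (í), not a raw byte. The string itself is Unicode and contains the correct character.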
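To make the distinction concrete, here is a minimal sketch that should run on Python 3 as well as 2.7. The inline JSON text is a hypothetical stand-in for the file, and the variable names are only illustrative. It compares the escaped form of the parsed name with its UTF-8 bytes, and shows that decoding those bytes as ASCII fails on 0xc3, the same byte named in the logged error.

```python
from __future__ import print_function

import json

# Hypothetical stand-in for the file contents: one record as UTF-8 bytes.
# The \u00ed escape in this Python literal is the character í.
raw_bytes = u'[{"name": "Fuenterrab\u00eda", "_id": 162}]'.encode("utf-8")

# Decode explicitly, then parse; the strings in the result are Unicode text.
data = json.loads(raw_bytes.decode("utf-8"))
name = data[0]["name"]

# Escaped form of the text, like what Python 2 shows inside a printed dict.
escaped = name.encode("unicode_escape").decode("ascii")

# UTF-8 bytes of the same text; the accented vowel becomes two bytes.
utf8_bytes = name.encode("utf-8")

# Decoding those bytes as ASCII fails on the first byte of the vowel.
try:
    utf8_bytes.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(escaped)
print(len(name), len(utf8_bytes))
print(error_message)
```

If this reading is right, the parsed data is fine, and the UnicodeDecodeError in the traceback would come from a UTF-8 byte string being implicitly decoded as ASCII when the log message is formatted, rather than from the JSON loading step.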

JSON file's weird encoding problem with json.loads
