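I develop with numpy / pandas on a 64-bit Fedora box; in production the code is pushed to a 32-bit CentOS box, where json.dumps fails. It throws repr(0) is not Serializable.
I tried testing on 64-bit CentOS and it runs perfectly fine, but on 32-bit (CentOS 6.8, to be precise) it throws an error. Has anyone run into this before?
Below is the 64-bit Fedora session: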
Python 2.6.6 (r266:84292, Jun 30 2016, 09:54:10)
[GCC 5.3.1 20160406 (Red Hat 5.3.1-6)] on linux4
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> a = pd.DataFrame([{'a':1}])
>>>
>>> a
   a
0  1
>>> a.to_dict()
{'a': {0: 1}}
>>> import json
>>> json.dumps(a.to_dict())
'{"a": {"0": 1}}'
Below is the 32-bit CentOS run:
import json
import pandas as pd
a = pd.DataFrame( [ {'a': 1} ] )
json.dumps(a.to_dict())
Traceback (most recent call last):
  File "sample.py", line 5, in <module>
    json.dumps(a.to_dict())
  File "/usr/lib/python2.6/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.6/json/encoder.py", line 367, in encode
    chunks = list(self.iterencode(o))
  File "/usr/lib/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib/python2.6/json/encoder.py", line 268, in _iterencode_dict
    raise TypeError("key {0!r} is not a string".format(key))
TypeError: key 0 is not a string
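What is the usual workaround for this problem? I can't use a custom encoder for json, because the library I use to push this data expects a dictionary; it uses the json module internally to serialize the dictionary and send it over the wire.
Update: the Python version is 2.6.6 on both machines, and pandas is 0.16.1 on both.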
Answer 0: (score: 3)
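I believe this happens because the index values are of a numpy.intNN type whose size differs from that of the native Python int, and such values are not converted from one to the other automatically.
For example, on my 64-bit Python 2.7 with NumPy: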
>>> isinstance(numpy.int64(5), int)
True
>>> isinstance(numpy.int32(5), int)
False
And then:
>>> json.dumps({numpy.int64(5): '5'})
'{"5": "5"}'
>>> json.dumps({numpy.int32(5): '5'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/json/__init__.py", line 243, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
TypeError: keys must be a string
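The same behaviour can be shown without NumPy. This is only a rough sketch of the mechanism: json accepts a key if it is an instance of one of the native types (an int subclass is fine) and rejects a numeric object that is not. MyInt and accepted_as_key are hypothetical names made up for the illustration, and Fraction merely stands in for a numpy integer of the "wrong" size.

```python
import json
from fractions import Fraction


class MyInt(int):
    """Hypothetical int subclass, standing in for a numpy integer
    whose size matches the native int."""


def accepted_as_key(key):
    """Return True if json.dumps accepts `key` as a dict key."""
    try:
        json.dumps({key: 'x'})
        return True
    except TypeError:
        return False


print(accepted_as_key(MyInt(5)))     # int subclass: True
print(accepted_as_key(Fraction(5)))  # numeric but not an int: False
```

So the check is on the key's type, not on whether the value looks like an integer.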
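You can reproduce the error by casting the index to a numpy integer type that does not match the native int; on my 64-bit machine that is numpy.int32: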
>>> df = pd.DataFrame( [ {'a': 1}, {'a': 2} ] )
>>> df.index = df.index.astype(numpy.int32) # perhaps your index was of these?
>>> json.dumps(df.to_dict())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/json/__init__.py", line 243, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
TypeError: keys must be a string
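To confirm which keys are the culprits on the failing machine, something like the following might help. It is a minimal sketch using only the standard library: offending_keys is a hypothetical helper, and the list of accepted key types is my reading of what the json module checks for, so treat it as an assumption. Fraction is again only a stand-in for a numpy integer that json rejects.

```python
from fractions import Fraction

try:
    JSON_KEY_TYPES = (str, unicode, int, long, float, bool, type(None))  # Python 2
except NameError:
    JSON_KEY_TYPES = (str, int, float, bool, type(None))  # Python 3


def offending_keys(obj, path=()):
    """Return (path, type name) pairs for dict keys json would likely reject."""
    found = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            here = path + (key,)
            if not isinstance(key, JSON_KEY_TYPES):
                found.append((here, type(key).__name__))
            found.extend(offending_keys(value, here))
    elif isinstance(obj, (list, tuple)):
        for i, value in enumerate(obj):
            found.extend(offending_keys(value, path + (i,)))
    return found


# Nested dict shaped like DataFrame.to_dict() output, with one bad key.
report = offending_keys({'a': {0: 1, Fraction(1, 2): 2}})
print(report)
```

Running offending_keys(df.to_dict()) on the 32-bit box should then show the type name of the index values that json refuses.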
So you could try changing the index type to int32, int64, or just plain Python int. On my machine both of these work:
>>> df.index = df.index.astype(numpy.int64)
>>> json.dumps(df.to_dict())
'{"a": {"0": 1, "1": 2}}'
>>> df.index = df.index.astype(int)
>>> json.dumps(df.to_dict())
'{"a": {"0": 1, "1": 2}}'
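Since the library only accepts a dictionary, another option is to rebuild the dict with plain Python ints before handing it over. Below is a minimal sketch of that idea, not something taken from pandas or json: normalize and to_plain_scalar are hypothetical helper names, and the code assumes the installed NumPy registers its integer scalar types with numbers.Integral, which is worth verifying on a NumPy version as old as the one in question.

```python
import json
import numbers


def to_plain_scalar(x):
    """Turn any integer-like scalar into a plain Python int."""
    if isinstance(x, bool):
        return x  # bool is an Integral too; keep it as a bool
    if isinstance(x, numbers.Integral):
        return int(x)
    return x


def normalize(obj):
    """Return a copy of obj with integer-like keys and values as plain ints."""
    if isinstance(obj, dict):
        return dict((to_plain_scalar(k), normalize(v)) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return [normalize(v) for v in obj]  # tuples become lists, as in JSON
    return to_plain_scalar(obj)


print(json.dumps(normalize({'a': {0: 1}})))  # '{"a": {"0": 1}}'
```

You would then pass normalize(df.to_dict()) to the library instead of df.to_dict(). Floating-point numpy scalars are not handled here; a similar branch on numbers.Real would presumably cover them.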