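I have a nested dictionary that contains some numeric keys. I need to store this dictionary as JSON, but because those keys are numeric I assumed I could not store them as JSON directly. I wrote the code below to convert them, but it fails with an error saying the dictionary changed size (RuntimeError: dictionary changed size during iteration).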
def convert_to_str(dictionary):
    for key in dictionary:
        print(key)
        found = False
        non_str_keys = []
        if not isinstance(key, str):
            print(key, 'is not a string')
            dictionary[str(key)] = dictionary[key]
            non_str_keys.append(key)
        if isinstance(dictionary[str(key)], dict):
            dictionary[str(key)] = convert_to_str(dictionary[str(key)])
            non_str_keys.append(key)
    if non_str_keys:
        for each_non_str_key in non_str_keys:
            del dictionary[each_non_str_key]
    return dictionary
How can I avoid this? The dictionary I have is:
a = {
"age": {
1: 25.0,
2: 50.25,
3: 50.0,
4: 75.0,
5: 14.580906789680968,
6: [
25.0,
30.0,
34.800000000000004,
40.0,
46.60000000000001,
50.0,
56.0,
61.0,
65.0,
69.0,
75.0
],
"quartiles": [
38.0,
64.0
],
"decile_event_rate": [
0.8125,
0.7142857142857143,
0.65625,
0.42857142857142855,
0.45161290322580644,
0.4857142857142857,
0.5925925925925926,
0.5,
0.5142857142857142,
0.375
]
},
"income": {
"min": 10198.0,
"mean": 55621.78666666667,
"median": 52880.0,
"max": 99783.0,
"std": 24846.911384024643,
"deciles": [
10198.0,
25269.4,
31325.800000000003,
37857.0,
43721.8,
52880.0,
63996.0,
72526.9,
82388.2,
89765.90000000001,
99783.0
],
"quartiles": [
35088.5,
78687.25
],
"decile_event_rate": [
0.6666666666666666,
0.6,
0.5333333333333333,
0.5666666666666667,
0.5,
0.6451612903225806,
0.4827586206896552,
0.5,
0.5666666666666667,
0.5
]
},
"edu_yrs": {
"min": 0.0,
"mean": 12.73,
"median": 13.0,
"max": 25.0,
"std": 7.86234623342895,
"deciles": [
0.0,
2.0,
4.0,
7.0,
9.600000000000009,
13.0,
16.0,
18.0,
21.200000000000017,
23.0,
25.0
],
"quartiles": [
6.0,
20.0
],
"decile_event_rate": [
0.5384615384615384,
0.6521739130434783,
0.5151515151515151,
0.48,
0.6111111111111112,
0.5,
0.5,
0.6071428571428571,
0.5151515151515151,
0.6666666666666666
]
},
"yrs_since_exercise": {
"min": 0.0,
"mean": 18.566666666666666,
"median": 16.0,
"max": 60.0,
"std": 14.417527732194037,
"deciles": [
0.0,
3.0,
5.0,
8.0,
12.0,
16.0,
20.0,
25.0,
31.0,
41.0,
60.0
],
"quartiles": [
6.0,
27.0
],
"decile_event_rate": [
1.0,
1.0,
1.0,
0.9629629629629629,
0.75,
0.4857142857142857,
0.15384615384615385,
0.06666666666666667,
0.0,
0.0
]
},
"security_label": {
"event_rate": {
"A": {
"1.0": 0.6,
"0.0": 0.4
},
"B": {
"1.0": 0.57,
"0.0": 0.43
},
"C": {
"0.0": 0.5,
"1.0": 0.5
}
},
"freq": {
"A": 100,
"B": 100,
"C": 100
},
"var_type": "categorical"
}
}
Edit
json.dump(self.entity_data, open(path, 'w'), indent=2, cls=CustomEncoder)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 179, in dump
for chunk in iterable:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 430, in _iterencode
yield from _iterencode_dict(o, _current_indent_level)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
yield from chunks
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
yield from chunks
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
yield from chunks
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 376, in _iterencode_dict
raise TypeError("key " + repr(key) + " is not a string")
TypeError: key 0 is not a string
Edit 2
I was getting serialization errors with numpy objects, so I started using this encoder to convert them to plain Python objects.
import json
import numpy as np

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        # Convert numpy scalars and arrays to plain Python objects
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super(CustomEncoder, self).default(obj)
I have been passing cls=CustomEncoder whenever I call json.dump. This is the command I used:
json.dump(self.entity_data, open(path, 'w'), indent=2, cls=CustomEncoder)
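Answer 0 (score: 4)
You need to convert all keys recursively; generating a new dictionary with a dict comprehension is much easier than altering the keys in place. You cannot add string keys to, and delete non-string keys from, a dictionary while you are iterating over it: doing so mutates the hash table, which can easily change the order in which the keys are listed, so Python does not allow it.
Don't forget to handle lists as well; they can contain further dictionaries too.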
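To make the restriction above concrete, here is a minimal sketch of the failure and of the comprehension-based alternative. The names add_string_keys_in_place and with_string_keys are hypothetical, chosen only for this illustration, and the sketch handles a single level without descending into nested containers.

```python
def add_string_keys_in_place(d):
    # Adds keys while the dict is being iterated, as the question's code does.
    for key in d:
        if not isinstance(key, str):
            d[str(key)] = d[key]


def with_string_keys(d):
    # Builds a new dict instead; the input is never modified.
    return {str(k): v for k, v in d.items()}


data = {1: "a", 2: "b", "c": "d"}

try:
    add_string_keys_in_place(dict(data))  # work on a copy so data stays intact
    caught = None
except RuntimeError as exc:
    caught = exc

converted = with_string_keys(data)
print(type(caught).__name__)
print(converted)
```

The in-place version fails as soon as the iterator notices that the dictionary has grown, while the comprehension never touches the dictionary it reads from.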
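Whenever I need to transform a nested structure like this, I use the @functools.singledispatch decorator to split the handling of the different container types into separate functions: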
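from functools import singledispatch

@singledispatch
def keys_to_strings(ob):
    # Fallback: anything that is not a dict or list is returned unchanged
    return ob

@keys_to_strings.register(dict)
def _handle_dict(ob):
    # Stringify every key and recurse into the values
    return {str(k): keys_to_strings(v) for k, v in ob.items()}

@keys_to_strings.register(list)
def _handle_list(ob):
    # Lists may contain further dictionaries, so recurse into each item
    return [keys_to_strings(v) for v in ob]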
Then JSON-encode the result of keys_to_strings():
json.dumps(keys_to_strings(a))
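As a quick end-to-end check of this approach, the following sketch repeats the dispatcher in runnable form and applies it to a structure whose keys json cannot encode on its own. The decimal.Decimal keys are an assumption, used here as a stand-in for any non-standard key type, and the sample data is made up.

```python
import json
from decimal import Decimal
from functools import singledispatch


@singledispatch
def keys_to_strings(ob):
    # Fallback for scalars and any other type: return unchanged.
    return ob


@keys_to_strings.register(dict)
def _handle_dict(ob):
    return {str(k): keys_to_strings(v) for k, v in ob.items()}


@keys_to_strings.register(list)
def _handle_list(ob):
    return [keys_to_strings(v) for v in ob]


# Made-up sample: Decimal keys at two levels, one of them inside a list.
sample = {Decimal("1.5"): [{Decimal("2"): "x"}], "plain": {3: None}}
encoded = json.dumps(keys_to_strings(sample))
print(encoded)
```

Because the list handler recurses too, the dictionary nested inside the list has its key converted as well.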
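Not that any of this is needed for your example. json.dumps() natively accepts integer keys and converts them to strings. Your example input works without conversion: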
json.dumps(a)
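Note: Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.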
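The round-trip caveat in the note can be seen with a small example. This is only a sketch with made-up data; converting the keys back with int() is an assumption that holds only when you know every key was originally an integer.

```python
import json

original = {1: "one", 2: "two"}
restored = json.loads(json.dumps(original))

# The keys come back as strings, so the two dictionaries differ.
round_trip_equal = restored == original

# Assumption: every key was an int, so int() can undo the coercion.
keys_back = {int(k): v for k, v in restored.items()}
print(round_trip_equal, keys_back == original)
```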
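This only applies to types that JSON can otherwise handle, so None, booleans, float and int objects. For anything else you will still get your exception. You probably have an object whose representation is 0 but that is not a Python int 0: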
>>> json.dumps({0: 'works'})
'{"0": "works"}'
>>> import numpy
>>> numpy.int32()
0
>>> json.dumps({numpy.int32(): 'fails'})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/__init__.py", line 231, in dumps
return _default_encoder.encode(obj)
File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
TypeError: keys must be a string
I picked a numpy integer type because it is a commonly confused integer value that is not a Python int.
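For reference, the sketch below shows the key types the standard encoder coerces by itself, and one that it rejects. decimal.Decimal is used purely as an assumed stand-in for a non-standard number type such as a numpy integer, so that the example runs without third-party packages.

```python
import json
from decimal import Decimal

# None, bool, float and int keys are all coerced to strings.
accepted = json.dumps({None: "n", True: "t", 1.5: "f", 2: "i"})
print(accepted)

# A key of any other type raises TypeError, even if it looks numeric.
try:
    json.dumps({Decimal("0"): "fails"})
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```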
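The custom encoder you added to your post is not used for keys; it only applies to the values in dictionaries. So if you have non-standard objects as keys, you do still need the recursive solution above.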
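A minimal sketch of that distinction, again using decimal.Decimal as an assumed stand-in for numpy scalars: the hypothetical DecimalEncoder below is consulted for values, but never for keys.

```python
import json
from decimal import Decimal


class DecimalEncoder(json.JSONEncoder):
    # Hypothetical encoder, analogous in spirit to the question's CustomEncoder.
    def default(self, obj):
        if isinstance(obj, Decimal):
            return float(obj)
        return super().default(obj)


# A Decimal value is passed to default() and encoded as a float.
as_value = json.dumps({"x": Decimal("0.5")}, cls=DecimalEncoder)

# A Decimal key never reaches default(), so encoding fails.
try:
    json.dumps({Decimal("0.5"): "x"}, cls=DecimalEncoder)
    key_error = None
except TypeError as exc:
    key_error = exc

# Converting the keys first makes the same data encodable.
fixed = json.dumps({str(k): v for k, v in {Decimal("0.5"): "x"}.items()},
                   cls=DecimalEncoder)
print(as_value, type(key_error).__name__, fixed)
```

In practice the two techniques combine: convert the keys first, then let the encoder class deal with the remaining non-standard values.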
Answer 1 (score: 0)
json.dumps automatically converts integer keys to string keys:
>>> import json
>>> a = {'income': {'deciles': [10198.0, 25269.4, 31325.800000000003, 37857.0, 43721.8, 52880.0, 63996.0, 72526.9, 82388.2, 89765.90000000001, 99783.0], 'min': 10198.0, 'std': 24846.911384024643, 'quartiles': [35088.5, 78687.25], 'median': 52880.0, 'decile_event_rate': [0.6666666666666666, 0.6, 0.5333333333333333, 0.5666666666666667, 0.5, 0.6451612903225806, 0.4827586206896552, 0.5, 0.5666666666666667, 0.5], 'max': 99783.0, 'mean': 55621.78666666667}, 'age': {1: 25.0, 2: 50.25, 3: 50.0, 4: 75.0, 5: 14.580906789680968, 6: [25.0, 30.0, 34.800000000000004, 40.0, 46.60000000000001, 50.0, 56.0, 61.0, 65.0, 69.0, 75.0], 'quartiles': [38.0, 64.0], 'decile_event_rate': [0.8125, 0.7142857142857143, 0.65625, 0.42857142857142855, 0.45161290322580644, 0.4857142857142857, 0.5925925925925926, 0.5, 0.5142857142857142, 0.375]}, 'edu_yrs': {'deciles': [0.0, 2.0, 4.0, 7.0, 9.600000000000009, 13.0, 16.0, 18.0, 21.200000000000017, 23.0, 25.0], 'min': 0.0, 'std': 7.86234623342895, 'quartiles': [6.0, 20.0], 'median': 13.0, 'decile_event_rate': [0.5384615384615384, 0.6521739130434783, 0.5151515151515151, 0.48, 0.6111111111111112, 0.5, 0.5, 0.6071428571428571, 0.5151515151515151, 0.6666666666666666], 'max': 25.0, 'mean': 12.73}, 'security_label': {'var_type': 'categorical', 'freq': {'C': 100, 'A': 100, 'B': 100}, 'event_rate': {'C': {'0.0': 0.5, '1.0': 0.5}, 'A': {'0.0': 0.4, '1.0': 0.6}, 'B': {'0.0': 0.43, '1.0': 0.57}}}, 'yrs_since_exercise': {'deciles': [0.0, 3.0, 5.0, 8.0, 12.0, 16.0, 20.0, 25.0, 31.0, 41.0, 60.0], 'min': 0.0, 'std': 14.417527732194037, 'quartiles': [6.0, 27.0], 'median': 16.0, 'decile_event_rate': [1.0, 1.0, 1.0, 0.9629629629629629, 0.75, 0.4857142857142857, 0.15384615384615385, 0.06666666666666667, 0.0, 0.0], 'max': 60.0, 'mean': 18.566666666666666}}
>>> new = json.dumps(a) # as a json string
>>> new
'{"income": {"deciles": [10198.0, 25269.4, 31325.800000000003, 37857.0, 43721.8, 52880.0, 63996.0, 72526.9, 82388.2, 89765.90000000001, 99783.0], "min": 10198.0, "std": 24846.911384024643, "quartiles": [35088.5, 78687.25], "mean": 55621.78666666667, "decile_event_rate": [0.6666666666666666, 0.6, 0.5333333333333333, 0.5666666666666667, 0.5, 0.6451612903225806, 0.4827586206896552, 0.5, 0.5666666666666667, 0.5], "max": 99783.0, "median": 52880.0}, "age": {"1": 25.0, "2": 50.25, "3": 50.0, "4": 75.0, "5": 14.580906789680968, "6": [25.0, 30.0, 34.800000000000004, 40.0, 46.60000000000001, 50.0, 56.0, 61.0, 65.0, 69.0, 75.0], "quartiles": [38.0, 64.0], "decile_event_rate": [0.8125, 0.7142857142857143, 0.65625, 0.42857142857142855, 0.45161290322580644, 0.4857142857142857, 0.5925925925925926, 0.5, 0.5142857142857142, 0.375]}, "edu_yrs": {"deciles": [0.0, 2.0, 4.0, 7.0, 9.600000000000009, 13.0, 16.0, 18.0, 21.200000000000017, 23.0, 25.0], "min": 0.0, "std": 7.86234623342895, "quartiles": [6.0, 20.0], "mean": 12.73, "decile_event_rate": [0.5384615384615384, 0.6521739130434783, 0.5151515151515151, 0.48, 0.6111111111111112, 0.5, 0.5, 0.6071428571428571, 0.5151515151515151, 0.6666666666666666], "max": 25.0, "median": 13.0}, "security_label": {"var_type": "categorical", "freq": {"A": 100, "C": 100, "B": 100}, "event_rate": {"A": {"0.0": 0.4, "1.0": 0.6}, "C": {"0.0": 0.5, "1.0": 0.5}, "B": {"0.0": 0.43, "1.0": 0.57}}}, "yrs_since_exercise": {"deciles": [0.0, 3.0, 5.0, 8.0, 12.0, 16.0, 20.0, 25.0, 31.0, 41.0, 60.0], "min": 0.0, "std": 14.417527732194037, "quartiles": [6.0, 27.0], "mean": 18.566666666666666, "decile_event_rate": [1.0, 1.0, 1.0, 0.9629629629629629, 0.75, 0.4857142857142857, 0.15384615384615385, 0.06666666666666667, 0.0, 0.0], "max": 60.0, "median": 16.0}}'