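I have some code that uses a series of for loops to convert a dictionary from one nested format to another, so that I can easily export it to a CSV file. However, as my script loops over the input dict, it overwrites the output dict instead of appending the additional values, and I can't figure out why.
This is the format of the input dictionary: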
{'data': [{'title': 'Lifetime Likes by Country',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 344025, 'US': 886485}}]},
          {'title': 'Daily Country: People Talking About This',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 289, 'US': 829}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 262, 'US': 836}}]}]}
Here is my code:
input_dict = function_to_get_input_dict()
filtered_dict = {}
for metric in input_dict['data']:
    for day in metric['values']:
        parsed_date = parser.parse(day['end_time'])
        date_key = parsed_date.strftime('%m/%d/%Y')
        filtered_dict[date_key] = {}
        filtered_dict[date_key]['Total %s' % metric['title']] = 0
        for k, v in day['value'].iteritems():
            filtered_dict[date_key]['%s : %s' % (metric['title'], k)] = v
            filtered_dict[date_key]['Total %s' % metric['title']] += v
pprint(filtered_dict) #debug
Expected output dictionary format:
{date1:{metric_1_each_country_code:value, metric_1_all_country_total:value, metric_2_each_country_code:value, metric_2_all_country_total:value}, date2:{etc}}
However, the output dictionary I get has only one metric per date:
{date1:{metric_2_each_country_code:value, metric_2_all_country_total:value}, date2:{etc}}
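It seems to overwrite the metric key:value pairs every time, which I don't understand: with the ['%s : %s' % (metric['title'], k)] formula, the key for each metric should be unique, so they shouldn't be overwritten.
What am I missing?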
Answer 0 (score: 1)
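If you look at your code, in the second for loop you have filtered_dict[date_key] = {}. This resets the value of filtered_dict[date_key] to an empty dict every time that date is visited, rather than letting you add to it. Both metrics share the same dates, so the pass over the second metric wipes out what the first metric stored. Create the inner dict only when the date has not been seen yet:
from pprint import pprint
from dateutil import parser  # assumption: `parser` in the question is dateutil.parser

input_dict = function_to_get_input_dict()
filtered_dict = {}
for metric in input_dict['data']:
    total_key = 'Total %s' % metric['title']
    for day in metric['values']:
        parsed_date = parser.parse(day['end_time'])
        date_key = parsed_date.strftime('%m/%d/%Y')
        # setdefault creates the per-date dict only on the first visit,
        # so metrics stored on earlier passes are kept.
        date_entry = filtered_dict.setdefault(date_key, {})
        date_entry[total_key] = 0
        for k, v in day['value'].items():  # items() works on Python 2 and 3
            date_entry['%s : %s' % (metric['title'], k)] = v
            date_entry[total_key] += v
pprint(filtered_dict)  # debug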
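To check the point about the reset for yourself, here is a minimal self-contained sketch in Python 3 that runs against the sample data from the question. Several things in it are my assumptions, not part of the original code: the function names flatten_by_date and to_csv_text are made up for illustration, datetime.strptime stands in for parser.parse so that only the standard library is needed, and the CSV layout (one row per date, blank cells for missing countries) is just one plausible reading of "export to CSV".

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Sample data copied from the question.
input_dict = {'data': [
    {'title': 'Lifetime Likes by Country',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 344025, 'US': 886485}}]},
    {'title': 'Daily Country: People Talking About This',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 289, 'US': 829}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 262, 'US': 836}}]}]}


def flatten_by_date(data):
    """Return {date: {column: value}} with every metric merged under its date."""
    by_date = defaultdict(dict)  # an unseen date starts as an empty dict, once
    for metric in data['data']:
        title = metric['title']
        total_key = 'Total %s' % title
        for day in metric['values']:
            # %z parses the '+0000' offset; stand-in for dateutil's parser.parse
            parsed = datetime.strptime(day['end_time'], '%Y-%m-%dT%H:%M:%S%z')
            row = by_date[parsed.strftime('%m/%d/%Y')]
            row[total_key] = sum(day['value'].values())
            for country, count in day['value'].items():
                row['%s : %s' % (title, country)] = count
    return dict(by_date)


def to_csv_text(by_date):
    """Render the flattened dict as CSV text, one row per date."""
    # Union of all columns, because not every date has every country.
    columns = sorted({col for row in by_date.values() for col in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=['date'] + columns, restval='')
    writer.writeheader()
    # Plain string sort; fine for this sample, not chronological across years.
    for date_key in sorted(by_date):
        writer.writerow(dict(by_date[date_key], date=date_key))
    return buffer.getvalue()


filtered_dict = flatten_by_date(input_dict)
csv_text = to_csv_text(filtered_dict)
```

With the reset gone, each date ends up holding the keys of both metrics, and restval='' leaves the cell empty for a country that is missing on a given day (PK on the second date in this sample).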
Answer 1 (score: 0)
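I think one problem is that your data had a syntax error in it, which made the structure almost impossible to see. I have corrected it and pretty-printed the whole thing to help you see its structure better. This is not a complete answer, but it helps with the problem: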
import pprint; pprint.pprint({"data": [{ "values": [{ "value": { "US": 886367, "IN": 343818, "PK": 212632}, "end_time": "2013-11-10T08:00:00+0000"},{"value": { "US": 886485, "IN": 344025}, "end_time": "2013-11-11T08:00:00+0000"}], "title": "Lifetime Likes by Country"}, {"values": [{"value": { "US": 829, "IN": 289}, "end_time": "2013-11-10T08:00:00+0000"},{"value": {"US": 836,"IN": 262}, "end_time": "2013-11-11T08:00:00+0000"}], "title": "Daily Country: People Talking About This"}]})
{'data': [{'title': 'Lifetime Likes by Country',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 344025, 'US': 886485}}]},
          {'title': 'Daily Country: People Talking About This',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 289, 'US': 829}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 262, 'US': 836}}]}]}
Now that I can see the nature of the data, perhaps a data structure like this would suit your needs better:
import pprint; pprint.pprint({'Daily Country: People Talking About This': {'2013-11-11T08:00:00+0000': {'US': 836, 'IN': 262}, '2013-11-10T08:00:00+0000': {'US': 829, 'IN': 289}}, 'Lifetime Likes by Country': {'2013-11-11T08:00:00+0000': {'US': 886485, 'IN': 344025}, '2013-11-10T08:00:00+0000': {'PK': 212632, 'US': 886367, 'IN': 343818}}})
This gives you:
{'Daily Country: People Talking About This': {'2013-11-10T08:00:00+0000': {'IN': 289,
                                                                           'US': 829},
                                              '2013-11-11T08:00:00+0000': {'IN': 262,
                                                                           'US': 836}},
 'Lifetime Likes by Country': {'2013-11-10T08:00:00+0000': {'IN': 343818,
                                                            'PK': 212632,
                                                            'US': 886367},
                               '2013-11-11T08:00:00+0000': {'IN': 344025,
                                                            'US': 886485}}}
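The structure above was typed out by hand. If it does suit your needs, it can also be derived from the original input rather than retyped. The sketch below is only one way to do that; the function name group_by_metric is mine, and it assumes every entry in 'data' has a 'title' and a 'values' list shaped like the sample.

```python
# Sample data copied from the question.
input_dict = {'data': [
    {'title': 'Lifetime Likes by Country',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 344025, 'US': 886485}}]},
    {'title': 'Daily Country: People Talking About This',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 289, 'US': 829}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 262, 'US': 836}}]}]}


def group_by_metric(data):
    """Return {metric title: {end_time: {country: count}}}."""
    return {
        metric['title']: {
            # dict(...) copies the counts so the result does not alias the input
            day['end_time']: dict(day['value'])
            for day in metric['values']
        }
        for metric in data['data']
    }


by_metric = group_by_metric(input_dict)
```

Here by_metric['Lifetime Likes by Country']['2013-11-10T08:00:00+0000'] is the per-country dict for that day, so a lookup by metric and timestamp no longer needs any loops.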