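I need help with a problem, but I can't use pandas or numpy. I have two lists of dictionaries, list1 and list2. I need to sort list2 by 'post_code' and group it by 'code' before joining list1 and list2 on two different keys that hold the same values. The key 'practice' in list1 is equivalent to the key 'code' in the sorted list2. I need to join list1 and list2 using the equivalent keys 'practice' and 'code'.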
list1=
[{'bnf_code': '0101010G0AAABAB',
'items': 2,
'practice': 'N81013',
'bnf_name': 'Co-Magaldrox_Susp 195mg/220mg/5ml S/F',
'nic': 5.98,
'act_cost': 5.56,
'quantity': 1000},
{'bnf_code': '0101021B0AAAHAH',
'items': 1,
'practice': 'A81001',
'bnf_name': 'Alginate_Raft-Forming Oral Susp S/F',
'nic': 1.95,
'act_cost': 1.82,
'quantity': 500},
{'bnf_code': '0101021B0AAALAL',
'items': 12,
'practice': 'A81002',
'bnf_name': 'Sod Algin/Pot Bicarb_Susp S/F',
'nic': 64.51,
'act_cost': 59.95,
'quantity': 6300},
{'bnf_code': '0101021B0AAAPAP',
'items': 3,
'practice': 'A81004',
'bnf_name': 'Sod Alginate/Pot Bicarb_Tab Chble 500mg',
'nic': 9.21,
'act_cost': 8.55,
'quantity': 180},
{'bnf_code': '0101021B0BEADAJ',
'items': 6,
'practice': 'A81003',
'bnf_name': 'Gaviscon Infant_Sach 2g (Dual Pack) S/F',
'nic': 28.92,
'act_cost': 26.84,
'quantity': 90}]
list2=
[{'code': 'A81001',
'name': 'THE DENSHAM SURGERY',
'addr_1': 'THE HEALTH CENTRE',
'addr_2': 'LAWSON STREET',
'borough': 'STOCKTON ON TEES',
'village': 'CLEVELAND',
'post_code': 'TS18 1HU'},
{'code': 'A81002',
'name': 'QUEENS PARK MEDICAL CENTRE',
'addr_1': 'QUEENS PARK MEDICAL CTR',
'addr_2': 'FARRER STREET',
'borough': 'STOCKTON ON TEES',
'village': 'CLEVELAND',
'post_code': 'TS18 2AW'},
{'code': 'A81003',
'name': 'VICTORIA MEDICAL PRACTICE',
'addr_1': 'THE HEALTH CENTRE',
'addr_2': 'VICTORIA ROAD',
'borough': 'HARTLEPOOL',
'village': 'CLEVELAND',
'post_code': 'TS26 8DB'},
{'code': 'A81004',
'name': 'WOODLANDS ROAD SURGERY',
'addr_1': '6 WOODLANDS ROAD',
'addr_2': None,
'borough': 'MIDDLESBROUGH',
'village': 'CLEVELAND',
'post_code': 'TS1 3BE'},
{'code': 'N81013',
'name': 'SPRINGWOOD SURGERY',
'addr_1': 'SPRINGWOOD SURGERY',
'addr_2': 'RECTORY LANE',
'borough': 'GUISBOROUGH',
'village': None,
'post_code': 'TS14 7DJ'}]
I've been able to sort list2 by post_code and group it by code, but I'm lost on how to join list1 and list2. This is the code I have so far for sorting and grouping.
import itertools
from operator import itemgetter

sorted_post_code = sorted(list2, key=itemgetter('post_code'))
# groupby only groups consecutive items that share the same key
for key, group in itertools.groupby(sorted_post_code, key=lambda x: x['code']):
    # print(key)
    print(list(group))
The expected output is:
joined_list=
[{'bnf_code': '0101010G0AAABAB',
'items': 2,
'practice': 'N81013',
'bnf_name': 'Co-Magaldrox_Susp 195mg/220mg/5ml S/F',
'nic': 5.98,
'act_cost': 5.56,
'quantity': 1000,
'code': 'N81013',
'name': 'SPRINGWOOD SURGERY',
'addr_1': 'SPRINGWOOD SURGERY',
'addr_2': 'RECTORY LANE',
'borough': 'GUISBOROUGH',
'village': None,
'post_code': 'TS14 7DJ'},
{'bnf_code': '0101021B0AAAHAH',
'items': 1,
'practice': 'A81001',
'bnf_name': 'Alginate_Raft-Forming Oral Susp S/F',
'nic': 1.95,
'act_cost': 1.82,
'quantity': 500,
'code': 'A81001',
'name': 'THE DENSHAM SURGERY',
'addr_1': 'THE HEALTH CENTRE',
'addr_2': 'LAWSON STREET',
'borough': 'STOCKTON ON TEES',
'village': 'CLEVELAND',
'post_code': 'TS18 1HU'},
{'bnf_code': '0101021B0AAALAL',
'items': 12,
'practice': 'A81002',
'bnf_name': 'Sod Algin/Pot Bicarb_Susp S/F',
'nic': 64.51,
'act_cost': 59.95,
'quantity': 6300,
'code': 'A81002',
'name': 'QUEENS PARK MEDICAL CENTRE',
'addr_1': 'QUEENS PARK MEDICAL CTR',
'addr_2': 'FARRER STREET',
'borough': 'STOCKTON ON TEES',
'village': 'CLEVELAND',
'post_code': 'TS18 2AW'},
{'bnf_code': '0101021B0AAAPAP',
'items': 3,
'practice': 'A81004',
'bnf_name': 'Sod Alginate/Pot Bicarb_Tab Chble 500mg',
'nic': 9.21,
'act_cost': 8.55,
'quantity': 180,
'code': 'A81004',
'name': 'WOODLANDS ROAD SURGERY',
'addr_1': '6 WOODLANDS ROAD',
'addr_2': None,
'borough': 'MIDDLESBROUGH',
'village': 'CLEVELAND',
'post_code': 'TS1 3BE'},
{'bnf_code': '0101021B0BEADAJ',
'items': 6,
'practice': 'A81003',
'bnf_name': 'Gaviscon Infant_Sach 2g (Dual Pack) S/F',
'nic': 28.92,
'act_cost': 26.84,
'quantity': 90,
'code': 'A81003',
'name': 'VICTORIA MEDICAL PRACTICE',
'addr_1': 'THE HEALTH CENTRE',
'addr_2': 'VICTORIA ROAD',
'borough': 'HARTLEPOOL',
'village': 'CLEVELAND',
'post_code': 'TS26 8DB'}]
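Answer 0 (score: 1)
As I understand it, you want each dictionary in list1 to include all entries of the dictionary in list2 whose 'code' value matches its 'practice' value.
If so, you can simply update a dictionary with all the entries from another dictionary: missing key: value pairs are added, and existing keys have their values updated.
So I ended up with a double for loop, which I ran before doing any sorting. You may want to adjust it to your needs.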
for entry2 in list2:
    for entry1 in list1:
        if entry2['code'] == entry1['practice']:
            entry1.update(entry2)  # merges entry2 into entry1 in place
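The nested loop compares every entry of list1 with every entry of list2. As an alternative, here is a minimal sketch that first indexes list2 by 'code' and then does one lookup per list1 entry. The names `join_on` and `by_code` and the shortened sample data are made up for illustration. It builds new dictionaries instead of updating list1 in place, and it assumes each code occurs only once in list2.

```python
def join_on(left, right, left_key, right_key):
    """Return new dicts merging each left row with its matching right row."""
    # Index the right-hand list once; assumes right_key values are unique.
    by_code = {row[right_key]: row for row in right}
    joined = []
    for row in left:
        match = by_code.get(row[left_key], {})  # no match -> nothing added
        joined.append({**row, **match})
    return joined


# Shortened, hypothetical stand-ins for list1 and list2.
prescriptions = [
    {'practice': 'N81013', 'items': 2},
    {'practice': 'A81001', 'items': 1},
]
practices = [
    {'code': 'A81001', 'post_code': 'TS18 1HU'},
    {'code': 'N81013', 'post_code': 'TS14 7DJ'},
]

joined_list = join_on(prescriptions, practices, 'practice', 'code')
print(joined_list)
```

Because the result follows the order of the left-hand list, it can be sorted afterwards by whichever key is needed.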
A very long explanation of the different ways to merge dictionaries can be found here: https://stackoverflow.com/a/26853961/6218902
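As a small illustration of the update behaviour described above, and of one non-mutating alternative, the sketch below uses two made-up dictionaries `a` and `b`. `dict.update` changes `a` in place, whereas `{**a, **b}` (Python 3.5+) builds a new dictionary and leaves both inputs untouched.

```python
a = {'practice': 'A81001', 'items': 1}
b = {'code': 'A81001', 'items': 99}   # 'items' clashes on purpose

merged = {**a, **b}      # new dict; on a clash the right-hand value wins
print(merged)            # {'practice': 'A81001', 'items': 99, 'code': 'A81001'}
print(a)                 # unchanged: {'practice': 'A81001', 'items': 1}

a.update(b)              # in place: missing keys are added, existing ones overwritten
print(a == merged)       # True
```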
Answer 1 (score: 1)
A defaultdict can work quite well for grouping operations. You can use dictionaries to update the grouped elements:
from collections import defaultdict

groups = defaultdict(dict)

# to show this explicitly you can start with two loops
# not the most efficient, but it shows the process
for item in list1:
    k = item['practice']
    groups[k].update(item)

for item in list2:
    k = item['code']
    groups[k].update(item)

# groups.values() will hold your "joined"
# dictionaries
groups
{
"N81013": {
"bnf_code": "0101010G0AAABAB",
"items": 2,
"practice": "N81013",
"bnf_name": "Co-Magaldrox_Susp 195mg/220mg/5ml S/F",
"nic": 5.98,
"act_cost": 5.56,
"quantity": 1000,
"code": "N81013",
"name": "SPRINGWOOD SURGERY",
"addr_1": "SPRINGWOOD SURGERY",
"addr_2": "RECTORY LANE",
"borough": "GUISBOROUGH",
"village": null,
"post_code": "TS14 7DJ"
},
"A81001": {
"bnf_code": "0101021B0AAAHAH",
"items": 1,
"practice": "A81001",
"bnf_name": "Alginate_Raft-Forming Oral Susp S/F",
"nic": 1.95,
"act_cost": 1.82,
"quantity": 500,
"code": "A81001",
"name": "THE DENSHAM SURGERY",
"addr_1": "THE HEALTH CENTRE",
"addr_2": "LAWSON STREET",
"borough": "STOCKTON ON TEES",
"village": "CLEVELAND",
"post_code": "TS18 1HU"
},
"A81002": {
"bnf_code": "0101021B0AAALAL",
"items": 12,
"practice": "A81002",
"bnf_name": "Sod Algin/Pot Bicarb_Susp S/F",
"nic": 64.51,
"act_cost": 59.95,
"quantity": 6300,
"code": "A81002",
"name": "QUEENS PARK MEDICAL CENTRE",
"addr_1": "QUEENS PARK MEDICAL CTR",
"addr_2": "FARRER STREET",
"borough": "STOCKTON ON TEES",
"village": "CLEVELAND",
"post_code": "TS18 2AW"
},
"A81004": {
"bnf_code": "0101021B0AAAPAP",
"items": 3,
"practice": "A81004",
"bnf_name": "Sod Alginate/Pot Bicarb_Tab Chble 500mg",
"nic": 9.21,
"act_cost": 8.55,
"quantity": 180,
"code": "A81004",
"name": "WOODLANDS ROAD SURGERY",
"addr_1": "6 WOODLANDS ROAD",
"addr_2": null,
"borough": "MIDDLESBROUGH",
"village": "CLEVELAND",
"post_code": "TS1 3BE"
},
"A81003": {
"bnf_code": "0101021B0BEADAJ",
"items": 6,
"practice": "A81003",
"bnf_name": "Gaviscon Infant_Sach 2g (Dual Pack) S/F",
"nic": 28.92,
"act_cost": 26.84,
"quantity": 90,
"code": "A81003",
"name": "VICTORIA MEDICAL PRACTICE",
"addr_1": "THE HEALTH CENTRE",
"addr_2": "VICTORIA ROAD",
"borough": "HARTLEPOOL",
"village": "CLEVELAND",
"post_code": "TS26 8DB"
}
}
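In general, dictionaries are well suited to grouping operations because their keys are unique. A more optimized approach might be to zip the two lists together, since you will be updating either way: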
from itertools import zip_longest
from collections import defaultdict

groups = defaultdict(dict)

def group_item(a, b):
    # return the join key of each item, or None for zip_longest padding
    a_key, b_key = a['practice'] if a else None, b['code'] if b else None
    return a_key, b_key

for a, b in zip_longest(list1, list2):
    ak, bk = group_item(a, b)
    if ak:
        groups[ak].update(a)
    if bk:
        groups[bk].update(b)

# sort list of groups.values() now
list(groups.values())
[{'bnf_code': '0101010G0AAABAB', 'items': 2, 'practice': 'N81013', 'bnf_name': 'Co-Magaldrox_Susp 195mg/220mg/5ml S/F', 'nic': 5.98, 'act_cost': 5.56, 'quantity': 1000, 'code': 'N81013', 'name': 'SPRINGWOOD SURGERY', 'addr_1': 'SPRINGWOOD SURGERY', 'addr_2': 'RECTORY LANE', 'borough': 'GUISBOROUGH', 'village': None, 'post_code': 'TS14 7DJ'}, {'code': 'A81001', 'name': 'THE DENSHAM SURGERY', 'addr_1': 'THE HEALTH CENTRE', 'addr_2': 'LAWSON STREET', 'borough': 'STOCKTON ON TEES', 'village': 'CLEVELAND', 'post_code': 'TS18 1HU', 'bnf_code': '0101021B0AAAHAH', 'items': 1, 'practice': 'A81001', 'bnf_name': 'Alginate_Raft-Forming Oral Susp S/F', 'nic': 1.95, 'act_cost': 1.82, 'quantity': 500}, {'code': 'A81002', 'name': 'QUEENS PARK MEDICAL CENTRE', 'addr_1': 'QUEENS PARK MEDICAL CTR', 'addr_2': 'FARRER STREET', 'borough': 'STOCKTON ON TEES', 'village': 'CLEVELAND', 'post_code': 'TS18 2AW', 'bnf_code': '0101021B0AAALAL', 'items': 12, 'practice': 'A81002', 'bnf_name': 'Sod Algin/Pot Bicarb_Susp S/F', 'nic': 64.51, 'act_cost': 59.95, 'quantity': 6300}, {'code': 'A81003', 'name': 'VICTORIA MEDICAL PRACTICE', 'addr_1': 'THE HEALTH CENTRE', 'addr_2': 'VICTORIA ROAD', 'borough': 'HARTLEPOOL', 'village': 'CLEVELAND', 'post_code': 'TS26 8DB', 'bnf_code': '0101021B0BEADAJ', 'items': 6, 'practice': 'A81003', 'bnf_name': 'Gaviscon Infant_Sach 2g (Dual Pack) S/F', 'nic': 28.92, 'act_cost': 26.84, 'quantity': 90}, {'bnf_code': '0101021B0AAAPAP', 'items': 3, 'practice': 'A81004', 'bnf_name': 'Sod Alginate/Pot Bicarb_Tab Chble 500mg', 'nic': 9.21, 'act_cost': 8.55, 'quantity': 180, 'code': 'A81004', 'name': 'WOODLANDS ROAD SURGERY', 'addr_1': '6 WOODLANDS ROAD', 'addr_2': None, 'borough': 'MIDDLESBROUGH', 'village': 'CLEVELAND', 'post_code': 'TS1 3BE'}]
I use zip_longest here so that, if list1 and list2 are of unequal length, the loop is not cut short by the size difference. To sort by post code, do the same as before:
from operator import itemgetter
x = sorted(groups.values(), key=itemgetter('post_code'))
However, this assumes the key is present. For a more general approach, it is better to use a lambda with get and a default return value:
x = sorted(groups.values(), key=lambda x: x.get('post_code', ' '))
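One thing to watch, offered as a hedged aside: `get` only falls back to the default when the key is missing. If the key is present but holds `None` (as `addr_2` and `village` do in the sample data), Python 3 would raise a `TypeError` when comparing `None` with a string. A minimal sketch, using made-up rows, that treats both cases as an empty string:

```python
rows = [
    {'code': 'A81001', 'post_code': 'TS18 1HU'},
    {'code': 'X00001', 'post_code': None},   # key present, value None
    {'code': 'X00002'},                      # key missing
    {'code': 'A81004', 'post_code': 'TS1 3BE'},
]

# `or ''` maps both a missing key and a None value to an empty string,
# so those rows sort first instead of raising TypeError.
ordered = sorted(rows, key=lambda d: d.get('post_code') or '')
print([d['code'] for d in ordered])
```

Whether rows without a post code should sort first, sort last, or be dropped depends on the use case; this sketch only shows one way to avoid the comparison error.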