Filter two lists of dictionaries by two values and pair the matching items

Time: 2018-04-29 03:02:38

Tags: python sorting dictionary filter

I have two lists of dictionaries, say:

List_D1 = [{'Symbol':'GFX','Time':'9:36am', 'Change':-0.18, 'Volume':181800},
            {'Symbol':'AIG','Time':'9:36am', 'Change':-0.15, 'Volume': 195500},
            {'Symbol':'AXP','Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
            ]
List_D2 = [{'Symbol':'AA','Time':'7:36am', 'Change':-0.08, 'Volume':181800},
            {'Symbol':'AIG','Time':'9:36am', 'Change':0.99, 'Volume': 197500},
            {'Symbol':'GFX','Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
            ]

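I want to pick out the items from the two separate lists that have the same 'Symbol' and 'Time' values. In the example above, that should produce these pairs: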

Pair 1:

List_D1 : {'Symbol':'AIG','Time':'9:36am', 'Change':-0.15, 'Volume': 195500} 
List_D2 : {'Symbol':'AIG','Time':'9:36am', 'Change':0.99, 'Volume': 197500}

Pair 2:

List_D1 :{'Symbol':'GFX','Time':'9:36am', 'Change':-0.18, 'Volume':181800}
List_D2 :{'Symbol':'GFX','Time':'9:36am', 'Change':-0.46, 'Volume': 935000}

Right now I simply loop over every entry of each list of dictionaries. Is there a better, more efficient way to do this?

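I am considering sorting List_D1+List_D2 with Python's itemgetter as the key and then using groupby on the sorted list to group the items I want to pair. But if I do that, I can no longer tell which list each item came from.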

Here is my source code:

from operator import itemgetter
from itertools import groupby

ListsBoth = List_D1+List_D2

key1 = 'Symbol' 
key2 = 'Time'
grouper = itemgetter(key1,key2)
ListsBoth.sort(key=grouper)  # groupby only merges adjacent items, so sort by the same key first
for key, testItem in groupby(ListsBoth, key=grouper):
    # Here I can group all items with the same 'Symbol' AND 'Time' values together,
    # but I lose the original "List" info (which list each item in a group came from),
    # and I need that for my application.
    ...  # handle each item in testItem

3 answers:

Answer 0 (score: 0)

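You can convert each list of dicts into a dict keyed by the (Symbol, Time) tuple, and then do a simple lookup between the two to build the pairs you are looking for, e.g.: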

In []:
D1 = {(d['Symbol'], d['Time']): d for d in List_D1}
D2 = {(d['Symbol'], d['Time']): d for d in List_D2}
[(D1.get(k, None), D2.get(k, None)) for k in set(D1) | set(D2)]

Out[]:
[({'Change': -0.18, 'Symbol': 'GFX', 'Time': '9:36am', 'Volume': 181800},
  {'Change': -0.46, 'Symbol': 'GFX', 'Time': '9:36am', 'Volume': 935000}),
 ({'Change': -0.15, 'Symbol': 'AIG', 'Time': '9:36am', 'Volume': 195500},
  {'Change': 0.99, 'Symbol': 'AIG', 'Time': '9:36am', 'Volume': 197500}),
 ({'Change': -0.46, 'Symbol': 'AXP', 'Time': '9:36am', 'Volume': 935000}, None),
 (None, {'Change': -0.08, 'Symbol': 'AA', 'Time': '7:36am', 'Volume': 181800})]

You can eliminate any unmatched pairs by changing it to:
[(D1[k], D2[k]) for k in D1 if k in D2]

Now you can iterate over each pair and do whatever you need with it, e.g.:

In []:
results = [(D1[k], D2[k]) for k in D1 if k in D2]
for l1, l2 in results:
    print(l1, l2)

Out[]:
{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800} {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}
{'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500} {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500}
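
One caveat with the dict-comprehension approach: if a list contains more than one entry with the same Symbol and Time, only the last one survives in D1 or D2. In case that matters for your data, here is a minimal sketch that keeps every entry per key using collections.defaultdict. The helper name index_by and the variables idx1, idx2 and pairs are made up for illustration, and the sample data is copied from the question.

```python
from collections import defaultdict

List_D1 = [{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500},
           {'Symbol': 'AXP', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]
List_D2 = [{'Symbol': 'AA', 'Time': '7:36am', 'Change': -0.08, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500},
           {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]


def index_by(rows, *keys):
    """Hypothetical helper: map each tuple of key values to the list of rows having it."""
    index = defaultdict(list)
    for row in rows:
        index[tuple(row[k] for k in keys)].append(row)
    return index


idx1 = index_by(List_D1, 'Symbol', 'Time')
idx2 = index_by(List_D2, 'Symbol', 'Time')

# Every (List_D1 row, List_D2 row) combination that shares a key,
# in the order the keys first appear in List_D1.
pairs = [(d1, d2)
         for key, rows1 in idx1.items() if key in idx2
         for d1 in rows1
         for d2 in idx2[key]]

for d1, d2 in pairs:
    print(d1, d2)
```

Each list is still walked only once to build its index, so the cost stays roughly linear in the input size plus the number of pairs produced.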

Answer 1 (score: 0)

List_D1 = [{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500},
           {'Symbol': 'AXP', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000},
           ]
List_D2 = [{'Symbol': 'AA', 'Time': '7:36am', 'Change': -0.08, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500},
           {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000},
           ]

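# Build a "Symbol_Time" key for every row. In Python 3, map() returns an iterator,
# so wrap it in list() to make .index() available.
b = list(map(lambda x: x.get('Symbol') + '_' + x.get('Time'), List_D1))
c = list(map(lambda x: x.get('Symbol') + '_' + x.get('Time'), List_D2))
# Pair up the rows whose key appears in both lists.
e = map(lambda x: (List_D1[b.index(x)], List_D2[c.index(x)]), set(b) & set(c))
for i in e:
    print(i)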

Answer 2 (score: 0)

You can also use itertools.groupby and then keep only the groups that contain more than one item:

import itertools
List_D1 = [{'Symbol':'GFX','Time':'9:36am', 'Change':-0.18, 'Volume':181800},
        {'Symbol':'AIG','Time':'9:36am', 'Change':-0.15, 'Volume': 195500},
        {'Symbol':'AXP','Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
        ]
List_D2 = [{'Symbol':'AA','Time':'7:36am', 'Change':-0.08, 'Volume':181800},
        {'Symbol':'AIG','Time':'9:36am', 'Change':0.99, 'Volume': 197500},
        {'Symbol':'GFX','Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
        ]
d = [(a, list(b)) for a, b in itertools.groupby(sorted(List_D1+List_D2, key=lambda x:(x['Symbol'], x['Time'])), key=lambda x:(x['Symbol'], x['Time']))]
final_data = {a:b for a, b in d if len(b) > 1}

Output:

{('AIG', '9:36am'): [{'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500}, {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500}], ('GFX', '9:36am'): [{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800}, {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]}
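
The grouped result above no longer records which list each item came from, which is what the question asks for. One possible way to keep that information, sketched here under the assumption that a plain string label per list is enough, is to tag every item with its source before sorting and grouping. The labels 'List_D1' and 'List_D2' and the names tagged, by_source and matched are illustrative choices.

```python
from itertools import groupby
from operator import itemgetter

List_D1 = [{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500},
           {'Symbol': 'AXP', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]
List_D2 = [{'Symbol': 'AA', 'Time': '7:36am', 'Change': -0.08, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500},
           {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]

key = itemgetter('Symbol', 'Time')

# Tag each row with a label naming the list it came from (labels are arbitrary).
tagged = [('List_D1', d) for d in List_D1] + [('List_D2', d) for d in List_D2]
# groupby only merges adjacent items, so sort by the grouping key first.
tagged.sort(key=lambda t: key(t[1]))

matched = {}
for k, group in groupby(tagged, key=lambda t: key(t[1])):
    by_source = {}
    for source, row in group:
        by_source.setdefault(source, []).append(row)
    if len(by_source) == 2:  # keep only keys present in both lists
        matched[k] = by_source

for k, by_source in matched.items():
    print(k, by_source['List_D1'], by_source['List_D2'])
```

Checking len(by_source) instead of the group length also avoids treating two same-key items from a single list as a match.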