Extracting elements from a complex list of lists and dictionaries in Python

Time: 2017-10-01 16:10:36

Tags: python list dictionary nested-lists

I have a list made up of many lists of dictionaries, representing New York City subway trains (shown below).

I am trying to identify the entry associated with a specific stop_id. For example, if I search for 'D03N', I want to get back the whole entry associated with it. The list:

[[{'arrival': {'time': 1506873749L},
   'departure': {'time': 1506873749L},
   'schedule_relationship': 0,
   'stop_id': u'B20S'},
  {'arrival': {'time': 1506873854L},
   'departure': {'time': 1506873854L},
   'schedule_relationship': 0,
   'stop_id': u'B21S'},
  {'arrival': {'time': 1506873989L},
   'departure': {'time': 1506873989L},
   'schedule_relationship': 0,
   'stop_id': u'B22S'},
  {'arrival': {'time': 1506874184L},
   'departure': {'time': 1506874184L},
   'schedule_relationship': 0,
   'stop_id': u'B23S'},
  {'arrival': {'time': 1506874469L},
   'departure': {'time': 1506874469L},
   'schedule_relationship': 0,
   'stop_id': u'D43S'}],
 [{'arrival': {'time': 1506873814L},
   'departure': {'time': 1506873814L},
   'schedule_relationship': 0,
   'stop_id': u'D10N'},
  {'arrival': {'time': 1506873877L},
   'departure': {'time': 1506873877L},
   'schedule_relationship': 0,
   'stop_id': u'D09N'},
  {'arrival': {'time': 1506873997L},
   'departure': {'time': 1506873997L},
   'schedule_relationship': 0,
   'stop_id': u'D08N'},
  {'arrival': {'time': 1506874087L},
   'departure': {'time': 1506874087L},
   'schedule_relationship': 0,
   'stop_id': u'D07N'},
  {'arrival': {'time': 1506874177L},
   'departure': {'time': 1506874177L},
   'schedule_relationship': 0,
   'stop_id': u'D06N'},
  {'arrival': {'time': 1506874267L},
   'departure': {'time': 1506874267L},
   'schedule_relationship': 0,
   'stop_id': u'D05N'},
  {'arrival': {'time': 1506874357L},
   'departure': {'time': 1506874357L},
   'schedule_relationship': 0,
   'stop_id': u'D04N'},
  {'arrival': {'time': 1506874477L},
   'departure': {'time': 1506874477L},
   'schedule_relationship': 0,
   'stop_id': u'D03N'},
  {'arrival': {'time': 1506874627L},
   'departure': {'time': 1506874627L},
   'schedule_relationship': 0,
   'stop_id': u'D01N'}]]

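Unfortunately, whenever I try the suggestions from this answer: Python list of dictionaries search, I end up with a 'TypeError: list indices must be integers, not str' error message. I'm not sure whether this is because I'm implementing the solution incorrectly, or because the solution doesn't apply, since this list is more complex than the one in the original question.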
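For reference, the error is easy to reproduce on a small sample. This is only a sketch: it assumes Python 3 (where the message reads "integers or slices"), the `trips` data is a made-up miniature of the list above, and the failing line is a guess at what the attempt looked like, based on the generator pattern usually suggested for flat lists of dictionaries.

```python
# Made-up miniature of the real data: a list of lists of dicts.
trips = [[{"stop_id": "B20S"}], [{"stop_id": "D03N"}]]

# Guess at the failing attempt: each `item` is an inner list, not a dict,
# so item["stop_id"] indexes a list with a string.
try:
    match = next(item for item in trips if item["stop_id"] == "D03N")
    message = None
except TypeError as exc:
    message = str(exc)

# Looping one level deeper reaches the dictionaries themselves.
fixed = next(stop for trip in trips for stop in trip if stop["stop_id"] == "D03N")
```

The pattern from the linked answer expects a flat list of dictionaries; here there is one extra level of nesting to walk through first.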

Is there a way to extract a specific entry from this list?

4 answers:

Answer 0: (score: 2)

# l is your list of lists of dicts
[ i for i in sum(l,[]) if i['stop_id'] == 'D03N' ]

Or, more efficiently:

from itertools import chain
[ i for i in chain.from_iterable(l) if i['stop_id'] == 'D03N' ]
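As a self-contained illustration of the `chain.from_iterable` version, here is a minimal sketch. It assumes Python 3 (so the `L` suffixes and `u` prefixes are dropped), uses a trimmed sample instead of the full list, and the names `trips` and `find_stops` are made up for this example.

```python
from itertools import chain

# Trimmed sample mirroring the question's structure (Python 3 literals).
trips = [
    [
        {"arrival": {"time": 1506873749}, "departure": {"time": 1506873749},
         "schedule_relationship": 0, "stop_id": "B20S"},
        {"arrival": {"time": 1506874469}, "departure": {"time": 1506874469},
         "schedule_relationship": 0, "stop_id": "D43S"},
    ],
    [
        {"arrival": {"time": 1506874477}, "departure": {"time": 1506874477},
         "schedule_relationship": 0, "stop_id": "D03N"},
        {"arrival": {"time": 1506874627}, "departure": {"time": 1506874627},
         "schedule_relationship": 0, "stop_id": "D01N"},
    ],
]


def find_stops(nested, stop_id):
    """Return every dict in a list of lists whose stop_id matches."""
    # chain.from_iterable walks the inner lists lazily, one dict at a time.
    # .get() skips dicts that lack a stop_id key instead of raising KeyError.
    return [stop for stop in chain.from_iterable(nested)
            if stop.get("stop_id") == stop_id]


matches = find_stops(trips, "D03N")
print(matches)
```

The `sum(l, [])` form builds a new intermediate list for every inner list it adds, which gets slow as the data grows; `chain.from_iterable` avoids those copies.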

Answer 1: (score: 1)

>>> from itertools import chain
>>> data = [[{'arrival': {'time': 1506873749L}, 'departure': {'time': 1506873749L}, 'schedule_relationship': 0, 'stop_id': u'B20S'}, {'arrival': {'time': 1506873854L}, 'departure': {'time': 1506873854L}, 'schedule_relationship': 0, 'stop_id': u'B21S'}, {'arrival': {'time': 1506873989L}, 'departure': {'time': 1506873989L}, 'schedule_relationship': 0, 'stop_id': u'B22S'}, {'arrival': {'time': 1506874184L}, 'departure': {'time': 1506874184L}, 'schedule_relationship': 0, 'stop_id': u'B23S'}, {'arrival': {'time': 1506874469L}, 'departure': {'time': 1506874469L}, 'schedule_relationship': 0, 'stop_id': u'D43S'}], [{'arrival': {'time': 1506873814L}, 'departure': {'time': 1506873814L}, 'schedule_relationship': 0, 'stop_id': u'D10N'}, {'arrival': {'time': 1506873877L}, 'departure': {'time': 1506873877L}, 'schedule_relationship': 0, 'stop_id': u'D09N'}, {'arrival': {'time': 1506873997L}, 'departure': {'time': 1506873997L}, 'schedule_relationship': 0, 'stop_id': u'D08N'}, {'arrival': {'time': 1506874087L}, 'departure': {'time': 1506874087L}, 'schedule_relationship': 0, 'stop_id': u'D07N'}, {'arrival': {'time': 1506874177L}, 'departure': {'time': 1506874177L}, 'schedule_relationship': 0, 'stop_id': u'D06N'}, {'arrival': {'time': 1506874267L}, 'departure': {'time': 1506874267L}, 'schedule_relationship': 0, 'stop_id': u'D05N'}, {'arrival': {'time': 1506874357L}, 'departure': {'time': 1506874357L}, 'schedule_relationship': 0, 'stop_id': u'D04N'}, {'arrival': {'time': 1506874477L}, 'departure': {'time': 1506874477L}, 'schedule_relationship': 0, 'stop_id': u'D03N'}, {'arrival': {'time': 1506874627L}, 'departure': {'time': 1506874627L}, 'schedule_relationship': 0, 'stop_id': u'D01N'}]]

>>> def find(s):
...     found = [x for x in chain(*data) if x['stop_id'] == s]
...     return found[0] if found else None
...
>>> find(u'D03N')
{'arrival': {'time': 1506874477L}, 'schedule_relationship': 0, 'departure': {'time': 1506874477L}, 'stop_id': u'D03N'}

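Answer 2: (score: 1)

Here is a recursive solution that works for lists nested to any depth. The function does a depth-first search (DFS) of the list, treating the outer list as the root node, the sublists as parent nodes, and the dictionaries as leaf nodes.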

def find_by_stopid(at, target, saveto):
    if isinstance(at, dict):
        if at['stop_id'] == target:
            saveto.append(at)
        return

    for x in at:
        find_by_stopid(x, target, saveto)

found = []
target = u'D03N'

# data is the list you have, target is the string to match,
# and found is where matches are saved
find_by_stopid(data, target, found)

print(found)
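A possible variant of the same depth-first idea, written as a generator so the caller does not have to pass in a list to fill. This is a sketch assuming Python 3 (`yield from`); the name `iter_stops` and the sample `nested` data are invented for illustration.

```python
def iter_stops(node, stop_id):
    """Depth-first walk: yield each dict whose stop_id matches."""
    if isinstance(node, dict):
        # Leaf: .get() tolerates dicts that have no stop_id key.
        if node.get("stop_id") == stop_id:
            yield node
    elif isinstance(node, (list, tuple)):
        # Inner node: recurse into every child, at any depth.
        for child in node:
            yield from iter_stops(child, stop_id)


# Made-up sample with uneven nesting to show the depth does not matter.
nested = [
    [{"stop_id": "B20S"}],
    [[{"stop_id": "D03N", "arrival": {"time": 1506874477}}], {"stop_id": "D01N"}],
]

found = list(iter_stops(nested, "D03N"))
print(found)
```

Wrapping the call in `list(...)` collects every match; `next(iter_stops(nested, "D03N"), None)` would stop at the first one.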

Answer 3: (score: 0)

You can try this:

entry = 'D03N'
final_entries = [[b for b in i if b["stop_id"] == entry] for i in entry_data]
try:
    # first non-empty sublist, then its first matching entry
    new_final_entries = [i for i in final_entries if i][0][0]
except IndexError:
    print("Entry not found")

where entry_data is the full list posted in the original question.

Output:

{'arrival': {'time': 1506874477L}, 'schedule_relationship': 0, 'departure': {'time': 1506874477L}, 'stop_id': u'D03N'}
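If only the first match is needed, the same lookup can be done without building the intermediate lists or catching an exception. A minimal sketch, assuming Python 3 and exactly two levels of nesting; `first_stop` and the sample `entry_data` below are illustrative names and data, not part of the original post.

```python
def first_stop(nested, stop_id):
    """Return the first dict with a matching stop_id, or None if absent."""
    # next() with a default stops at the first hit and never raises.
    return next(
        (stop for trip in nested for stop in trip if stop["stop_id"] == stop_id),
        None,
    )


# Small stand-in for the list in the question.
entry_data = [
    [{"stop_id": "B20S", "arrival": {"time": 1506873749}}],
    [{"stop_id": "D03N", "arrival": {"time": 1506874477}},
     {"stop_id": "D01N", "arrival": {"time": 1506874627}}],
]

result = first_stop(entry_data, "D03N")
if result is None:
    print("Entry not found")
else:
    print(result)
```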