I have been trying to merge/parse this list of lists into a single list.
The list I want to parse/merge has the following format:
list_one = [ [['id1'],['value']],
[['id1'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id1'],['value6']],
[['id1'],['value7'],['value8']],
[['id2'],['value']],
[['id2'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id2'],['value6']],
[['id2'],['value7'],['value8']]
]
After some googling, I came up with this code:
pre_info = list(set(i[0] for i in itertools.chain.from_iterable(list_one)))
final_info = list(map(lambda x: [x], sorted(pre_info, key=len)))
print final_info
But it only prints my IDs.
The desired output is:
final_list = [
[['id'],['value'],['value1'],['value2'],['value3'],['value4'],['value5'],['value6'],['value7'],['value8']],
[['id2'],['value'],['value1'],['value2'],['value3'],['value4'],['value5'],['value6'],['value7'],['value8']]
]
The grouping condition for each row is obviously the 'id', which is always in the first position of each list.
Answer 0 (score: 3)
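You need to group your values by unique id rather than flatten everything. You will have to use a dictionary to group the lists by id or, if the lists for each unique id are consecutive, use itertools.groupby().
Using a dictionary: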
by_id = {}
for id, *values in list_one:
    # unwrap values as we add them to the id group
    by_id.setdefault(id[0], []).extend(v[0] for v in values)
# extract all IDs and value lists into a new list
final_list = [[id] + values for id, values in sorted(by_id.items())]
Or the Python 2 version:
by_id = {}
for row in list_one:
    # unwrap values as we add them to the id group
    id, values = row[0][0], row[1:]
    by_id.setdefault(id, []).extend(v[0] for v in values)
# extract all IDs and value lists into a new list
final_list = [[id] + values for id, values in sorted(by_id.items())]
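I sorted the output list by id because dictionaries have no inherent order (before Python 3.7; newer versions preserve insertion order). Note that I removed the wrapping singleton list objects; they take up memory you don't need to spend, and they complicate the problem algorithmically.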
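If you need these lists in order of first appearance instead, order the groups by where each id first occurs in list_one rather than sorting by id.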
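As a minimal sketch of first-appearance ordering, assuming Python 3.7 or newer (where a plain dict preserves insertion order), you can drop the sorted() call and read the groups back in the order they were inserted. The sample data here is shortened and deliberately lists id2 first; it is an illustration, not the data from the question.

```python
# Hypothetical, shortened sample data; 'id2' deliberately appears first.
list_one = [
    [['id2'], ['value']],
    [['id1'], ['value1'], ['value2']],
    [['id2'], ['value6']],
    [['id1'], ['value7']],
]

by_id = {}
for row in list_one:
    # row[0][0] is the id string, row[1:] the wrapped values
    key, values = row[0][0], row[1:]
    by_id.setdefault(key, []).extend(v[0] for v in values)

# No sorted(): a dict keeps insertion order on Python 3.7+
final_list = [[key] + values for key, values in by_id.items()]
print(final_list)
# [['id2', 'value', 'value6'], ['id1', 'value1', 'value2', 'value7']]
```

On older Python versions, collections.OrderedDict gives the same insertion-order behaviour.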
As mentioned above, if the lists for each id are already consecutive, you can use itertools.groupby() to do the grouping in one step:
from itertools import groupby
[[id] + [value[0] for sublist in group for value in sublist[1:]]
 for id, group in groupby(list_one, lambda s: s[0][0])]
Demo:
>>> by_id = {}
>>> for id, *values in list_one:
...     # unwrap values as we add them to the id group
...     by_id.setdefault(id[0], []).extend(v[0] for v in values)
...
>>> [[id] + values for id, values in sorted(by_id.items())]
[['id1', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8'], ['id2', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8']]
>>> from itertools import groupby
>>> [[id] + [value[0] for sublist in group for value in sublist[1:]]
...  for id, group in groupby(list_one, lambda s: s[0][0])]
[['id1', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8'], ['id2', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8']]
If you feel you must have those singleton lists in your output, feel free to re-add them.
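Re-adding the wrappers could look like the following sketch. It assumes the flattened rows have the shape produced by the grouping step (an id string followed by value strings); the names flat and wrapped are only illustrative.

```python
# Shortened sample in the shape produced by the grouping step.
flat = [
    ['id1', 'value', 'value1'],
    ['id2', 'value', 'value1'],
]

# Wrap every element in its own one-item list again.
wrapped = [[[item] for item in row] for row in flat]
print(wrapped)
# [[['id1'], ['value'], ['value1']], [['id2'], ['value'], ['value1']]]
```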
Answer 1 (score: 0)
You can try this:
import collections
list_one = [ [['id1'],['value']],
[['id1'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id1'],['value6']],
[['id1'],['value7'],['value8']],
[['id2'],['value']],
[['id2'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id2'],['value6']],
[['id2'],['value7'],['value8']]
]
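d = collections.defaultdict(list)
for row in list_one:
    # row[0][0] is the id string; the remaining items are the wrapped values
    d[row[0][0]].extend(row[1:])
# rebuild the [[id], [value], ...] rows, ordered by the last digit of each id
final_output = sorted([[[a]]+b for a, b in d.items()], key = lambda x: int(x[0][0][-1]))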
Final output:
[[['id1'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']], [['id2'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']]]
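One caveat: the sort key above looks only at the last character of each id, so an id such as 'id10' would be ranked by 0 and sort first. A possible sketch of a key that uses the whole numeric suffix follows; it assumes every id ends in digits, and the helper name id_number and the sample rows are hypothetical.

```python
import re

def id_number(row):
    # row[0][0] is the id string, e.g. 'id10'; assumes it ends in digits
    return int(re.search(r'\d+$', row[0][0]).group())

# Hypothetical rows in the same nested shape as final_output
rows = [[['id10'], ['value']], [['id2'], ['value']], [['id1'], ['value']]]
ordered = sorted(rows, key=id_number)
print([row[0][0] for row in ordered])
# ['id1', 'id2', 'id10']
```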
Answer 2 (score: 0)
The answers above offer good solutions. Here is another approach, though I agree with @Martijn Pieters♦ that his solution is clearer to read.
import itertools
chained = itertools.chain.from_iterable(list_one)
schain = set([tuple(c) for c in chained])
{('id',),
('value',),
('value1',),
('value2',),
('value3',),
('value4',),
('value5',),
('value6',),
('value7',),
('value8',)}
list(sorted([list(v) for v in schain]))
[['id'],
['value'],
['value1'],
['value2'],
['value3'],
['value4'],
['value5'],
['value6'],
['value7'],
['value8']]
Edit, to account for other values:
temp = [list(v) for v in schain]
temp.pop(temp.index(['id']))
temp.sort()
temp.insert(0, ['id'])
[['id'],
['abc'],
['value'],
['value1'],
['value2'],
['value3'],
['value4'],
['value5'],
['value6'],
['value7'],
['value8']]
Answer 3 (score: 0)
I have this solution, but it only works if the id is a string or an int and sits at the head of each list:
l=[ [['id1'],['value']],
[['id1'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id1'],['value6']],
[['id1'],['value7'],['value8']],
[['id2'],['value']],
[['id2'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id2'],['value6']],
[['id2'],['value7'],['value8']]
]
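d={}
# create an empty bucket for every id
for ll in l:
    d[ll[0][0]]=[]
# append each wrapped value to the bucket of its id
for i,ll in enumerate(l):
    for lll in ll[1:]:
        d[ll[0][0]].append(lll)
result=[]
# Python 2 only: use d.items() and print(result) on Python 3
for key,items in d.iteritems():
    result.append([[key]]+items)
print result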
Result:
[[['id2'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']], [['id1'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']]]
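Note that id2 comes before id1 in that result, since a Python 2 dict does not promise any particular order. A rough Python 3 sketch of the same idea, with the groups sorted by id so the order is predictable, might look like this (the data is shortened for illustration):

```python
# Shortened sample data in the same nested shape as above.
l = [
    [['id1'], ['value']],
    [['id1'], ['value6']],
    [['id2'], ['value']],
    [['id2'], ['value7'], ['value8']],
]

d = {}
for ll in l:
    # setdefault creates the empty bucket the first time an id is seen
    d.setdefault(ll[0][0], []).extend(ll[1:])

# items() replaces the Python 2-only iteritems(); sorted() fixes the order
result = [[[key]] + items for key, items in sorted(d.items())]
print(result)
# [[['id1'], ['value'], ['value6']], [['id2'], ['value'], ['value7'], ['value8']]]
```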