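Using a conditional clause

Time: 2017-08-24 15:38:45

Tags: python list parsing conditional-statements

I have been trying to merge/parse this list of lists into a single list.

The list I want to parse/merge has the following format: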

list_one = [ [['id1'],['value']], 
             [['id1'],['value1'],['value2'],['value3'],['value4'],['value5']], 
             [['id1'],['value6']],
             [['id1'],['value7'],['value8']],
             [['id2'],['value']], 
             [['id2'],['value1'],['value2'],['value3'],['value4'],['value5']], 
             [['id2'],['value6']],
             [['id2'],['value7'],['value8']]
]

After some googling I came up with this code:

import itertools

pre_info = list(set(i[0] for i in itertools.chain.from_iterable(list_one)))
final_info = list(map(lambda x: [x], sorted(pre_info, key=len)))
print final_info

But it only prints my ids.

The desired output is:

final_list = [
              [['id'],['value'],['value1'],['value2'],['value3'],['value4'],['value5'],['value6'],['value7'],['value8']],
              [['id2'],['value'],['value1'],['value2'],['value3'],['value4'],['value5'],['value6'],['value7'],['value8']]
]

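The condition for each row is obviously the 'id', which is always in the first position of each list.

4 answers:

Answer 0: (score: 3)

You need to group your values by unique id rather than flatten everything. You have to use a dictionary to group the lists by id or, if the lists for each unique id are consecutive, use itertools.groupby().

Using a dictionary: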

by_id = {}
for id, *values in list_one:
    # unwrap values as we add them to the id group
    by_id.setdefault(id[0], []).extend(v[0] for v in values)

# extract all IDs and value lists into a new list
final_list = [[id] + values for id, values in sorted(by_id.items())]

Or the Python 2 version:

by_id = {}
for row in list_one:
    # unwrap values as we add them to the id group
    id, values = row[0][0], row[1:]
    by_id.setdefault(id, []).extend(v[0] for v in values)

# extract all IDs and value lists into a new list
final_list = [[id] + values for id, values in sorted(by_id.items())]

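I sorted the output list by id; dictionaries have no sorted order, and before Python 3.7 they do not guarantee insertion order either. Note that I removed the wrapping singleton list objects; they take up memory you don't need to use, and they complicate the problem algorithmically.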

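If you need these lists in the order in which each id first appears in list_one, you can use a collections.OrderedDict instead of a regular dictionary and skip the sorting.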
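To illustrate keeping ids in order of first appearance, here is a minimal sketch. It assumes collections.OrderedDict as the ordered mapping and uses a shortened, made-up list_one in which id2 appears before id1; the variable name key is my own choice to avoid shadowing the built-in id.

```python
from collections import OrderedDict

# Shortened sample data (an assumption for illustration): id2 appears first.
list_one = [
    [['id2'], ['value']],
    [['id1'], ['value1'], ['value2']],
    [['id2'], ['value6']],
]

by_id = OrderedDict()
for row in list_one:
    # unwrap the id and the values as we add them to the id group
    key, values = row[0][0], row[1:]
    by_id.setdefault(key, []).extend(v[0] for v in values)

# no sorted() here: the OrderedDict already yields ids in first-seen order
final_list = [[key] + values for key, values in by_id.items()]
print(final_list)
# [['id2', 'value', 'value6'], ['id1', 'value1', 'value2']]
```

On Python 3.7 and later a plain dict would behave the same way here, since insertion order is guaranteed.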

As stated, if the lists for each id are already consecutive, you can use itertools.groupby() to do the grouping in one step:

from itertools import groupby

[[id] + [value[0] for sublist in group for value in sublist[1:]]
 for id, group in groupby(list_one, lambda s: s[0][0])]

Demo:

>>> by_id = {}
>>> for id, *values in list_one:
...     # unwrap values as we add them to the id group
...     by_id.setdefault(id[0], []).extend(v[0] for v in values)
...
>>> [[id] + values for id, values in sorted(by_id.items())]
[['id1', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8'], ['id2', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8']]
>>>
>>> from itertools import groupby
>>> [[id] + [value[0] for sublist in group for value in sublist[1:]]
...  for id, group in groupby(list_one, lambda s: s[0][0])]
[['id1', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8'], ['id2', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8']]
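The "consecutive" requirement for itertools.groupby() matters because it only merges adjacent rows with the same key. The sketch below shows what happens when the ids are interleaved and how sorting by id first avoids it. The sample rows and the helper names get_id and merge are made up for illustration.

```python
from itertools import groupby


def get_id(row):
    # the id is the single string inside the first singleton list
    return row[0][0]


def merge(rows):
    # one output list per run of adjacent rows sharing the same id
    return [[key] + [value[0] for sublist in group for value in sublist[1:]]
            for key, group in groupby(rows, get_id)]


# Sample data (an assumption): the two id1 rows are NOT adjacent.
rows = [
    [['id1'], ['value']],
    [['id2'], ['value1']],
    [['id1'], ['value6']],
]

unsorted_result = merge(rows)
print(unsorted_result)
# [['id1', 'value'], ['id2', 'value1'], ['id1', 'value6']]

# sorted() is stable, so values keep their original relative order per id
sorted_result = merge(sorted(rows, key=get_id))
print(sorted_result)
# [['id1', 'value', 'value6'], ['id2', 'value1']]
```

If sorting is not acceptable, for example because the original id order must be preserved, the dictionary approach is probably the safer choice.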

If you believe you must have those singleton lists in the output, feel free to add them back.
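One possible way to add the singleton wrappers back, as a rough sketch: group the unwrapped values exactly as before, then wrap the id and each value in a one-element list when building the output. The shortened list_one and the name key are assumptions for illustration.

```python
# Shortened sample data (an assumption for illustration).
list_one = [
    [['id1'], ['value']],
    [['id1'], ['value1']],
    [['id2'], ['value6']],
]

by_id = {}
for row in list_one:
    # unwrap values as we add them to the id group
    key, values = row[0][0], row[1:]
    by_id.setdefault(key, []).extend(v[0] for v in values)

# wrap the id and every value in a singleton list again
final_list = [[[key]] + [[v] for v in values]
              for key, values in sorted(by_id.items())]
print(final_list)
# [[['id1'], ['value'], ['value1']], [['id2'], ['value6']]]
```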

Answer 1: (score: 0)

You can try this:

import collections

list_one = [ [['id1'],['value']], 
         [['id1'],['value1'],['value2'],['value3'],['value4'],['value5']], 
         [['id1'],['value6']],
         [['id1'],['value7'],['value8']],
         [['id2'],['value']], 
         [['id2'],['value1'],['value2'],['value3'],['value4'],['value5']], 
         [['id2'],['value6']],
         [['id2'],['value7'],['value8']]
]

d = collections.defaultdict(list)
for row in list_one:
    d[row[0][0]].extend(row[1:])

final_output = sorted([[[a]]+b for a, b in d.items()], key = lambda x: int(x[0][0][-1]))

Final output:

[[['id1'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']], [['id2'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']]]

Answer 2: (score: 0)

The answers above provide good solutions; here is another approach, though I agree with @Martijn Pieters♦ that his solution is clearer to read.

import itertools

chained = itertools.chain.from_iterable(list_one)

schain = set([tuple(c) for c in chained])

{('id',),
 ('value',),
 ('value1',),
 ('value2',),
 ('value3',),
 ('value4',),
 ('value5',),
 ('value6',),
 ('value7',),
 ('value8',)}


list(sorted([list(v) for v in schain]))

[['id'],
 ['value'],
 ['value1'],
 ['value2'],
 ['value3'],
 ['value4'],
 ['value5'],
 ['value6'],
 ['value7'],
 ['value8']]

Edit, to account for other values:

temp = [list(v) for v in schain]

temp.pop(temp.index(['id']))

temp.sort()

temp.insert(0, ['id'])

[['id'],
 ['abc'],
 ['value'],
 ['value1'],
 ['value2'],
 ['value3'],
 ['value4'],
 ['value5'],
 ['value6'],
 ['value7'],
 ['value8']]

Answer 3: (score: 0)

I have this solution, but it only works when the id is a string or an int and is located at the head of each list:

l=[ [['id1'],['value']], 
             [['id1'],['value1'],['value2'],['value3'],['value4'],['value5']], 
             [['id1'],['value6']],
             [['id1'],['value7'],['value8']],
             [['id2'],['value']], 
             [['id2'],['value1'],['value2'],['value3'],['value4'],['value5']], 
             [['id2'],['value6']],
             [['id2'],['value7'],['value8']]
]
d={}

for ll in l:
    d[ll[0][0]]=[]
for i,ll in enumerate(l):
    for lll in ll[1:]:
        d[ll[0][0]].append(lll)
result=[]
for key,items in d.iteritems():
    result.append([[key]]+items)

print result

Result:

[[['id2'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']], [['id1'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']]]