Merge nested list items based on a repeating value

Time: 2013-06-24 22:32:02

Tags: python list

Although it is poorly written, this code:

marker_array = [['hard','2','soft'],['heavy','2','light'],['rock','2','feather'],['fast','3'], ['turtle','4','wet']]
marker_array_DS = []

for i in range(len(marker_array)):
    if marker_array[i-1][1] != marker_array[i][1]:            
        marker_array_DS.append(marker_array[i])

print(marker_array_DS)

returns:

[['hard', '2', 'soft'], ['fast', '3'], ['turtle', '4', 'wet']]

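It does part of the job: it creates a new list containing all of the nested lists except those with a duplicate value at index [1]. But what I really need is to concatenate the matching index values from the removed lists, creating a list like this: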

[['hard heavy rock', '2', 'soft light feather'], ['fast', '3'], ['turtle', '4', 'wet']]

The values at index [1] must not be concatenated. I managed the concatenation part using a tip from another post:

newlist = [i + n for i, n in zip(list_a, list_b)]

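But I'm struggling to find a way to produce the desired result. The "marker_array" list is already sorted in ascending order before it is passed to this code, so all like values in the index [1] position are contiguous. As shown above, some nested lists may not have any values beyond [0] and [1].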
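To make the gap concrete, here is a minimal sketch of what the zip hint produces on two of the sample rows. The names list_a and list_b are the placeholders from the hint, and the `merged` variant is only an illustrative assumption about how the join could skip index [1]; it is not taken from the original post.

```python
# Two of the sample rows, bound to the placeholder names used in the hint.
list_a = ['hard', '2', 'soft']
list_b = ['heavy', '2', 'light']

# Plain element-wise concatenation: no separator, and the key at index 1 is doubled.
newlist = [i + n for i, n in zip(list_a, list_b)]
print(newlist)  # ['hardheavy', '22', 'softlight']

# Illustrative variant: join with a space and leave index 1 untouched.
merged = [a if idx == 1 else a + ' ' + b
          for idx, (a, b) in enumerate(zip(list_a, list_b))]
print(merged)  # ['hard heavy', '2', 'soft light']
```

Note that zip stops at the shorter input, so a two-element row such as ['fast', '3'] would silently drop the third field of its partner.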

5 answers:

Answer 0: (score: 2)

A quick stab at it: use itertools.groupby to do the grouping for you, but group over a generator that pads the 2-element lists out to 3 elements.

from itertools import groupby
from operator import itemgetter

marker_array = [['hard','2','soft'],['heavy','2','light'],['rock','2','feather'],['fast','3'], ['turtle','4','wet']]  

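def my_group(iterable):
    # Pad 2-element rows with '' so every row has exactly three fields.
    temp = ((el + [''])[:3] for el in iterable)
    # Group consecutive rows that share the same value at index 1.
    for k, g in groupby(temp, key=itemgetter(1)):
        # Join the index-0 values and the index-2 values of the group separately.
        fst, snd = map(' '.join, zip(*map(itemgetter(0, 2), g)))
        # Drop the empty third field for groups such as ['fast', '3'].
        yield [field for field in (fst, k, snd) if field]

print(list(my_group(marker_array)))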
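The two intermediate steps in this answer (padding, then grouping) can be inspected on their own. The sketch below uses a shortened, hypothetical `rows` list rather than the full sample data, and it also shows why the approach depends on the input being sorted: groupby only merges adjacent items with equal keys.

```python
from itertools import groupby
from operator import itemgetter

# Shortened sample data (an assumption for illustration).
rows = [['hard', '2', 'soft'], ['heavy', '2', 'light'], ['fast', '3']]

# Step 1: pad 2-element rows with '' so every row has exactly three fields.
padded = [(row + [''])[:3] for row in rows]
print(padded)

# Step 2: group adjacent rows by the value at index 1.
groups = [(key, list(group)) for key, group in groupby(padded, key=itemgetter(1))]
print(groups)

# groupby does not sort: a key that reappears later starts a new group.
unsorted_keys = [key for key, _ in groupby(['2', '3', '2'])]
print(unsorted_keys)  # ['2', '3', '2']
```

Because the question states that marker_array is already sorted, no extra sort is needed before grouping.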

Answer 1: (score: 0)

from collections import defaultdict
d1, d2 = defaultdict(list) , defaultdict(list)
for pxa in marker_array:
    d1[pxa[1]].extend(pxa[:1])
    d2[pxa[1]].extend(pxa[2:])

res = [[' '.join(d1[x]), x, ' '.join(d2[x])] for x in sorted(d1)]

If you really need the 2-element entries (which I think is unlikely):

for p in res:
    if not p[-1]:
        p.pop()
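For readers who want to run this answer end to end, here is a sketch that wraps the same two-dictionary idea in a function. The name merge_by_key is hypothetical, and the snippet is written to run on Python 3.

```python
from collections import defaultdict

def merge_by_key(rows):
    """Hypothetical wrapper around the two-defaultdict approach."""
    firsts, lasts = defaultdict(list), defaultdict(list)
    for row in rows:
        firsts[row[1]].extend(row[:1])
        lasts[row[1]].extend(row[2:])  # empty slice for 2-element rows
    merged = []
    # Keys are strings, so sorted() orders them lexicographically ('10' < '2').
    for key in sorted(firsts):
        entry = [' '.join(firsts[key]), key]
        if lasts[key]:  # only add a third field when there is something to join
            entry.append(' '.join(lasts[key]))
        merged.append(entry)
    return merged

marker_array = [['hard', '2', 'soft'], ['heavy', '2', 'light'],
                ['rock', '2', 'feather'], ['fast', '3'], ['turtle', '4', 'wet']]
result = merge_by_key(marker_array)
print(result)
```

Unlike the groupby-based answer, the dictionaries do not need the input to be sorted, but the string sort of the keys may need a numeric key function if the markers go past '9'.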

Answer 2: (score: 0)

marker_array = [['hard','2','soft'],['heavy','2','light'],['rock','2','feather'],['fast','3'], ['turtle','4','wet']]
marker_array_DS = []
marker_array_hit = []

for i in range(len(marker_array)):
    if marker_array[i][1] not in marker_array_hit:
        marker_array_hit.append(marker_array[i][1])

for i in marker_array_hit:
    lists = [item for item in marker_array if item[1] == i]
    temp = []
    first_part = ' '.join([str(item[0]) for item in lists])
    temp.append(first_part)
    temp.append(i)
    second_part = ' '.join([str(item[2]) for item in lists if len(item) > 2])
    if second_part != '':
        temp.append(second_part)
    marker_array_DS.append(temp)

print(marker_array_DS)

I learned Python for this because I shamelessly chase rep.

Answer 3: (score: 0)

marker_array = [
    ['hard','2','soft'],
    ['heavy','2','light'],
    ['rock','2','feather'],
    ['fast','3'], 
    ['turtle','4','wet'],
]

data = {}

for arr in marker_array:
    if len(arr) == 2:
        arr.append('')

    (first, index, last) = arr
    firsts, lasts = data.setdefault(index, [[],[]])
    firsts.append(first)
    lasts.append(last)


results = []

for key in sorted(data.keys()):
    current = [
        " ".join(data[key][0]),
        key,
        " ".join(data[key][1])
    ]

    if current[-1] == '':
        current = current[:-1]

    results.append(current)



print(results)

--output:--
[['hard heavy rock', '2', 'soft light feather'], ['fast', '3'], ['turtle', '4', 'wet']]

Answer 4: (score: -1)

A different solution based on itertools.groupby:

from itertools import groupby

# normalizes the list of markers so all markers have 3 elements
def normalized(markers):
    for marker in markers:
        yield marker + [""] * (3 - len(marker))

def concatenated(markers):
    # use groupby to iterate over lists of markers sharing the same key
    for key, markers_in_category in groupby(normalized(markers), lambda m: m[1]):
        # get separate lists of left and right words
        lefts, rights = zip(*[(m[0], m[2]) for m in markers_in_category])
        # remove empty strings from both lists
        lefts, rights = filter(bool, lefts), filter(bool, rights)
        # yield the concatenated entry for this key as a list
        # (also removing the empty string at the end, if necessary)
        yield [part for part in (" ".join(lefts), key, " ".join(rights)) if part]

The generator concatenated(markers) yields the results. This code handles the ['fast', '3'] case correctly and does not return an extra third element in that case.
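As a usage sketch, the generator can be consumed with list(). The block below restates the two functions in a form that runs on Python 3 so the snippet works on its own. The `mixed` input is a made-up edge case (not from the question) where only some rows in a group have a third field.

```python
from itertools import groupby

def normalized(markers):
    # Pad each marker with '' up to three fields.
    for marker in markers:
        yield marker + [''] * (3 - len(marker))

def concatenated(markers):
    for key, group in groupby(normalized(markers), key=lambda m: m[1]):
        group = list(group)
        lefts = ' '.join(m[0] for m in group if m[0])
        rights = ' '.join(m[2] for m in group if m[2])
        # Drop empty fields so ['fast', '3'] stays a 2-element list.
        yield [field for field in (lefts, key, rights) if field]

marker_array = [['hard', '2', 'soft'], ['heavy', '2', 'light'],
                ['rock', '2', 'feather'], ['fast', '3'], ['turtle', '4', 'wet']]
result = list(concatenated(marker_array))
print(result)

# Hypothetical edge case: only one row of the group has a third field.
mixed = list(concatenated([['a', '1'], ['b', '1', 'x']]))
print(mixed)  # [['a b', '1', 'x']]
```

Filtering the empty strings before joining is what keeps a padded '' from leaving stray spaces in the joined text.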