Question

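I have a list of lists, and the number of inner lists varies each time depending on other conditions. Each inner list contains 4 items. If elements 1, 2, and 3 of two inner lists are the same, I want to add their element 0 values together and remove the duplicate. The list looks something like this: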

 [
    [4, 'blue', 'round', None],
    [6, 'blue', 'round', None],
    [8, 'red', 'round', None],
    [10, 'red', 'round', None],
    [8, 'red', 'square', None],
]

I think building a new list might help, but I'm not sure. I need the final result to be:

[
    [10, 'blue', 'round', None],
    [18, 'red', 'round', None],
    [8, 'red', 'square', None],
]

The number of inner lists will always vary. Any help with this would be greatly appreciated.

Answer 1

You can try groupby from itertools, with the original list sorted by the last three elements of each sublist:

from itertools import groupby

lst = [[4, "blue", "round", "none"], [6, "blue", "round", "none"], [8, "red", "round", "none"], [10, "red", "round", "none"], [8, "red", "square", "none"]]

[[sum(v[0] for v in g)] + k for k, g in groupby(sorted(lst, key=lambda x: x[1:4]), key=lambda x: x[1:4])]

# [[10, 'blue', 'round', 'none'],
#  [18, 'red', 'round', 'none'],
#  [8, 'red', 'square', 'none']]
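
To illustrate why the sort matters: groupby only merges rows that sit next to each other, so unsorted input can leave the same key split across several groups. The snippet below is a minimal sketch of that behaviour. The sample rows and the helper names key and merge are made up for the illustration and are not part of the answer above.

```python
from itertools import groupby

# Hypothetical sample: the two "blue round" rows are NOT adjacent.
rows = [
    [4, "blue", "round", "none"],
    [8, "red", "round", "none"],
    [6, "blue", "round", "none"],
]

def key(row):
    # Group on the last three elements of each row.
    return row[1:4]

def merge(rows_in):
    # Sum element 0 within each run of consecutive rows sharing a key.
    return [[sum(r[0] for r in group)] + k
            for k, group in groupby(rows_in, key=key)]

unsorted_result = merge(rows)                    # blue rows stay separate
sorted_result = merge(sorted(rows, key=key))     # blue rows are merged

print(unsorted_result)
print(sorted_result)
```

Without the sort, the two blue rows come back as separate entries; with it, they are summed into one.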

Answer 2

You can use a Counter to accumulate the results:

>>> in_data = [[4,'blue','round',None], [6,'blue','round',None], [8,'red','round',None], [10,'red','round',None], [8,'red','square',None]]
>>> from collections import Counter
>>> counter = Counter()
>>> for count, *keys in in_data:
...     counter[tuple(keys)] += count
...     
>>> counter
Counter({('blue', 'round', None): 10,
         ('red', 'round', None): 18,
         ('red', 'square', None): 8})

It's easy to convert back to the output format you asked for:

>>> [[count, *keys] for keys, count in counter.items()]
[[8, 'red', 'square', None],
 [18, 'red', 'round', None],
 [10, 'blue', 'round', None]]

I agree with the commenters that you could use a better data structure than a list of lists.

My example uses some syntax specific to Python 3.5; if you're on an older version, you should prefer the answer from 2ps, which implements the same idea.
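
As a rough sketch, the same Counter idea can be wrapped in a function. The name merge_rows is made up for this example. It assumes Python 3.7 or newer, where a Counter (being a dict subclass) remembers insertion order, so the groups come out in the order they were first seen rather than in the arbitrary order shown above.

```python
from collections import Counter

def merge_rows(rows):
    """Sum element 0 of all rows that share the same remaining elements."""
    totals = Counter()
    for count, *keys in rows:
        totals[tuple(keys)] += count
    # Assumption: Python 3.7+, so iteration follows first-seen order.
    return [[count, *keys] for keys, count in totals.items()]

in_data = [
    [4, 'blue', 'round', None],
    [6, 'blue', 'round', None],
    [8, 'red', 'round', None],
    [10, 'red', 'round', None],
    [8, 'red', 'square', None],
]

result = merge_rows(in_data)
print(result)
```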

Answer 3

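Roughly speaking, all you're doing is associating element 0 with a tuple made of the last three elements of any given list. Fortunately, using a tuple as a dictionary key is perfectly fine!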

from collections import OrderedDict
color_data = [[4,'blue','round','none'], [6,'blue','round','none'], [8,'red','round','none'], [10,'red','round','none'], [8,'red','square','none']]
data = OrderedDict()
for x in color_data:
    key = tuple(x[1:])
    value = data.setdefault(key, 0) + x[0]
    data[key] = value
color_data = [[value] + list(key) for key, value in data.items()]

Answer 4

If you want to use pandas, you can do this:

import pandas

data = [
    [4,"blue","round","none"],
    [6,"blue","round","none"],
    [8,"red","round","none"],
    [10,"red","round","none"],
    [8,"red","square","none"]
]

summary = (
    pandas.DataFrame(data, columns=['q', 'color', 'shape', 'other'])
        .groupby(by=['color', 'shape'], as_index=False)
        .agg({'q': 'sum', 'other': 'first'})
        .reset_index(drop=True)
)

print(summary)

  color   shape   q other
0  blue   round  10  none
1   red   round  18  none
2   red  square   8  none

Answer 5

There is a collections.Counter class that is very handy for incrementing totals per key.

import collections

list_ = [
    [4, 'blue', 'round', None],
    [6, 'blue', 'round', None],
    [8, 'red', 'round', None],
    [10, 'red', 'round', None],
    [8, 'red', 'square', None],
]

counts = collections.Counter()

for i in list_:
    count = i[0]
    key = tuple(i[1:4])  # ex. ('blue', 'round', None) --> 6
    counts[key] += count

for i in counts.items():
    print(i)

Output:

(('blue', 'round', None), 10)
(('red', 'round', None), 18)
(('red', 'square', None), 8)

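You can easily adjust this format to match exactly what you want.

Notes

Using += without initializing the key first is not a typo: Counter instances automatically return 0 for keys that have not been set.
I used list_ as the variable name instead of the list from the original question, because naming it list would shadow the list(...) built-in.
A tuple is used as the key because it is hashable and can therefore be a dictionary key, whereas a list cannot, because lists are mutable.
Works on Python 2/3.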
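
The first and third notes can be checked with a short sketch like the one below. The variable names are invented for the example, and it makes no claim about the exact wording of the error message.

```python
from collections import Counter

counts = Counter()
key = ('blue', 'round', None)

# Reading a missing key returns 0 instead of raising KeyError...
missing_value = counts[key]
# ...and the read alone does not store the key.
stored_after_read = key in counts

# So += works without any initialization.
counts[key] += 4
counts[key] += 6

# A list cannot be used as the key because lists are not hashable.
list_key_rejected = False
try:
    counts[['blue', 'round', None]] += 4
except TypeError:
    list_key_rejected = True

print(missing_value, stored_after_read, counts[key], list_key_rejected)
```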

Answer 6

If a is your list of lists, then:

import collections

d = collections.defaultdict(int)
for row in a:
    key = tuple(row[1:])
    d[key] += row[0]

e = [list((val,) + key) for key, val in d.items()]

Output:

In [14]: e
Out[14]: 
[[18, 'red', 'round', None],
 [8, 'red', 'square', None],
 [10, 'blue', 'round', None]]

Python: combining items in a list of lists based on their contents