Question

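Hi, the goal of my code is to compute the mean and variance within a list of lists. The constraint is: if two or more lists in `linesort` share the same first two elements, compute the mean of their third elements. My problem is adding the variance calculation alongside the mean and returning a list of the form [a, b, mean, variance]. Thanks a lot in advance.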

linesort = [[1, 2, 3.00], [1, 2, 5.00], [1, 4, 7.00], [1, 4, 3.00] ,[3, 6, 5.2]]
new = []
final = []
count=0
for el in linesort:
    new.append(el[:-1])

tnew = [tuple(t) for t in new]
setnew = set(tnew)
setnew = [list(t) for t in setnew]

for items in setnew:
    inds = [i for i,x in enumerate(new) if x == items]
    if len(inds) > 1:
        somma = 0
        for ind in inds:
            print(somma)
            somma = linesort[ind][2] + somma
        media = somma/len(inds)
        items.append(media)
        final.append(items)
print(final)

Desired output:

('Output: ', [[1, 2, 4.0,1.0], [1, 4, 5.0,4.0]])

For the variance, I came up with this line of code, but I can't get it to work.

variance = float(sum((linesort[ind][2] - media) ** 2 for linesort[ind][2] in linesort) / len(linesort))

Answer 1

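You can simplify your code by first reorganizing the data in a dict, using a tuple of the first two elements as the key and collecting the corresponding values in a list.

A defaultdict makes this easier.

After that, we only need to compute the mean and variance of each list.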

from collections import defaultdict

linesort = [[1, 2, 3.00], [1, 2, 5.00], [1, 4, 7.00], [1, 4, 3.00] ,[3, 6, 5.2]]

# Let's first group the values: 

d = defaultdict(list)
for x, y, val in linesort:
    d[(x, y)].append(val)

# d will be: {(1, 2): [3.0, 5.0], (1, 4): [7.0, 3.0], (3, 6): [5.2]}    
# Now we can build the output list:

out = []
for (x, y), values in d.items():
    n = len(values)
    mean = sum(values)/n
    variance = sum(v**2 for v in values)/n - mean**2  # v avoids reusing the key name x
    out.append([x, y, mean, variance])

print(out)
# [[1, 2, 4.0, 1.0], [1, 4, 5.0, 4.0], [3, 6, 5.2, 0.0]]
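
As a cross-check, the same grouping can be combined with the standard library's `statistics` module. This is only a sketch of an alternative, not part of the original answer; the names `groups` and `out_stats` are chosen here for illustration. `statistics.pvariance` divides by n (population variance), which matches the formula used above, whereas `statistics.variance` would divide by n - 1 and needs at least two values.

```python
from collections import defaultdict
from statistics import mean, pvariance

linesort = [[1, 2, 3.00], [1, 2, 5.00], [1, 4, 7.00], [1, 4, 3.00], [3, 6, 5.2]]

# Group the third element by the (first, second) pair, as above.
groups = defaultdict(list)
for a, b, val in linesort:
    groups[(a, b)].append(val)

# pvariance is the population variance (divides by n, not n - 1).
out_stats = [[a, b, mean(vals), pvariance(vals)] for (a, b), vals in groups.items()]

print(out_stats)
# [[1, 2, 4.0, 1.0], [1, 4, 5.0, 4.0], [3, 6, 5.2, 0.0]]
```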

In response to your comment:

If you want to leave out the cases with only one value, just change the last part to:

for (x, y), values in d.items():
    n = len(values)
    if n > 1:
        mean = sum(values)/n
        variance = sum(v**2 for v in values)/n - mean**2  # v avoids reusing the key name x
        out.append([x, y, mean, variance])
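
For completeness, the variance line from the question can also be repaired while keeping the index-based approach. The attempt in the question loops over whole rows of `linesort` and divides by `len(linesort)`; the variance of a group needs only that group's values and that group's size. The following is a minimal sketch under that reading; `keys` is a helper name introduced here, while `inds`, `media` and `final` follow the question.

```python
linesort = [[1, 2, 3.00], [1, 2, 5.00], [1, 4, 7.00], [1, 4, 3.00], [3, 6, 5.2]]

# First two elements of every row, as hashable tuples.
keys = [tuple(row[:2]) for row in linesort]

final = []
for key in dict.fromkeys(keys):  # unique keys in first-seen order
    inds = [i for i, k in enumerate(keys) if k == key]
    if len(inds) > 1:
        media = sum(linesort[i][2] for i in inds) / len(inds)
        # Mean squared deviation over this group's values only.
        variance = sum((linesort[i][2] - media) ** 2 for i in inds) / len(inds)
        final.append([key[0], key[1], media, variance])

print(final)
# [[1, 2, 4.0, 1.0], [1, 4, 5.0, 4.0]]
```

This rescans `keys` once per group, so the defaultdict version above is the better choice for long lists.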
