Question

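I have a list of numbers (for example [-1, 1, -4, 5]) and I have to remove numbers from the list without changing the list's total sum. I want to remove the numbers with the largest possible absolute value without changing the total. In the example, removing [-1, -4, 5] leaves [1], so the sum doesn't change.

I wrote the naive approach, which finds every combination that doesn't change the total and checks which one removes the largest absolute value. But that is really slow, since the actual list will be a lot bigger than this one.

Here's my combinations code: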

from itertools import chain, combinations

def remove(items):
    # every non-empty combination of items, from size 1 up to len(items)
    all_comb = chain.from_iterable(combinations(items, n + 1)
                                   for n in range(len(items)))
    biggest = None
    biggest_sum = 0
    for comb in all_comb:
        if sum(comb) != 0:
            continue # this comb would change total, skip
        abs_sum = sum(abs(item) for item in comb)
        if abs_sum > biggest_sum:
            biggest = comb
            biggest_sum = abs_sum
    return biggest

print(remove([-1, 1, -4, 5]))

It correctly prints (-1, -4, 5). However, I'm looking for a cleverer, more efficient solution than looping over every possible combination of items.

Any ideas?

Answer 1

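If you restate the problem as finding a subset whose sum equals the sum of the complete set, you will see that this is an NP-hard problem (subset sum).

So no polynomial-time solution is known for this problem, and none exists unless P = NP.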
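
That said, when the numbers are integers and the range of reachable sums is modest, the usual subset-sum dynamic programming trick may still be practical. The sketch below is only an illustration under that assumption; `remove_dp` is a made-up name, not something from the question. For every reachable sum it remembers the subset with the largest total absolute value, then returns the one that sums to zero (the items to remove).

```python
def remove_dp(items):
    """Return a zero-sum tuple of items with the largest total absolute value.

    Sketch only: assumes integer items (hypothetical helper, not from the question).
    """
    # best[s] = (abs_sum, chosen): best subset found so far whose sum is s
    best = {0: (0, ())}
    for item in items:
        # iterate over a snapshot so each item is used at most once
        for s, (abs_sum, chosen) in list(best.items()):
            key = s + item
            candidate = (abs_sum + abs(item), chosen + (item,))
            if key not in best or best[key][0] < candidate[0]:
                best[key] = candidate
    # the empty tuple means nothing can be removed
    return best[0][1]

print(remove_dp([-1, 1, -4, 5]))
```

The work grows with the number of distinct sums that can be reached rather than with the number of combinations, so it only pays off when those sums stay in a limited range.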

Answer 2

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright © 2009 Clóvis Fabrício Costa
# Licensed under GPL version 3.0 or higher

from itertools import chain, combinations

def posneg_calcsums(subset):
    # map each reachable sum to one combination that produces it
    sums = {}
    for group in chain.from_iterable(combinations(subset, n + 1)
                                     for n in range(len(subset))):
        sums[sum(group)] = group
    return sums

def posneg(items):
    positive = posneg_calcsums([item for item in items if item > 0])
    negative = posneg_calcsums([item for item in items if item < 0])
    # try the largest positive sums first; the first match is the best one
    for n in sorted(positive, reverse=True):
        if -n in negative:
            return positive[n] + negative[-n]
    else:
        return None

print(posneg([-1, 1, -4, 5]))
print(posneg([6, 44, 1, -7, -6, 19]))

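It works well, and it is a lot faster than my first approach. Thanks to Alon for the Wikipedia link and to ivazquez|laptop on the #python IRC channel for a good hint that led me to the solution.

I think it can be optimized further: I'd like a way to stop computing the expensive part once the solution has been found. I'll keep working on it.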

Answer 3

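I don't program in Python, so my apologies for not providing code. But I think I can help with the problem:

1. Find the total sum
2. Add up the numbers with the lowest values until you reach the same sum
3. Everything else can be removed

I hope this helps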
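
One possible reading of these steps, as a rough Python sketch. Sorting by absolute value is an assumption about what "lowest values" means here, and `keep_lowest` is a hypothetical name. It is a greedy heuristic, so it may keep more numbers than strictly necessary.

```python
def keep_lowest(items):
    """Greedy sketch: keep the smallest-magnitude numbers until the total is reached.

    Returns (kept, removed). Hypothetical helper, not from the answer.
    """
    total = sum(items)
    kept = []
    running = 0
    # assumption: "lowest value" means smallest absolute value
    for item in sorted(items, key=abs):
        kept.append(item)
        running += item
        if running == total:
            break
    # whatever was not kept can be removed without changing the sum
    removed = list(items)
    for item in kept:
        removed.remove(item)
    return kept, removed

print(keep_lowest([1, 2, -2]))
```

On the question's own example, [-1, 1, -4, 5], the running sum only hits the total after the last element, so this reading removes nothing there.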

Answer 4

Your requirements don't say whether the function is allowed to change the list order. Here's one possibility:

def remove(items):
    items.sort()
    running = original = sum(items)
    try:
        items.index(original) # we just want the exception
        return [original]
    except ValueError:
        pass
    if abs(items[0]) > items[-1]:
        running -= items.pop(0)
    else:
        running -= items.pop()
    while running != original:
        try:
            running -= items.pop(items.index(original - running))
        except ValueError:
            if running > original:
                running -= items.pop()
            elif running < original:
                running -= items.pop(0)
    return items

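This sorts the list (big items end up at the end, smaller ones at the beginning), calculates the sum, and removes an item from the list. It then keeps removing items until the new total equals the original total. An alternative version that preserves order can be written as a wrapper: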

from copy import copy

def remove_preserve_order(items):
    a = remove(copy(items))
    return [x for x in items if x in a]

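If you really want to preserve order, you should probably rewrite this using collections.deque. If you can guarantee uniqueness in your list, you can get a big win by using a set.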
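
For the uniqueness case, here is a sketch of what the set-based filtering might look like. `remove_preserve_order_set` is a hypothetical name, and it takes the kept values as an argument so the snippet stands on its own; it assumes no value appears twice in the list.

```python
def remove_preserve_order_set(items, kept):
    """Return the elements of items that are in kept, in their original order.

    Assumes every value in items is unique (hypothetical helper).
    """
    kept_set = set(kept)  # average O(1) membership test instead of scanning a list
    return [x for x in items if x in kept_set]

print(remove_preserve_order_set([5, -4, 1, -1], [1]))
```

The only change from the wrapper above is the membership test, which no longer scans a list for every element.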

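We could probably write a better version that walks the list to find, each time, the two numbers closest to the running total and removes the closer of the two, but then we'd likely end up with O(N^2) performance. I believe this code's performance will be O(N*log(N)), since it only has to sort the list (I hope Python's list sorting isn't O(N^2)) and then compute the sum.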

Answer 5

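This can be solved using integer programming. You can define a binary variable s_i for each list element x_i and minimize \sum_i s_i, subject to the constraint that \sum_i (x_i * s_i) equals the original sum of the list.

Here's an implementation using the lpSolve package in R: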

library(lpSolve)
get.subset <- function(lst) {
  res <- lp("min", rep(1, length(lst)), matrix(lst, nrow=1), "=", sum(lst),
            binary.vec=seq_along(lst))
  lst[res$solution > 0.999]
}

Now we can test it with a few examples:

get.subset(c(1, -1, -4, 5))
# [1] 1
get.subset(c(6, 44, 1, -7, -6, 19))
# [1] 44 -6 19
get.subset(c(1, 2, 3, 4))
# [1] 1 2 3 4
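
For small lists, the same objective (keep as few elements as possible while matching the original sum) can be cross-checked in Python by brute force. This is only a sketch for verifying results, not a replacement for the solver; `smallest_subset_with_same_sum` is a hypothetical name.

```python
from itertools import combinations

def smallest_subset_with_same_sum(items):
    """Return the fewest elements of items whose sum equals sum(items).

    Brute force, exponential in len(items); intended only for small inputs.
    """
    target = sum(items)
    # try subset sizes in increasing order, so the first hit is a smallest one
    for size in range(len(items) + 1):
        for kept in combinations(items, size):
            if sum(kept) == target:
                return list(kept)

print(smallest_subset_with_same_sum([1, -1, -4, 5]))
print(smallest_subset_with_same_sum([6, 44, 1, -7, -6, 19]))
print(smallest_subset_with_same_sum([1, 2, 3, 4]))
```

On these three inputs it agrees with the lpSolve results shown above.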
