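I would like to know the most efficient way (in both time and memory) to count the number of subsets whose sum is less than or equal to some limit. For example, for the set {1, 2, 4} and a limit of 3, the count should be 4 (the subsets are {}, {1}, {2}, {1, 2}). I tried encoding subsets in a bit vector (mask) and finding the answer in the following way (pseudocode):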
solve(mask, sum, limit)
    if visited[mask]
        return
    if sum <= limit
        count = count + 1
    visited[mask] = true
    for i in 0..n - 1
        if there is i-th bit
            solve(mask without i-th bit, sum - array[i], limit)

solve(2^n - 1, knapsack sum, knapsack limit)
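The array is zero-based, count can be a global variable, and visited is an array of length 2^n. I understand that the problem has exponential complexity, but is there a better approach or an improvement to my idea? The algorithm runs fast for n ≤ 24, but my approach is pretty brute-force, and I was wondering whether there is some clever way to find the answer for n = 30.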
Answer 0 (score: 3)
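The most space-efficient approach is a recursive traversal of all subsets that just keeps a count. This takes O(2^n) time and O(n) memory, where n is the size of the whole set.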
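As a rough illustration of that traversal, here is a minimal Python sketch. The function name count_subsets_recursive and its parameters are made up for this example; the idea is simply to decide for each element whether to leave it out or take it, and to count the leaves that stay within the limit. The recursion depth is n, which is where the O(n) memory comes from.

```python
def count_subsets_recursive(elements, limit):
    """Count subsets (including the empty one) whose sum is <= limit.

    Visits every subset exactly once: O(2^n) time, O(n) stack depth.
    """
    n = len(elements)

    def walk(i, current_sum):
        # All n elements have been decided: this is one complete subset.
        if i == n:
            return 1 if current_sum <= limit else 0
        # Branch 1: leave elements[i] out. Branch 2: take it.
        return walk(i + 1, current_sum) + walk(i + 1, current_sum + elements[i])

    return walk(0, 0)


print(count_subsets_recursive([1, 2, 4], 3))  # 4
```

This does no pruning at all, so it is only meant to show the memory behaviour, not to be fast for n = 30.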
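All known solutions can be exponential in time, because your problem is a variant of subset-sum, which is known to be NP-complete. But a pretty efficient DP solution is as follows, in pseudocode with comments.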
# Calculate the lowest sum and turn all elements positive.
# This turns the limit problem into one with only non-negative elements.
lowest_sum = 0
for i in range(0, length(elements)):
    if elements[i] < 0:
        lowest_sum += elements[i]
        elements[i] = -elements[i]

# Sort and calculate trailing sums. This allows us to break off
# the details of lots of ways to be below our bound.
elements = sort elements from largest to smallest
total = sum(elements)
trailing_sums = []
for element in elements:
    total -= element
    push total onto trailing_sums

# Now do dp
answer = 0
ways_to_reach_sum = {lowest_sum: 1}
n = length(elements)
for i in range(0, n):
    new_ways_to_reach_sum = {}  # missing keys count as 0
    for (sum, count) in ways_to_reach_sum:
        # Do we consider ways to add this element?
        if elements[i] + sum <= bound:
            new_ways_to_reach_sum[sum + elements[i]] += count

        # Make sure we keep track of ways to not add this element
        if sum + trailing_sums[i] <= bound:
            # Every choice among the remaining n - i - 1 elements stays
            # within the bound, so count them all at once
            answer += count * 2**(n - i - 1)
        else:
            new_ways_to_reach_sum[sum] += count
    # And finish processing this element.
    ways_to_reach_sum = new_ways_to_reach_sum

# And just to be sure
for (sum, count) in ways_to_reach_sum:
    if sum <= bound:
        answer += count

# And now answer has our answer!
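For anyone who wants to try the idea, here is one possible runnable Python sketch of the DP. It is my reading of the pseudocode, not a reference implementation: the function name count_subsets_within_bound is invented for the example, an included element moves the count to the key sum + element, and a state whose sum plus all later elements stays within the bound contributes count * 2^(n - i - 1) subsets in one step.

```python
from collections import defaultdict


def count_subsets_within_bound(elements, bound):
    """Count subsets (including the empty one) whose sum is <= bound."""
    # Start from the sum of all negative elements and flip them positive.
    # "Taking" a flipped element then means leaving the negative one out.
    lowest_sum = sum(e for e in elements if e < 0)
    elems = sorted((abs(e) for e in elements), reverse=True)
    n = len(elems)

    # trailing[i] is the sum of elems[i + 1:].
    trailing = [0] * n
    running = 0
    for i in range(n - 1, -1, -1):
        trailing[i] = running
        running += elems[i]

    answer = 0
    ways = {lowest_sum: 1}  # partial sum -> number of ways to reach it
    for i, e in enumerate(elems):
        new_ways = defaultdict(int)
        for s, count in ways.items():
            # Include elems[i] if the sum stays within the bound.
            if s + e <= bound:
                new_ways[s + e] += count
            # Exclude elems[i]. If even taking every later element stays
            # within the bound, all 2^(n - i - 1) completions are valid.
            if s + trailing[i] <= bound:
                answer += count << (n - i - 1)
            else:
                new_ways[s] += count
        ways = new_ways

    # States that survived to the end are complete subsets.
    answer += sum(c for s, c in ways.items() if s <= bound)
    return answer


print(count_subsets_within_bound([1, 2, 4], 3))  # 4
```

On the example from the question the very first step already settles everything: with 4 excluded, the remaining 2 + 1 fits under the limit, so all four completions are counted at once and the table is empty afterwards.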