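I have a quick (hopefully) accounting question. I just started a new job, and the books are a bit of a mess. The books record lump-sum deposits, while the bank account lists every individual deposit. I need to work out which individual deposits belong to each lump sum in the books. So, I have these four lump sums: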
[6884.41,14382.14,2988.11,8501.60]
Then I have this longer list of individual deposits (sorted):
[98.56,98.56,98.56,129.44,160.0,242.19,286.87,290.0,351.01,665.0,675.0,675.0,677.45,677.45,695.0,695.0,695.0,695.0,715.0,720.0,725.0,730.0,745.0, 745.0,750.0,750.0,750.0,750.0,758.93,758.93,763.85,765.0,780.0,781.34,781.7,813.79,824.97,827.05,856.28,874.08,874.44,1498.11,1580.0,1600.0,1600.0]
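In Python, how can I determine which subsets of the longer list add up to each of the lump-sum values? (Note: these numbers have an extra problem, namely that the total of the lump sums is $732.70 more than the total of the individual deposits. I hope that doesn't make the problem completely unsolvable.)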
Answer 0 (score: 0)
Here is a pretty good solution:
import datetime as dt
from itertools import groupby

def _unique_subsets_which_sum_to(target, value_counts, max_sums, index):
    value, count = value_counts[index]
    if index:
        # more values to be considered; solve recursively

        # maximum total that the remaining (smaller) values can supply;
        # max_sums[index] covers value groups 0..index-1, so it has to be
        # read before stepping down (reading it after the decrement would
        # skip group index-1 and prune valid subsets, e.g. values [1, 2]
        # with target 3)
        rem = max_sums[index]
        index -= 1

        # find the minimum amount that this value must provide,
        # and the minimum occurrences that will satisfy that value
        if target <= rem:
            min_k = 0
        else:
            min_k = (target - rem + value - 1) // value  # rounded up to next int

        # find the maximum occurrences of this value
        # which result in <= target
        max_k = min(count, target // value)

        # iterate across min..max occurrences
        for k in range(min_k, max_k + 1):
            new_target = target - k * value
            if new_target:
                # recurse
                for solution in _unique_subsets_which_sum_to(new_target, value_counts, max_sums, index):
                    yield ((solution + [(value, k)]) if k else solution)
            else:
                # perfect solution, no need to recurse further
                yield [(value, k)]
    else:
        # this must finish the solution
        if target % value == 0:
            yield [(value, target // value)]

def find_subsets_which_sum_to(target, values):
    """
    Find all unique subsets of values which sum to target

    target    integer >= 0, total to be summed to
    values    sequence of integer > 0, possible components of sum
    """
    # this function is basically a shell which prepares
    # the input values for the recursive solution

    # turn sequence into sorted list
    values = sorted(values)
    value_sum = sum(values)

    if value_sum >= target:
        # count how many times each value appears
        value_counts = [(value, len(list(it))) for value, it in groupby(values)]

        # running totals: max_sums[i] is the total of the first i value groups
        total = 0
        max_sums = [0]
        for val, num in value_counts:
            total += val * num
            max_sums.append(total)

        start = dt.datetime.now()
        for sol in _unique_subsets_which_sum_to(target, value_counts, max_sums, len(value_counts) - 1):
            yield sol
        end = dt.datetime.now()
        seconds = (end - start).total_seconds()
        print("  -> took {:0.1f} seconds.".format(seconds))

# I multiplied each value by 100 so that we can operate on integers
# instead of floating-point; this will eliminate any rounding errors.
values = [
    9856, 9856, 9856, 12944, 16000, 24219, 28687, 29000, 35101, 66500,
    67500, 67500, 67745, 67745, 69500, 69500, 69500, 69500, 71500, 72000,
    72500, 73000, 74500, 74500, 75000, 75000, 75000, 75000, 75893, 75893,
    76385, 76500, 78000, 78134, 78170, 81379, 82497, 82705, 85628, 87408,
    87444, 149811, 158000, 160000, 160000
]
sum_to = [
    298811,
    688441,
    850160  #,
    # 1438214
]

def main():
    subset_sums_to = []
    for target in sum_to:
        print("\nSolutions which sum to {}".format(target))
        res = list(find_subsets_which_sum_to(target, values))
        print("  {} solutions found".format(len(res)))
        subset_sums_to.append(res)
    return subset_sums_to

if __name__ == "__main__":
    subsetsA, subsetsB, subsetsC = main()
On my machine, the run produced:
Solutions which sum to 298811
-> took 0.1 seconds.
2 solutions found
Solutions which sum to 688441
-> took 89.8 seconds.
1727 solutions found
Solutions which sum to 850160
-> took 454.0 seconds.
6578 solutions found
# Solutions which sum to 1438214
# -> took 7225.2 seconds.
# 87215 solutions found
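The next step is to cross-compare the solution subsets and see which ones can coexist. I think the fastest approach is to store the subsets for the three smallest lump sums, iterate through them (for compatible combinations), find the remaining values, and plug those into the solver for the last lump sum.
Continuing from where I left off (with a few changes to the code above so that it returns the lists of sub-lists for the first three values).
I wanted to be able to easily get the remaining value counts each time: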
class NoNegativesDict(dict):
    def __sub__(self, other):
        # every key being subtracted must already be present
        if set(other) - set(self):
            raise ValueError
        else:
            res = NoNegativesDict()
            for key, sv in self.items():
                ov = other.get(key, 0)
                if sv < ov:
                    # would leave a negative count
                    raise ValueError
                # elif sv == ov:
                #     pass
                elif sv > ov:
                    res[key] = sv - ov
            return res
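For comparison, the same "subtract, but refuse to go negative" behaviour can be sketched with the standard library's `collections.Counter`. This is only an illustrative alternative, not part of the solution above; the helper name `subtract_counts` and the toy `pool` are assumptions of mine.

```python
from collections import Counter

def subtract_counts(available, used):
    """Return available minus used, or raise ValueError if used does not fit.

    Both arguments map value -> occurrences; a list of (value, count)
    pairs, as produced by the solver, is accepted as well.
    """
    # dict() is needed so that (value, count) pairs are read as a mapping
    # rather than being counted as elements themselves
    available = Counter(dict(available))
    used = Counter(dict(used))
    for value, needed in used.items():
        if available[value] < needed:
            raise ValueError("not enough occurrences of {}".format(value))
    # Counter subtraction keeps only keys whose remaining count is positive
    return available - used

# hypothetical toy pool: three deposits of 9856 cents, one of 12944 cents
pool = Counter({9856: 3, 12944: 1})
remaining = subtract_counts(pool, [(9856, 2)])
print(sorted(remaining.items()))  # [(9856, 1), (12944, 1)]
```

The explicit check matters because plain `Counter` subtraction silently drops negative results instead of raising, which would hide an incompatible combination.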
I then applied it as follows:
value_counts = [(value, len(list(it))) for value, it in groupby(values)]
vc = NoNegativesDict(value_counts)
nna = [NoNegativesDict(a) for a in subsetsA]
nnb = [NoNegativesDict(b) for b in subsetsB]
nnc = [NoNegativesDict(c) for c in subsetsC]

# this is kind of ugly; with some more effort
# I could probably make it a recursive call also
b_tries = 0
c_tries = 0
sol_count = 0
start = dt.datetime.now()

for a in nna:
    try:
        res_a = vc - a
        sa = str(a)

        for b in nnb:
            try:
                res_b = res_a - b
                b_tries += 1
                sb = str(b)

                for c in nnc:
                    try:
                        res_c = res_b - c
                        c_tries += 1

                        # unpack remaining values
                        res_values = [val for val, num in res_c.items() for _ in range(num)]
                        for sol in find_subsets_which_sum_to(1438214, res_values):
                            sol_count += 1
                            print("\n================")
                            print("a =", sa)
                            print("b =", sb)
                            print("c =", str(c))
                            print("d =", str(sol))
                    except ValueError:
                        # c does not fit into what a and b left over
                        pass
            except ValueError:
                # b does not fit into what a left over
                pass
    except ValueError:
        pass

print("{} solutions found in {} b-tries and {} c-tries".format(sol_count, b_tries, c_tries))
end = dt.datetime.now()
seconds = (end - start).total_seconds()
print("  -> took {:0.1f} seconds.".format(seconds))
And the final output:
0 solutions found in 1678 b-tries and 93098 c-tries
-> took 73.0 seconds.
So the final answer is that there is no solution for your given data.
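A quick arithmetic check agrees with this. As the question itself notes, the lump sums add up to more than all the individual deposits combined, so no assignment of disjoint subsets could ever cover all four totals. A minimal sketch, using the figures from the question converted to integer cents (the names `to_cents`, `lump_sums`, `deposits` and `shortfall` are mine, for illustration only):

```python
def to_cents(amount):
    """Convert a dollar amount to integer cents, rounding away float noise."""
    return round(amount * 100)

# figures copied from the question
lump_sums = [6884.41, 14382.14, 2988.11, 8501.60]
deposits = [
    98.56, 98.56, 98.56, 129.44, 160.0, 242.19, 286.87, 290.0, 351.01, 665.0,
    675.0, 675.0, 677.45, 677.45, 695.0, 695.0, 695.0, 695.0, 715.0, 720.0,
    725.0, 730.0, 745.0, 745.0, 750.0, 750.0, 750.0, 750.0, 758.93, 758.93,
    763.85, 765.0, 780.0, 781.34, 781.7, 813.79, 824.97, 827.05, 856.28, 874.08,
    874.44, 1498.11, 1580.0, 1600.0, 1600.0,
]

lump_total = sum(to_cents(x) for x in lump_sums)
deposit_total = sum(to_cents(x) for x in deposits)
shortfall = lump_total - deposit_total

# a positive shortfall means the deposits cannot cover every lump sum
print(lump_total, deposit_total, shortfall)  # 3275626 3202356 73270
```

The shortfall of 73270 cents is the $732.70 mentioned in the question, so at least one lump sum must include money that is not in the deposit list.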
Hope that helps ;-)