I have this CSV file that I am working with in Python:
SHOP_ID, COST, ITEM
1, 2.00, A
1, 1.25, B
1, 2.00, C
1, 1.00, D
1, 1.00, "A, B"
1, 1.50, "A, C"
1, 2.50, "A, D"
2, 3.00, A
2, 1.00, B
2, 1.20, C
2, 1.25, D
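I have already read this file into a DataFrame in Python.
Now suppose I enter A, B, C, D as input and want to find the cheapest combination of ITEMs in my DataFrame that covers this input. I should get: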
SHOP_ID=1
A,B(1.00)+A,C(1.50)+D(1.00) = 3.50
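The user ends up with A, A, B, C, D, i.e. an extra A, but as long as the total cost is the lowest, we don't care whether the user gets extra items as freebies.
I don't know how to approach this problem. Any help would be greatly appreciated.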
Answer 0 (score: 0)
Here is one approach:
def build_shops(shop_text):
    # Map shop_id -> item -> list of (cost, combo) offers that include the item.
    shops = {}
    for item_info in shop_text:
        shop_id, cost, items = item_info.replace(' ', '').split(',')
        cost = float(cost)
        items = items.split('+')
        if shop_id not in shops:
            shops[shop_id] = {}
        shop_dict = shops[shop_id]
        for item in items:
            if item not in shop_dict:
                shop_dict[item] = []
            shop_dict[item].append((cost, items))
    return shops

def solve_one_shop(shop, items):
    # Returns [cheapest_price, chosen_offers] for the required items in one shop.
    if len(items) == 0:
        return [0.0, []]
    for item in items:
        if item not in shop:
            return [float('inf'), []]
    all_possible = []
    first_item = items[0]
    # Try every offer that covers the first item, then solve for what is left.
    for (price, combo) in shop[first_item]:
        sub_set = [x for x in items if x not in combo]
        price_sub_set, solution = solve_one_shop(shop, sub_set)
        solution.append([price, combo])
        all_possible.append([price + price_sub_set, solution])
    cheapest = min(all_possible, key=(lambda x: x[0]))
    return cheapest

def solver(input_data, required_items):
    shops = build_shops(input_data)
    result_all_shops = []
    # dict.items() replaces the Python 2-only iteritems().
    for shop_id, shop_info in shops.items():
        (price, solution) = solve_one_shop(shop_info, required_items)
        if price != float('inf'):
            result_all_shops.append([shop_id, price, solution])
    if len(result_all_shops) == 0:
        print('No shop has all required items')
        return
    shop_id, total_price, solution = min(result_all_shops, key=(lambda x: x[1]))
    print('SHOP_ID=%s' % shop_id)
    sln_str = [','.join(items) + '(%0.2f)' % price for (price, items) in solution]
    sln_str = '+'.join(sln_str)
    print(sln_str + ' = %0.2f' % total_price)
Test:
input_data = [
    '1, 2.00, A',
    '1, 1.25, B',
    '1, 2.00, C',
    '1, 1.00, D',
    '1, 1.00, A+B',
    '1, 1.50, A+C',
    '1, 2.50, A+D',
    '2, 3.00, A',
    '2, 1.00, B',
    '2, 1.20, C',
    '2, 1.25, D',
]
required_items = ['A', 'B', 'C', 'D']
solver(input_data, required_items)
Output:
SHOP_ID=1
D(1.00)+A,C(1.50)+A,B(1.00) = 3.50
Note that I use:
1, 1.00, A+B
instead of
1, 1.00, "A, B"
as the input format, just to make parsing easier. You can modify the build_shops function to match your own format.
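If you would rather keep the quoted "A, B" format from the question, one option is to convert it before calling build_shops. The following is only a sketch using the standard csv module; the helper name rows_to_shop_text is hypothetical and not part of the answer's code, and it assumes the file has the three columns and header row shown in the question.

```python
import csv
import io

def rows_to_shop_text(csv_text):
    """Convert CSV text with quoted combos ("A, B") into 'shop, cost, A+B' lines."""
    # skipinitialspace=True lets the reader recognise a quote that follows ", ".
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    next(reader)  # skip the SHOP_ID, COST, ITEM header row
    lines = []
    for row in reader:
        if not row:  # ignore blank lines
            continue
        shop_id, cost, items = (field.strip() for field in row)
        combo = '+'.join(part.strip() for part in items.split(','))
        lines.append('%s, %s, %s' % (shop_id, cost, combo))
    return lines

sample = '''SHOP_ID, COST, ITEM
1, 2.00, A
1, 1.00, "A, B"
2, 1.25, D
'''
converted = rows_to_shop_text(sample)
print(converted)
```

The resulting list has the same shape as input_data in the test above, so it can be passed to solver unchanged.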
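This solution basically works as follows: pick item 'A', then compute the solution for the set ('B','C','D'). To compute the solution for ('B','C','D'), it picks 'B' and computes the solution for the set ('C','D'), and so on. Any items already covered by the chosen combo are dropped from the remaining set. This is a form of divide and conquer (http://en.wikipedia.org/wiki/Divide_and_conquer_algorithms). The key code is: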
sub_set = [x for x in items if x not in combo]
price_sub_set,solution = solve_one_shop(shop, sub_set)
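To see what the first of those two lines does, here is a small illustration of how the remaining set shrinks for each shop 1 offer that contains 'A'. The helper remaining_after is a hypothetical name introduced only for this sketch; the offers are copied from the shop 1 data.

```python
def remaining_after(items, combo):
    # Same expression as sub_set: keep only the items the combo does not cover.
    return [x for x in items if x not in combo]

required = ['A', 'B', 'C', 'D']
offers_with_a = [
    (2.0, ['A']),
    (1.0, ['A', 'B']),
    (1.5, ['A', 'C']),
    (2.5, ['A', 'D']),
]
for price, combo in offers_with_a:
    print(combo, '->', remaining_after(required, combo))
```

Each remaining list is then passed back into solve_one_shop, which is why a combo that covers more items leads to a smaller subproblem.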
To help with understanding the code, here is the output of build_shops:
{'1': {'A': [(2.0, ['A']),
             (1.0, ['A', 'B']),
             (1.5, ['A', 'C']),
             (2.5, ['A', 'D'])],
       'B': [(1.25, ['B']), (1.0, ['A', 'B'])],
       'C': [(2.0, ['C']), (1.5, ['A', 'C'])],
       'D': [(1.0, ['D']), (2.5, ['A', 'D'])]},
 '2': {'A': [(3.0, ['A'])],
       'B': [(1.0, ['B'])],
       'C': [(1.2, ['C'])],
       'D': [(1.25, ['D'])]}}
This solution iterates over all possible combinations, i.e. it is brute force. So if the dataset is very large, it will not be efficient.
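One possible way to cut down repeated work, which is not what the code above does, is to cache the result for each remaining set of items so the same subproblem is solved only once. The sketch below assumes a different input shape from build_shops: a flat list of (price, items) offers for a single shop. The name cheapest_cover is hypothetical, and the search is still exponential in the worst case.

```python
from functools import lru_cache

def cheapest_cover(offers, required):
    """Return (total_price, chosen_offers) covering all required items in one shop."""
    offers = [(price, frozenset(items)) for price, items in offers]

    @lru_cache(maxsize=None)
    def best(remaining):
        if not remaining:
            return 0.0, ()
        target = min(remaining)  # any uncovered item must be covered by some offer
        result = (float('inf'), ())
        for price, items in offers:
            if target in items:
                sub_price, sub_choice = best(remaining - items)
                total = price + sub_price
                if total < result[0]:
                    result = (total, sub_choice + ((price, tuple(sorted(items))),))
        return result

    return best(frozenset(required))

# Shop 1 offers from the test data, as a flat list (an assumed format).
shop_1 = [
    (2.00, ['A']), (1.25, ['B']), (2.00, ['C']), (1.00, ['D']),
    (1.00, ['A', 'B']), (1.50, ['A', 'C']), (2.50, ['A', 'D']),
]
price, chosen = cheapest_cover(shop_1, ['A', 'B', 'C', 'D'])
print(price, chosen)
```

Using a frozenset as the cache key means that two different purchase orders leading to the same leftover items share one cached answer. If a required item is missing from the shop, the returned price is float('inf').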
Test case 2:
input_data = [
    '1, 2.00, burger',
    '1, 1.25, tofu',
    '1, 2.00, tuna',
    '1, 1.00, salad',
    '1, 1.00, burger+tofu',
    '1, 1.50, burger+tuna',
    '1, 2.50, burger+salad',
    '2, 3.00, burger',
    '2, 1.00, tofu',
    '2, 1.20, tuna',
    '2, 1.25, salad',
]
required_items = ['burger', 'tofu', 'tuna', 'salad']
solver(input_data, required_items)
Output 2:
SHOP_ID=1
salad(1.00)+burger,tuna(1.50)+burger,tofu(1.00) = 3.50