Suppose we have two lists, Purchase and Product:
Purchase = [
['James', 'Shoes', 1],
['James', 'T-shirt', 3],
['James', 'Pants', 2],
['James', 'Jacket', 1],
['James', 'Bag', 1],
['Neil', 'Shoes', 2],
['Neil', 'Bag', 1],
['Neil', 'Jacket', 1],
['Neil', 'Pants', 1],
['Chris', 'Hats', 1],
['Chris', 'T-shirt', 2],
['Chris', 'Shoes', 1],
['Chris', 'Pants', 2],
]
Product = [
['T-shirt', 110],
['Pants', 150],
['Shoes', 200],
['Hats', 150],
['Jacket', 250],
['Bag', 230],
]
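In Purchase, the first element of each entry is the buyer's name, the second is the product they bought, and the last is the quantity they bought.
In Product, each entry holds a product name and its price.
What I want to do is build a new list: for each buyer, compute the amount spent on each product (price times quantity), sort the products from the highest amount to the lowest, and keep only the top 3. If a product was not purchased, its price is multiplied by zero. To make this easier to understand, here is the calculation: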
For 'James':
T-shirt -> 110*3 = 330
Pants -> 150*2 = 300
Shoes -> 200*1 = 200
Hats -> 150*0 = 0
Jacket -> 250*1 = 250
Bag -> 230*1 = 230
Sorted from highest to lowest amount: ['T-shirt', 'Pants', 'Jacket', 'Bag', 'Shoes', 'Hats']
For 'Neil':
T-shirt -> 110*0 = 0
Pants -> 150*1 = 150
Shoes -> 200*2 = 400
Hats -> 150*0 = 0
Jacket -> 250*1 = 250
Bag -> 230*1 = 230
Sorted from highest to lowest amount: ['Shoes', 'Jacket', 'Bag', 'Pants', 'T-shirt', 'Hats']
For 'Chris':
T-shirt -> 110*2 = 220
Pants -> 150*2 = 300
Shoes -> 200*1 = 200
Hats -> 150*1 = 150
Jacket -> 250*0 = 0
Bag -> 230*0 = 0
Sorted from highest to lowest amount: ['Pants', 'T-shirt', 'Shoes', 'Hats', 'Jacket', 'Bag']
So in the end, this is what I expect:
Result = [
['James', 'T-shirt', 'Pants', 'Jacket'],
['Neil', 'Shoes','Jacket', 'Bag'],
['Chris', 'Pants', 'T-shirt', 'Shoes']]
Any help is greatly appreciated.
Answer 0: (score: 3)
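There are many ways to do this, but this is the first one that came to mind. I think a flattened approach is easier to understand and maintain than a long list comprehension (although the other current answers are clever and short).
First, you seem to want to preserve the order in which the names appear. I think a dictionary is a natural way to handle this kind of association, so to keep the ordering I would personally use an ordered dictionary. Also, Product is easier to use when you can look things up by key in a key-value mapping. So we do the following: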
from collections import OrderedDict
Product_kv = dict(Product)
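To illustrate why the mapping helps, here is a small sketch on a shortened Product list. The `price_of` helper is a hypothetical name used only for this illustration and is not part of the answer.

```python
# Shortened copy of the Product list from the question
Product = [
    ['T-shirt', 110],
    ['Pants', 150],
    ['Shoes', 200],
]

# dict() accepts an iterable of two-item sequences and maps
# the first item (product name) to the second (price)
Product_kv = dict(Product)

def price_of(item):
    # hypothetical helper: a direct key lookup instead of scanning the list
    return Product_kv[item]

shoes_price = price_of('Shoes')
print(shoes_price)  # 200
```

Without the dict, every price lookup would have to loop over Product to find the matching name.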
From there, we iterate over all purchases and maintain a mapping of how much was spent on each item.
d = OrderedDict()
for person, item, n in Purchase:
    if person not in d:
        d[person] = {}
    if item not in d[person]:
        d[person][item] = 0
    d[person][item] += n*Product_kv[item]
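This is not necessarily the right solution if counts or prices can be negative. As requested, we can account for the multiply-by-zero case without much fanfare: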
for person in d:
    for item in Product_kv:
        if item not in d[person]:
            d[person][item] = 0
All that is left is to extract the sorted data you want using the pre-computed total spending.
[[name]+sorted(d[name], key=lambda s:d[name][s], reverse=True)[:3] for name in d]
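Putting the steps above together, here is a minimal self-contained sketch. It assumes Python 3.7 or later, where a plain dict preserves insertion order, so it uses dict rather than OrderedDict. It also starts every buyer with a zero total for each product instead of filling the zeros in afterwards. The function name `top_products` and the parameter `n` are my own, chosen for illustration.

```python
Purchase = [
    ['James', 'Shoes', 1], ['James', 'T-shirt', 3], ['James', 'Pants', 2],
    ['James', 'Jacket', 1], ['James', 'Bag', 1],
    ['Neil', 'Shoes', 2], ['Neil', 'Bag', 1], ['Neil', 'Jacket', 1],
    ['Neil', 'Pants', 1],
    ['Chris', 'Hats', 1], ['Chris', 'T-shirt', 2], ['Chris', 'Shoes', 1],
    ['Chris', 'Pants', 2],
]
Product = [
    ['T-shirt', 110], ['Pants', 150], ['Shoes', 200],
    ['Hats', 150], ['Jacket', 250], ['Bag', 230],
]

def top_products(purchases, products, n=3):
    prices = dict(products)
    spent = {}  # buyer -> {product: total spent}, in order of first appearance
    for person, item, count in purchases:
        # the first time a buyer is seen, give every product a total of 0
        totals = spent.setdefault(person, {name: 0 for name in prices})
        totals[item] += count * prices[item]
    # sort each buyer's products by total spent, highest first, keep the top n
    return [
        [person] + sorted(totals, key=totals.get, reverse=True)[:n]
        for person, totals in spent.items()
    ]

Result = top_products(Purchase, Product)
print(Result)
# [['James', 'T-shirt', 'Pants', 'Jacket'],
#  ['Neil', 'Shoes', 'Jacket', 'Bag'],
#  ['Chris', 'Pants', 'T-shirt', 'Shoes']]
```

Because `sorted` is stable, products with equal totals keep the order they have in Product.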
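Answer 1: (score: 0)
A pure Python approach would involve dictionaries and explicit iteration. If you are willing to use a third-party library, you can use pandas: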
import pandas as pd
# construct dataframe and series mapping
purchases = pd.DataFrame(Purchase)
products = pd.DataFrame(Product).set_index(0)[1]
# calculate value and sort
df = purchases.assign(value=purchases[2]*purchases[1].map(products))\
.sort_values('value', ascending=False)
# create dictionary or list result
res1 = {k: v[1].iloc[:3].tolist() for k, v in df.groupby(0, sort=False)}
res2 = [[k] + v[1].iloc[:3].tolist() for k, v in df.groupby(0, sort=False)]
Result:
print(res1)
{'Neil': ['Shoes', 'Jacket', 'Bag'],
'James': ['T-shirt', 'Pants', 'Jacket'],
'Chris': ['Pants', 'T-shirt', 'Shoes']}
print(res2)
[['Neil', 'Shoes', 'Jacket', 'Bag'],
['James', 'T-shirt', 'Pants', 'Jacket'],
['Chris', 'Pants', 'T-shirt', 'Shoes']]
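Answer 2: (score: 0)
If your application has to grow and handle a lot of data, I would also suggest using pandas. Here is my version. It looks long, but I think it is not hard to follow, given the English-like function names.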
Purchase = [
['James', 'Shoes', 1],
['James', 'T-shirt', 3],
['James', 'Pants', 2],
['James', 'Jacket', 1],
['James', 'Bag', 1],
['Neil', 'Shoes', 2],
['Neil', 'Bag', 1],
['Neil', 'Jacket', 1],
['Neil', 'Pants', 1],
['Chris', 'Hats', 1],
['Chris', 'T-shirt', 2],
['Chris', 'Shoes', 1],
['Chris', 'Pants', 2],
]
Product = [
['T-shirt', 110],
['Pants', 150],
['Shoes', 200],
['Hats', 150],
['Jacket', 250],
['Bag', 230],
]
import pandas as pd
dfPurchase = pd.DataFrame(data=Purchase, columns=['buyer', 'product', 'count'])
print(dfPurchase)
print('\n')
dfProduct = pd.DataFrame(data=Product, columns=['product', 'price'])
print(dfProduct)
print('\n')
dfPurchased = dfPurchase.merge(dfProduct, on='product')
print(dfPurchased)
print('\n')
dfPurchased['priceXcount'] = dfPurchased['price'] * dfPurchased['count']
print(dfPurchased)
print('\n')
lstBuyer = dfPurchased['buyer'].unique()
lstResult = []
for buyer in lstBuyer:
    lstTmp = [buyer]
    dfOneBuyerPurchased = dfPurchased[dfPurchased['buyer'] == buyer]
    # or you can use:
    # dfOneBuyerPurchased = dfPurchased.query('buyer == "%s"' % buyer)
    lstTmp += dfOneBuyerPurchased.sort_values(
        by='priceXcount', ascending=False
    )['product'].tolist()[:3]
    lstResult.append(lstTmp)
print(lstResult)
The output is:
buyer product count
0 James Shoes 1
1 James T-shirt 3
2 James Pants 2
3 James Jacket 1
4 James Bag 1
5 Neil Shoes 2
6 Neil Bag 1
7 Neil Jacket 1
8 Neil Pants 1
9 Chris Hats 1
10 Chris T-shirt 2
11 Chris Shoes 1
12 Chris Pants 2
product price
0 T-shirt 110
1 Pants 150
2 Shoes 200
3 Hats 150
4 Jacket 250
5 Bag 230
buyer product count price
0 James Shoes 1 200
1 Neil Shoes 2 200
2 Chris Shoes 1 200
3 James T-shirt 3 110
4 Chris T-shirt 2 110
5 James Pants 2 150
6 Neil Pants 1 150
7 Chris Pants 2 150
8 James Jacket 1 250
9 Neil Jacket 1 250
10 James Bag 1 230
11 Neil Bag 1 230
12 Chris Hats 1 150
buyer product count price priceXcount
0 James Shoes 1 200 200
1 Neil Shoes 2 200 400
2 Chris Shoes 1 200 200
3 James T-shirt 3 110 330
4 Chris T-shirt 2 110 220
5 James Pants 2 150 300
6 Neil Pants 1 150 150
7 Chris Pants 2 150 300
8 James Jacket 1 250 250
9 Neil Jacket 1 250 250
10 James Bag 1 230 230
11 Neil Bag 1 230 230
12 Chris Hats 1 150 150
[['James', 'T-shirt', 'Pants', 'Jacket'], ['Neil', 'Shoes', 'Jacket', 'Bag'], ['Chris', 'Pants', 'T-shirt', 'Shoes']]
Answer 3: (score: 0)
Using pandas and a dictionary.
import pandas as pd
purch_df = pd.DataFrame(Purchase, columns = ['name','product','count'])
d = dict(Product)
Create a new 'price' column, then perform the calculation and save it to a new 'total' column.
purch_df['price'] = [d[product] for product in purch_df['product']]
purch_df['total'] = purch_df['count'] * purch_df['price']
Create a dictionary to store the grouped data for future lookup.
d2 = {}
# sort=False keeps buyers in order of first appearance rather than alphabetical order
for group, frame in purch_df.groupby('name', sort=False):
    d2[group] = list(frame.sort_values('total', ascending = False).iloc[:3,1])
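Extract the desired lists from the dictionary d2, with each buyer's name first:
# prepend the buyer's name so each sub-list matches the expected Result layout
Result = [[name] + lst for name, lst in d2.items()]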
Answer 4: (score: -1)
You can create dictionaries for the Product list and for each user's items, for easy lookup:
from itertools import groupby
p = dict(Product)
data = [[a, list(b)] for a, b in groupby(Purchase, key=lambda x:x[0])]
new_results = [(lambda x, y:[x, [[c, y.get(c, 0)*b] for c, b in p.items()]])(a, dict([h[-2:] for h in b])) for a, b in data]
new_sorted = [[a, *[i[0] for i in sorted(b, key=lambda x:x[-1], reverse=True)][:3]] for a, b in new_results]
Output:
[['James', 'T-shirt', 'Pants', 'Jacket'],
['Neil', 'Shoes', 'Jacket', 'Bag'],
['Chris', 'Pants', 'T-shirt', 'Shoes']]
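The nested comprehensions above are dense, so here is a sketch of what I believe is the same logic written as plain loops, run on a shortened data set. The function name `top_three` and the variable names are hypothetical. Like the original, it relies on each buyer's purchases being adjacent in the input.

```python
from itertools import groupby

# Shortened sample data in the same shape as the question's lists
purchases = [
    ['James', 'Shoes', 1],
    ['James', 'T-shirt', 3],
    ['Neil', 'Bag', 1],
]
products = [['T-shirt', 110], ['Shoes', 200], ['Bag', 230]]

def top_three(purchases, products):
    prices = dict(products)
    results = []
    # groupby yields one group per run of consecutive rows with the same buyer
    for buyer, rows in groupby(purchases, key=lambda row: row[0]):
        # product -> quantity for this buyer
        quantities = {item: qty for _, item, qty in rows}
        # total per product; products the buyer never bought count as 0
        totals = [(item, quantities.get(item, 0) * price)
                  for item, price in prices.items()]
        totals.sort(key=lambda pair: pair[1], reverse=True)
        results.append([buyer] + [item for item, _ in totals[:3]])
    return results

new_sorted = top_three(purchases, products)
print(new_sorted)
# [['James', 'T-shirt', 'Shoes', 'Bag'], ['Neil', 'Bag', 'T-shirt', 'Shoes']]
```

Note that a buyer with fewer than three purchases still gets three products, because unpurchased products are included with a total of 0.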
Answer 5: (score: -2)
You can use the following list comprehension with itertools.groupby:
from itertools import groupby
from operator import itemgetter
Result = [[k, *map(itemgetter(1), sorted((-p[i] * c, i) for _, i, c in g)[:3])] for p in (dict(Product),) for k, g in groupby(Purchase, key=itemgetter(0))]
With the sample input, Result becomes:
[['James', 'T-shirt', 'Pants', 'Jacket'], ['Neil', 'Shoes', 'Jacket', 'Bag'], ['Chris', 'Pants', 'T-shirt', 'Shoes']]
The list comprehension above is just a more concise version of the equivalent code below:
# convert the product pricing into a product-to-price dict for efficient lookup
pricing = dict(Product)
Result = []
# extract the groupings in Purchase based on the first item, the customer's name
for name, purchases in groupby(Purchase, key=itemgetter(0)):
    costs = []
    # for each of a customer's purchases, we calculate the cost by multiplying
    # the product's pricing by the number purchased, and put the calculated cost
    # and product name in a tuple so that it can be sorted by the cost first and
    # then the product name second; the cost is negated so that it sorts
    # in descending order
    for _, product, count in purchases:
        costs.append((-pricing[product] * count, product))
    costs.sort()
    # initialize the sub-list in the output, which starts with the customer's name
    top_products = [name]
    # followed by the top 3 products from the second item in the sorted costs list
    for _, product in costs[:3]:
        top_products.append(product)
    # we've got a finished sub-list to output for the current customer
    Result.append(top_products)
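One caveat: itertools.groupby only groups consecutive items, so the code above depends on each customer's purchases being adjacent in Purchase, as they are in the sample input. If the rows could be interleaved, one option is to collect the costs in a dict first. The following is a sketch on made-up interleaved data; the names `costs_by_name` and `run_keys` are hypothetical.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

# Made-up data where each customer's purchases are NOT adjacent
purchases = [
    ['James', 'Shoes', 1],
    ['Neil', 'Shoes', 2],
    ['James', 'T-shirt', 3],
    ['Neil', 'Bag', 1],
]
pricing = {'T-shirt': 110, 'Shoes': 200, 'Bag': 230}

# groupby starts a new group whenever the key changes, so each name shows up twice
run_keys = [name for name, _ in groupby(purchases, key=itemgetter(0))]
print(run_keys)  # ['James', 'Neil', 'James', 'Neil']

# collect (negated cost, product) tuples per customer, in order of first appearance
costs_by_name = defaultdict(list)
for name, product, count in purchases:
    costs_by_name[name].append((-pricing[product] * count, product))

Result = [
    [name] + [product for _, product in sorted(costs)[:3]]
    for name, costs in costs_by_name.items()
]
print(Result)  # [['James', 'T-shirt', 'Shoes'], ['Neil', 'Shoes', 'Bag']]
```

Sorting Purchase by customer name before calling groupby would also work, at the cost of losing the original order of the customers.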