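I'm trying to get a list containing every item in an order. My data has one order per row, the possible items as columns, and the count of each item as the values.
I've worked out a way to do this for unique items, but I'd really like items that were ordered more than once to appear that many times. Here's an example: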
import pandas as pd
# Example dataframe
data = {'Egg':[0, 2, 1], 'Toast':[2, 2, 1]}
breakfast = pd.DataFrame(data)
# Cycle through columns and replace numbers with food words
value_cols = list(breakfast)
for food in value_cols:
    breakfast.loc[breakfast[food] != 0, food] = food
# Create a list of foods
list_of_foods = breakfast.values.tolist()
# Remove empty values
list_of_foods = [[x for x in y if x != 0] for y in list_of_foods]
This gives a list of lists like this:
[['Toast'], ['Egg', 'Toast'], ['Egg', 'Toast']]
However, what I really want is a list of lists like this:
[['Toast', 'Toast'], ['Egg', 'Egg', 'Toast', 'Toast'], ['Egg', 'Toast']]
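I really don't know how to achieve this. I thought about duplicating the rows that contain repeated items, but I think that would also duplicate the non-repeated items in the same order. Does anyone have any ideas?
Answer 0: (score: 2)
The idea is to loop over each row, zip its values with the column names, and repeat each name by its count inside a flattened nested list comprehension: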
list_of_foods = [[c for a, b in zip(v, breakfast.columns) for c in [b] * a]
                 for v in breakfast.values]
print(list_of_foods)
[['Toast', 'Toast'], ['Egg', 'Egg', 'Toast', 'Toast'], ['Egg', 'Toast']]
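The comprehension above can be hard to read at first glance. As a rough sketch, here is the same logic unrolled into explicit loops over plain Python lists; `columns` and `rows` are stand-ins I've assumed for `breakfast.columns` and `breakfast.values`, not names from the answer.

```python
# Assumed stand-ins for breakfast.columns and breakfast.values
columns = ['Egg', 'Toast']
rows = [[0, 2], [2, 2], [1, 1]]

list_of_foods = []
for row in rows:
    order = []
    # Pair each count with its column name
    for count, name in zip(row, columns):
        # [name] * count is an empty list when count is 0
        order.extend([name] * count)
    list_of_foods.append(order)

print(list_of_foods)
```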
Answer 1: (score: 1)
It's certainly not pretty, but I think it works:
data = {'Egg': [0, 2, 1], 'Toast': [2, 2, 1]}  # keys are dishes, values are counts per order
out = []
for i in range(len(list(data.values())[0])):  # iterate over the orders
    out.append([])  # new list for each order
    for key in data.keys():  # iterate over dishes
        # repeat the dish name as many times as it was ordered
        out[i].extend([key for _ in range(data[key][i])])
Wrap it in a function and you're done.
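As a minimal sketch of that function, assuming the input is a dict mapping each dish to a list of per-order counts (the name `expand_orders` is hypothetical, not from the answer):

```python
def expand_orders(data):
    """Return one list per order, repeating each dish name by its count."""
    # Number of orders, taken from the length of the first count list
    n_orders = len(next(iter(data.values())))
    return [
        [dish for dish, counts in data.items() for _ in range(counts[i])]
        for i in range(n_orders)
    ]

data = {'Egg': [0, 2, 1], 'Toast': [2, 2, 1]}
print(expand_orders(data))
```

This assumes every dish has the same number of counts; it does not check for ragged input.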
Answer 2: (score: 1)
Code
breakfast.apply(lambda x: list(x.index.repeat(x)), axis=1).tolist()
Output
[['Toast', 'Toast'], ['Egg', 'Egg', 'Toast', 'Toast'], ['Egg', 'Toast']]
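To see why this works, it may help to look at `Index.repeat` on its own: given one count per label, it repeats each label that many times. The sketch below applies it to a single order by hand; the variable names `labels` and `repeated` are only illustrative.

```python
import pandas as pd

# Column labels, as they would appear in a row's index
labels = pd.Index(['Egg', 'Toast'])

# Repeat each label by its count: Egg twice, Toast once
repeated = labels.repeat([2, 1]).tolist()
print(repeated)
```

With `axis=1`, `apply` passes each row as a Series whose index holds the column names, so the lambda performs this same step once per order.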