我有这些代码,我需要创建一个与所附图片相似的数据框-谢谢
import pandas as pd
Product = [(100, 'Item1, Item2'),
(101, 'Item1, Item3'),
(102, 'Item4')]
labels = ['product', 'info']
ProductA = pd.DataFrame.from_records(Product, columns=labels)
Cust = [('A', 200),
('A', 202),
('B', 202),
('C', 200),
('C', 204),
('B', 202),
('A', 200),
('C', 204)]
labels = ['customer', 'product']
Cust1 = pd.DataFrame.from_records(Cust, columns=labels)
答案 0 :(得分:1)
merge
和get_dummies
dfA.merge(dfB).set_index('customer').tags.str.get_dummies(', ').sum(level=0,axis=0)
Out[549]:
chocolate filled glazed sprinkles
customer
A 3 1 0 2
C 1 0 2 1
B 2 2 0 0
答案 1 :(得分:0)
可以使用merge
,split
,melt
和concat
进行IIUC:
dfB = dfB.merge(dfA, on='product')
dfB = pd.concat([dfB.iloc[:,:-1], dfB.tags.str.split(',', expand=True)], axis=1)
dfB = dfB.melt(id_vars=['customer', 'product']).drop(columns = ['product', 'variable'])
dfB = pd.concat([dfB.customer, pd.get_dummies(dfB['value'])], axis=1)
dfB
输出:
customer filled sprinkles chocolate glazed
0 A 0 0 1 0
1 C 0 0 1 0
2 A 0 0 1 0
3 A 0 0 1 0
4 B 0 0 1 0
5 B 0 0 1 0
6 C 0 0 0 1
7 C 0 0 0 1
8 A 0 1 0 0
9 C 0 1 0 0
10 A 0 1 0 0
11 A 1 0 0 0
12 B 1 0 0 0
13 B 1 0 0 0