I currently have data like this:
Item Properties
A C001
A C002
A C003
B C001
B C003
C C001
I want to group these items like this:
A C001, C002, C003
B C001, C003
C C001
Then I want to match the items by property similarity, i.e. count the properties each pair of items has in common:
A B 2
A C 1
B C 1
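How can I transform this DataFrame with pandas? I did try the groupby method, but it shows the number of properties rather than an array of the property names.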
Answer 0 (score: 1)
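import pandas as pd
# Self-join on the property column ('Properties' in the question's header) so each row pairs two items sharing a property
selfjoin = pd.merge(df, df, on='Properties')
# Keep each unordered pair once and drop pairs of an item with itself
selfjoin = selfjoin[selfjoin['Item_x'] < selfjoin['Item_y']]
similarity = selfjoin.groupby(['Item_x', 'Item_y']).size().reset_index(name='Count')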
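
For reference, here is a self-contained sketch of both steps, run on the sample data from the question. It assumes the columns are named Item and Properties, as in the header shown there; the names grouped, pairs, and Shared are my own choices for illustration. Aggregating with ", ".join instead of a count is one way to get the property names per item rather than the number of properties.

```python
import pandas as pd

# Sample data from the question; column names are assumed from its header.
df = pd.DataFrame({
    "Item": ["A", "A", "A", "B", "B", "C"],
    "Properties": ["C001", "C002", "C003", "C001", "C003", "C001"],
})

# Step 1: one row per item, with its property names joined into a string.
grouped = df.groupby("Item")["Properties"].agg(", ".join).reset_index()

# Step 2: self-join on the property so each row pairs two items sharing it.
pairs = pd.merge(df, df, on="Properties")

# Keep each unordered pair once and drop pairs of an item with itself.
pairs = pairs[pairs["Item_x"] < pairs["Item_y"]]

# Count the shared properties per pair ("Shared" is an illustrative name).
similarity = (
    pairs.groupby(["Item_x", "Item_y"]).size().reset_index(name="Shared")
)

print(grouped)
print(similarity)
```

If you would rather have real lists than joined strings in the first step, .agg(list) should work in place of .agg(", ".join).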