I'm having trouble finding the right way to count how often each combination occurs.
Here is my code:
import pandas as pd
import itertools
list = [1,20,1,50]
combinations = []
# Collect every 2-element combination of the list
for i in itertools.combinations(list, 2):
    combinations.append(i)
data = pd.DataFrame({'products': combinations})
# Count how many times each tuple appears in the column
data['frequency'] = data.groupby('products')['products'].transform('count')
print(data)
The output is:
   products  frequency
0   (1, 20)          1
1    (1, 1)          1
2   (1, 50)          2
3   (20, 1)          1
4  (20, 50)          1
5   (1, 50)          2
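The problem is that (1, 20) and (20, 1) each get a frequency of 1, but they are the same combination, so the frequency should be 2. Is there a correct way to do this?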
Answer 0 (score: 0)
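You can use groupby with a modified key: apply a lambda to the column that sorts each tuple, so that (1, 20) and (20, 1) fall into the same group.
import pandas as pd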
import itertools
list = [1,20,1,50]
combinations = []
for i in itertools.combinations(list, 2):
    combinations.append(i)
data = pd.DataFrame({'products': combinations})
# Group by the sorted tuple so that element order is ignored
data['frequency'] = data.groupby(data['products'].apply(
    lambda i: tuple(sorted(i))))['products'].transform('count')
print(data)
Output:
   products  frequency
0   (1, 20)          2
1    (1, 1)          1
2   (1, 50)          2
3   (20, 1)          2
4  (20, 50)          1
5   (1, 50)          2
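
If you only need the counts and not a DataFrame, the same idea of sorting each pair before counting can probably be done with collections.Counter instead. This is just a rough sketch, not a replacement for the groupby approach; the names values, pairs, counts and frequencies are made up for illustration.

```python
from collections import Counter
from itertools import combinations

values = [1, 20, 1, 50]  # same input as in the question

# All 2-element combinations, in the order itertools produces them
pairs = list(combinations(values, 2))

# Sort each pair so that (1, 20) and (20, 1) share one key
counts = Counter(tuple(sorted(p)) for p in pairs)

# Look up each original pair's frequency through its sorted key
frequencies = [counts[tuple(sorted(p))] for p in pairs]
print(frequencies)  # [2, 1, 2, 2, 1, 2]
```

The resulting list lines up with the frequency column shown above, and it could be assigned to a DataFrame column if you want it back in pandas.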