Here is my input data:
import pandas as pd
inputfile_pd = pd.DataFrame(
    [['2018-02-02', 10, 2], ['2018-02-02', 1, 3], ['2018-02-02', 3, 4],
     ['2018-02-03', 3, 2], ['2018-02-03', 2, 3], ['2018-02-03', 4, 4],
     ['2018-02-04', 4, 3], ['2018-02-04', 1, 4]],
    columns=['DateOfSale', 'Sales', 'Client_id'])
So it looks like this:
   DateOfSale  Sales  Client_id
0  2018-02-02     10          2
1  2018-02-02      1          3
2  2018-02-02      3          4
3  2018-02-03      3          2
4  2018-02-03      2          3
5  2018-02-03      4          4
6  2018-02-04      4          3
7  2018-02-04      1          4
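What is the simplest way to compute the correlation matrix of sales between the clients (identified by Client_id) in this table?
The answer I am looking for might look like this: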
               Client2_sales  Client3_sales  Client4_sales
Client2_sales       some val       some val       some val
Client3_sales       some val       some val       some val
Client4_sales       some val       some val       some val
Answer 0 (score: 0)
Something like this?
inputfile_pd.pivot(index='DateOfSale', columns='Client_id').corr()
                Sales
Client_id           2         3         4
      Client_id
Sales 2           1.0 -1.000000 -1.000000
      3          -1.0  1.000000 -0.785714
      4          -1.0 -0.785714  1.000000
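To get the flat labels sketched in the question, one possible variation is to pass values='Sales' to pivot and rename the columns before calling corr. The following is only a sketch under assumptions: the names wide, corr_matrix and corr_strict are made up for illustration, and it assumes each (DateOfSale, Client_id) pair occurs at most once, since pivot raises an error on duplicates.

```python
import pandas as pd

inputfile_pd = pd.DataFrame(
    [['2018-02-02', 10, 2], ['2018-02-02', 1, 3], ['2018-02-02', 3, 4],
     ['2018-02-03', 3, 2], ['2018-02-03', 2, 3], ['2018-02-03', 4, 4],
     ['2018-02-04', 4, 3], ['2018-02-04', 1, 4]],
    columns=['DateOfSale', 'Sales', 'Client_id'])

# One row per date, one column per client. Passing values='Sales'
# keeps the columns flat instead of producing a MultiIndex.
wide = inputfile_pd.pivot(index='DateOfSale', columns='Client_id', values='Sales')

# Rename columns 2, 3, 4 to Client2_sales, Client3_sales, Client4_sales
# (illustrative labels taken from the question).
wide.columns = [f'Client{cid}_sales' for cid in wide.columns]

# Pairwise Pearson correlation; missing dates are skipped per pair.
corr_matrix = wide.corr()
print(corr_matrix)

# Optional: require at least 3 shared dates per pair, otherwise NaN.
corr_strict = wide.corr(min_periods=3)
```

Client 2 has no sale on 2018-02-04, so its correlations with the other clients rest on only two shared dates, and two points always give exactly +1 or -1. The min_periods variant turns such pairs into NaN instead of reporting a misleading value.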