Getting correlation values from a column

Time: 2017-10-28 04:46:43

Tags: python pandas numpy dataframe pivot

Suppose ratings.head() gives:

          critic               title  rating
0  Jack Matthews   Lady in the Water     3.0
1  Jack Matthews   Snakes on a Plane     4.0
2  Jack Matthews   You Me and Dupree     3.5
3  Jack Matthews    Superman Returns     5.0
4  Jack Matthews  The Night Listener     3.0

I would like to get correlation values like these:

title               Just My Luck  Lady in the Water  Snakes on a Plane  Superman Returns  The Night Listener  You Me and Dupree
Just My Luck            1.000000          -0.944911          -0.333333         -0.422890            0.555556          -0.485662
Lady in the Water      -0.944911           1.000000           0.577350          0.404226                 NaN           0.333333
Snakes on a Plane      -0.333333           0.577350           1.000000         -0.101929           -0.408248          -0.645497
Superman Returns       -0.422890           0.404226          -0.101929          1.000000           -0.062500           0.657952
The Night Listener      0.555556                NaN          -0.408248         -0.062500            1.000000          -0.250000
You Me and Dupree      -0.485662           0.333333          -0.645497          0.657952           -0.250000           1.000000

I want to compute this in Python.

I am trying to use a pivot table, but it removes the 0, 1, 2, 3, 4 row labels from the first table.

How can I get the correlation table above using pandas?

1 answer:

Answer 0 (score: 0)

Use pivot + corr (here df is assumed to be the ratings DataFrame from the question):

df = df.pivot(index='critic', columns='title', values='rating').corr()
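To see what the one-liner does step by step, here is a minimal sketch on made-up data. The critics A, B, C, the titles X, Y, Z and all the ratings are hypothetical and not taken from the question. pivot turns each title into a column with one row per critic, and corr then computes the pairwise Pearson correlation between those columns.

```python
import pandas as pd

# Hypothetical long-format data: one row per (critic, title) pair.
ratings = pd.DataFrame({
    "critic": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "title": ["X", "Y", "Z"] * 3,
    "rating": [1.0, 2.0, 3.0, 2.0, 4.0, 1.0, 3.0, 6.0, 2.0],
})

# Step 1: reshape to wide format, critics as rows and titles as columns.
# The original 0..8 integer index is replaced by the critic labels.
wide = ratings.pivot(index="critic", columns="title", values="rating")

# Step 2: title-by-title correlation matrix of the wide table.
corr = wide.corr()
print(corr)
```

The original integer row labels disappear after the pivot because critic becomes the index, which is expected here: the correlation is computed across critics, so the old row numbers carry no information.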

Alternative using unstack:

df = df.set_index(['critic','title'])['rating'].unstack().corr()
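The sketch below, on hypothetical data that is not from the question, checks that the pivot and unstack variants give the same correlation matrix. It also shows one case where they may need adjusting: if a critic rated the same title more than once, pivot raises a ValueError. Averaging the duplicates with pivot_table is one possible workaround, offered here as an assumption about what would be wanted rather than as part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three critics, two titles, no duplicate pairs.
ratings = pd.DataFrame({
    "critic": ["A", "A", "B", "B", "C", "C"],
    "title": ["X", "Y", "X", "Y", "X", "Y"],
    "rating": [1.0, 3.0, 2.0, 2.0, 3.0, 1.0],
})

# Both reshaping routes should lead to the same correlation matrix.
via_pivot = ratings.pivot(index="critic", columns="title", values="rating").corr()
via_unstack = ratings.set_index(["critic", "title"])["rating"].unstack().corr()
same = np.allclose(via_pivot.values, via_unstack.values)

# Add a second rating by critic A for title X (a duplicated pair).
extra = pd.DataFrame({"critic": ["A"], "title": ["X"], "rating": [5.0]})
dup = pd.concat([ratings, extra], ignore_index=True)

# pivot cannot reshape when an (index, column) pair occurs twice.
try:
    dup.pivot(index="critic", columns="title", values="rating")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicates (here: mean) before reshaping.
corr_dup = dup.pivot_table(
    index="critic", columns="title", values="rating", aggfunc="mean"
).corr()

print(same, pivot_failed)
print(corr_dup)
```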