Given a data frame like the one below, I want to find the correlations between its variables.
In R, I can do this:
library(PerformanceAnalytics)
month <- c("2017-01","2017-02","2017-03","2017-04","2017-05","2017-06","2017-07","2017-08","2017-09","2017-10","2017-11","2017-12")
Spending_1 <- c(66.5,66.5,38.5,49.5,66.5,66.5,38.5,49.5,32.5,32.5,32.5,32.5)
Spending_2 <- c(35,25,25,25,39,36,36,48,48,48,39,39)
Shipping_mall_1 <- c(17,50,42,51,76,17,65,57,54,32,51,81)
Shipping_mall_2 <- c(66,51,90,67,68,45,23,48,37,33,45,96)
df <- data.frame(month, Spending_1, Spending_2, Shipping_mall_1, Shipping_mall_2)
chart.Correlation(df[2:5])
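This produces a chart showing the pairwise relationship between every two variables.
However, I don't actually need the correlations between variables in the same group, e.g. Spending_1 to Spending_2 (-0.37) or Shipping_mall_1 to Shipping_mall_2 (0.22). I only want to see: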
- Spending_1 to Shipping_mall_1
- Spending_1 to Shipping_mall_2
- Spending_2 to Shipping_mall_1
- Spending_2 to Shipping_mall_2
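Because the original data frame is very wide, I'd like to skip the correlations between variables of the same group/type. (It doesn't have to be library(PerformanceAnalytics); other libraries are fine too.)
Thanks.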
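To make the desired result concrete, here is a minimal sketch of the kind of output I'm after, using base R's cor() with two column blocks. It assumes the two groups can be told apart by their column-name prefixes ("Spending_" and "Shipping_mall_"), which holds for this example but is only an assumption for the real data. The names spend_cols, mall_cols and cross_cor are just illustrative.

```r
# Same example data as above (the month column is left out here)
Spending_1 <- c(66.5,66.5,38.5,49.5,66.5,66.5,38.5,49.5,32.5,32.5,32.5,32.5)
Spending_2 <- c(35,25,25,25,39,36,36,48,48,48,39,39)
Shipping_mall_1 <- c(17,50,42,51,76,17,65,57,54,32,51,81)
Shipping_mall_2 <- c(66,51,90,67,68,45,23,48,37,33,45,96)
df <- data.frame(Spending_1, Spending_2, Shipping_mall_1, Shipping_mall_2)

# Assumption: each group is identifiable by its column-name prefix
spend_cols <- grep("^Spending_", names(df), value = TRUE)
mall_cols  <- grep("^Shipping_mall_", names(df), value = TRUE)

# cor(x, y) correlates every column of x with every column of y,
# so within-group pairs (e.g. Spending_1 vs Spending_2) are never computed
cross_cor <- cor(df[spend_cols], df[mall_cols])

# Rows are the Spending_* variables, columns are the Shipping_mall_* variables
print(round(cross_cor, 2))
```

This returns only a matrix of coefficients, not a chart like chart.Correlation(), so a plotting solution restricted to these cross-group pairs would still be welcome.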