I couldn't find a ready-made function in Julia to compute Pearson's r, so I tried to write one myself, but I ran into trouble.
Code:
r(x,y) = (sum(x*y) - (sum(x)*sum(y))/length(x))/sqrt((sum(x^2)-(sum(x)^2)/length(x))*(sum(y^2)-(sum(y)^2)/length(x)))
If I try to run it on two arrays:
b = [4,8,12,16,20,24,28]
q = [5,10,15,20,25,30,35]
I get the following error:
ERROR: `*` has no method matching *(::Array{Int64,1}, ::Array{Int64,1})
in r at none:1
Answer 0 (score: 9)
Pearson's r is available in Julia as cor:
julia> cor(b,q)
1.0
When you are looking for a function in Julia, the apropos function can be very useful:
julia> apropos("pearson")
Base.cov(v1[, v2][, vardim=1, corrected=true, mean=nothing])
Base.cor(v1[, v2][, vardim=1, mean=nothing])
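The problem in your definition is the difference between elementwise multiplication/exponentiation and matrix multiplication/exponentiation. On arrays, * and ^ are the matrix operations, and there is no matrix product of two column vectors, hence the error. To get the elementwise behavior you intended, you need .* and .^: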
r(x,y) = (sum(x.*y) - (sum(x)*sum(y))/length(x))/sqrt((sum(x.^2)-(sum(x)^2)/length(x))*(sum(y.^2)-(sum(y)^2)/length(x)))
With only these three changes, your definition of r appears to match Julia's cor to within a few ULPs:
julia> cor(b,q)
1.0
julia> x,y = randn(10),randn(10)
([-0.2384626335813905,0.0793838075714518,2.395918475924737,-1.6271954454542266,-0.7001484742860653,-0.33511064476423336,-1.5419149314518956,-0.8284664940238087,-0.6136547926069563,-0.1723749334766532],[0.08581770755520171,2.208288163473674,-0.5603452667737798,-3.0599443201343854,0.585509815026569,0.3876891298047877,-0.8368409374755644,1.672421071281691,0.19652240951291933,0.9838306761261647])
julia> r(x,y)
0.23514468093214283
julia> cor(x,y)
0.23514468093214275
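The same comparison can be sketched on a recent Julia. This assumes Julia 1.x, where cor is provided by the Statistics standard library rather than Base; the name pearson_r is made up here to keep it distinct from the r above.

```julia
using Statistics  # assumption: Julia 1.x, where cor lives in the Statistics stdlib

# Same formula as r above; .* and .^ make the products and squares elementwise.
function pearson_r(x, y)
    n = length(x)
    num = sum(x .* y) - sum(x) * sum(y) / n
    den = sqrt((sum(x .^ 2) - sum(x)^2 / n) * (sum(y .^ 2) - sum(y)^2 / n))
    return num / den
end

b = [4, 8, 12, 16, 20, 24, 28]
q = [5, 10, 15, 20, 25, 30, 35]

# Compare with a tolerance instead of ==, because the two results
# can differ in the last few bits.
println(isapprox(pearson_r(b, q), cor(b, q)))
```

Using isapprox rather than == is deliberate: as the random example shows, the two functions agree only up to rounding.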
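Julia's cor is defined iteratively (this is the zero-mean implementation; calling cor first subtracts the means and then calls corzm), which means fewer allocations and better performance. I can't speak to the numerical accuracy.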
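To make the "subtract the means, then run a zero-mean correlation" idea concrete, here is a minimal sketch. It is not Julia's actual source; the names zero_mean_cor and centered_cor are hypothetical, and the loop is just one plausible way to write the accumulation.

```julia
# Hypothetical helpers illustrating the two-step idea; not Julia's implementation.

# Correlation of two vectors that are assumed to already have zero mean.
# A single loop accumulates the three sums without temporary arrays.
function zero_mean_cor(x, y)
    sxy = 0.0
    sxx = 0.0
    syy = 0.0
    for i in eachindex(x, y)
        sxy += x[i] * y[i]
        sxx += x[i]^2
        syy += y[i]^2
    end
    return sxy / sqrt(sxx * syy)
end

# Subtract the means first, then delegate to the zero-mean version.
function centered_cor(x, y)
    mx = sum(x) / length(x)
    my = sum(y) / length(y)
    return zero_mean_cor(x .- mx, y .- my)
end

println(centered_cor([4, 8, 12, 16, 20, 24, 28], [5, 10, 15, 20, 25, 30, 35]))
```

Note that this sketch still allocates two temporaries for the centered vectors; only the accumulation step is allocation-free.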
Answer 1 (score: 3)
Your function is trying to multiply two column vectors. You need to transpose one of them. Consider:
> [1,2]*[3,4]
ERROR: `*` has no method matching *(::Array{Int64,1}, ::Array{Int64,1})
But:
> [1,2]'*[3,4]
1-element Array{Int64,1}:
11
and
> [1,2]*[3,4]'
2x2 Array{Int64,2}:
3 4
6 8
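For comparison, here is a small sketch of the three kinds of product side by side. It assumes Julia 1.x, where a transposed vector times a vector yields a plain scalar rather than the 1-element array shown above; the variable names a and c are just for illustration.

```julia
using LinearAlgebra  # standard library; provides dot

a = [1, 2]
c = [3, 4]

println(a' * c)     # row times column: the inner (dot) product
println(dot(a, c))  # the same inner product, spelled out
println(a * c')     # column times row: the 2x2 outer product
println(a .* c)     # elementwise product, the quantity the Pearson formula sums
```

The inner product already equals sum(a .* c), so transposing can replace the sum of products in the numerator, but the squared terms in the denominator still need an elementwise operation such as .^.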