I am trying to compute the correlation between two time series. I tried the following code:
import numpy as np
time1 = np.arange(0, 1000, 1).reshape((-1, 1))
slope1 = 15
slope2 = 3
amp = 1000
line1 = time1 * slope1 + amp
line2 = time1 * (0.5) + amp / 10
corr = np.corrcoef(x=line1, y=line2, rowvar=False)
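The output is
corr = [[1. 1.][1. 1.]]
I expected the correlation to be much smaller than 1, since the two lines have different slopes. Why does the correlation come out as 1?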
Answer 0: (score: 1)
Although the slopes are very different, you can think of correlation as something that ignores scale and only looks at the direction of travel. Whenever one of your variables increases by some amount, the other increases by x1 times that amount, where x1 is a constant, so they are perfectly correlated (they always behave the same way relative to one another).
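To make this concrete, here is a minimal sketch that reuses the two lines from the question and then adds a noisy series for comparison. The noise level, the seed, and the names `r_lines`, `r_flipped` and `r_noisy` are assumptions chosen only for illustration; they are not part of the original question.

```python
import numpy as np

# Same setup as in the question: two noiseless straight lines over time.
time1 = np.arange(0, 1000, 1)
line1 = time1 * 15 + 1000
line2 = time1 * 0.5 + 100

# Pearson correlation ignores scale and offset, so two increasing
# lines are perfectly correlated whatever their slopes are.
r_lines = np.corrcoef(line1, line2)[0, 1]

# Reversing the direction of one line flips the sign of the correlation.
r_flipped = np.corrcoef(line1, -line2)[0, 1]

# Hypothetical noisy series: once the relation is no longer exactly
# linear, the correlation drops below 1.
rng = np.random.default_rng(0)
noisy = line2 + rng.normal(scale=200.0, size=time1.size)
r_noisy = np.corrcoef(line1, noisy)[0, 1]

print(r_lines, r_flipped, r_noisy)
```

In other words, a correlation below 1 comes from scatter around the linear relation, not from a difference in slope.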
Answer 1: (score: 1)
If you want to express the correlation the way Excel's R^2 does, you can use something like this (I have already used it in my own work):
import numpy as np

def correlation(Measure, Fit):
    """Calculates the R^2 coefficient between the two sets of Y data
    provided. For the result to make sense, both Y arrays should have
    been created from the same X array."""
    Mean = np.mean(Measure)
    s1 = 0  # sum of squared differences between Measure and Fit
    s2 = 0  # sum of squared deviations of Measure from its mean
    Size = np.size(Measure)  # identical to np.size(Fit)
    for i in range(0, Size):
        s1 += (Measure[i] - Fit[i]) ** 2
        s2 += (Measure[i] - Mean) ** 2
    Rsquare = 1 - s1/s2
    return Rsquare
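I removed them for readability, but you can wrap this in whatever safeguards and error messages you like, for example for when the two arrays have different sizes or contain NaN.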
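As a rough sketch of what those safeguards might look like, the variant below computes the same quantity with NumPy array operations and raises on mismatched sizes, NaN values, or a constant first argument. The name `r_squared_checked` and the specific error messages are hypothetical, not taken from the answer.

```python
import numpy as np

def r_squared_checked(measure, fit):
    """Hypothetical variant of the function above with input checks."""
    measure = np.asarray(measure, dtype=float).ravel()
    fit = np.asarray(fit, dtype=float).ravel()

    # Guard against the cases mentioned in the answer.
    if measure.size != fit.size:
        raise ValueError("measure and fit must have the same number of elements")
    if np.isnan(measure).any() or np.isnan(fit).any():
        raise ValueError("inputs must not contain NaN")

    ss_res = np.sum((measure - fit) ** 2)             # same as s1 above
    ss_tot = np.sum((measure - measure.mean()) ** 2)  # same as s2 above
    if ss_tot == 0:
        raise ValueError("measure is constant, so R^2 is undefined")
    return 1 - ss_res / ss_tot
```

Note that this quantity compares the two arrays value by value, so unlike `np.corrcoef` it is sensitive to scale and offset: for the two lines in the question it comes out negative rather than 1.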