Determining the correlation between aggregated and non-aggregated data

Time: 2018-12-07 23:12:03

Tags: python pandas correlation

I have a dataset that is essentially a list of lists produced by the output of SQL queries. This is what it looks like:

[[(datetime.datetime(2017, 12, 1, 0, 0), Decimal('7.9618320610687023')), (datetime.datetime(2018, 1, 1, 0, 0), Decimal('3.8426966292134831')), (datetime.datetime(2018, 2, 1, 0, 0), Decimal('4.4876543209876543')), (datetime.datetime(2018, 3, 1, 0, 0), Decimal('4.7269372693726937')), (datetime.datetime(2018, 4, 1, 0, 0), Decimal('5.3849765258215962')), (datetime.datetime(2018, 5, 1, 0, 0), Decimal('4.0217391304347826')), (datetime.datetime(2018, 6, 1, 0, 0), Decimal('4.1186440677966102')), (datetime.datetime(2018, 7, 1, 0, 0), Decimal('6.2187500000000000')), (datetime.datetime(2018, 8, 1, 0, 0), Decimal('3.2826086956521739')), (datetime.datetime(2018, 9, 1, 0, 0), Decimal('4.4661654135338346')), (datetime.datetime(2018, 10, 1, 0, 0), Decimal('4.9191176470588235')), (datetime.datetime(2018, 11, 1, 0, 0), Decimal('4.0491803278688525')), (datetime.datetime(2018, 12, 1, 0, 0), Decimal('5.3090909090909091'))], [(datetime.datetime(2017, 12, 1, 0, 0), 14.2151145038168), (datetime.datetime(2018, 1, 1, 0, 0), 12.9982584269663), (datetime.datetime(2018, 2, 1, 0, 0), 13.46), (datetime.datetime(2018, 3, 1, 0, 0), 13.0539852398524), (datetime.datetime(2018, 4, 1, 0, 0), 12.9493896713615), (datetime.datetime(2018, 5, 1, 0, 0), 13.115652173913), (datetime.datetime(2018, 6, 1, 0, 0), 12.8800564971751), (datetime.datetime(2018, 7, 1, 0, 0), 13.318125), (datetime.datetime(2018, 8, 1, 0, 0), 13.6523913043478), (datetime.datetime(2018, 9, 1, 0, 0), 14.0972180451128), (datetime.datetime(2018, 10, 1, 0, 0), 14.6723529411765), (datetime.datetime(2018, 11, 1, 0, 0), 14.936393442623), (datetime.datetime(2018, 12, 1, 0, 0), 15.9845454545455)]]

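It contains two lists, each with a date column and a metric column. I need to extract the metric values from each list and find the correlation between them.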
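The extraction step described above can be sketched as follows. This is a minimal illustration, assuming each inner list is a sequence of (date, value) tuples as shown; the helper name to_series and the column names month and quantity are hypothetical and do not come from the original queries.

```python
import datetime
from decimal import Decimal

import pandas as pd


def to_series(rows, name):
    """Turn a list of (date, value) tuples into a float Series indexed by date.

    Hypothetical helper for illustration; the names are not from the question.
    """
    df = pd.DataFrame(rows, columns=["month", name])
    # Decimal values arrive as Python objects; cast them to float64 for numeric work.
    return df.set_index("month")[name].astype("float64")


# Two rows copied from the aggregated output above.
sample = [
    (datetime.datetime(2017, 12, 1, 0, 0), Decimal("7.9618320610687023")),
    (datetime.datetime(2018, 1, 1, 0, 0), Decimal("3.8426966292134831")),
]
quantity = to_series(sample, "quantity")
print(quantity)
```

Keeping the date as the index, rather than discarding it, makes it possible to pair the two metrics by month later instead of by row position.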

The two metrics here are quantity and unitprice, and the queries compute the monthly average quantity and unit price for the last year.
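The SQL itself is not shown, so the following is only a pandas sketch of what a monthly average looks like. The orders table, its order_date column, and the sample rows are all made up for illustration.

```python
import pandas as pd

# Hypothetical row-level records; the real table and column names are not
# shown in the question.
orders = pd.DataFrame(
    {
        "order_date": pd.to_datetime(
            ["2018-01-03", "2018-01-17", "2018-02-05", "2018-02-20"]
        ),
        "quantity": [4, 12, 32, 3],
        "unitprice": [8.5, 25.49, 12.75, 24.96],
    }
)

# Truncate each date to the first day of its month, then average per month.
month = orders["order_date"].dt.to_period("M").dt.to_timestamp()
monthly = orders.groupby(month)[["quantity", "unitprice"]].mean()
print(monthly)
```

Note that in this sketch both metrics come from the same rows, so each quantity stays paired with its own unit price until the averaging step.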

This is what the plot looks like:

[image: plot of the two monthly metrics]

This is how I compute the Pearson and Spearman coefficients with pandas:

import pandas as pd
import datetime
from decimal import Decimal

# `data` is the list of lists shown above
# contains date and average quantity values
data1 = data[0]
# contains date and average unitprice values
data2 = data[1]

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# the last column of each frame holds the metric; cast it to float once
quantity = df1.iloc[:, -1].astype('float64')
unitprice = df2.iloc[:, -1].astype('float64')

pearson_coeff = quantity.corr(unitprice)

spearman_coeff = quantity.corr(unitprice, method="spearman", min_periods=1)

I get a pearson_coeff of 0.3416 and a spearman_coeff of 0.2802.
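As a hedged alternative to pairing the two frames by row position, the monthly values can be joined on their dates first. The sketch below copies the 13 monthly values from the aggregated output above. The variable names are illustrative, and the date index assumes consecutive month starts from 2017-12 to 2018-12, which matches the data shown. It uses DataFrame.corr for both coefficients.

```python
import pandas as pd

months = pd.date_range("2017-12-01", periods=13, freq="MS")  # month starts

quantity = pd.Series(
    [7.9618320610687023, 3.8426966292134831, 4.4876543209876543,
     4.7269372693726937, 5.3849765258215962, 4.0217391304347826,
     4.1186440677966102, 6.21875, 3.2826086956521739,
     4.4661654135338346, 4.9191176470588235, 4.0491803278688525,
     5.3090909090909091],
    index=months, name="quantity",
)
unitprice = pd.Series(
    [14.2151145038168, 12.9982584269663, 13.46, 13.0539852398524,
     12.9493896713615, 13.115652173913, 12.8800564971751, 13.318125,
     13.6523913043478, 14.0972180451128, 14.6723529411765,
     14.936393442623, 15.9845454545455],
    index=months, name="unitprice",
)

# concat with axis=1 aligns the two series on their shared date index,
# so each row pairs the two metrics for the same month.
monthly = pd.concat([quantity, unitprice], axis=1)

pearson = monthly.corr(method="pearson").loc["quantity", "unitprice"]
spearman = monthly.corr(method="spearman").loc["quantity", "unitprice"]
print(pearson, spearman)
```

With only 13 monthly points, either coefficient is a rough estimate, so small differences between runs on slightly different data should not be over-interpreted.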

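Now, I read somewhere that computing correlations on aggregated data is not a good idea. So I ran a separate SQL query for each metric, this time without aggregation. This is what it looks like: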

[[(datetime.datetime(2017, 12, 1, 0, 0), 272), (datetime.datetime(2017, 12, 1, 0, 0), -16), (datetime.datetime(2017, 12, 1, 0, 0), 80), (datetime.datetime(2017, 12, 1, 0, 0), 38), (datetime.datetime(2017, 12, 1, 0, 0), -2), (datetime.datetime(2017, 12, 1, 0, 0), 79), (datetime.datetime(2017, 12, 1, 0, 0), -10), (datetime.datetime(2017, 12, 1, 0, 0), 12), (datetime.datetime(2017, 12, 1, 0, 0), 32), (datetime.datetime(2017, 12, 1, 0, 0), -1), (datetime.datetime(2017, 12, 1, 0, 0), 1), (datetime.datetime(2017, 12, 1, 0, 0), 6), (datetime.datetime(2017, 12, 1, 0, 0), 4), (datetime.datetime(2017, 12, 1, 0, 0), -12), (datetime.datetime(2017, 12, 1, 0, 0), 2), (datetime.datetime(2017, 12, 1, 0, 0), 3), (datetime.datetime(2017, 12, 1, 0, 0), 5), (datetime.datetime(2017, 12, 1, 0, 0), 52), (datetime.datetime(2017, 12, 1, 0, 0), 16), (datetime.datetime(2018, 1, 1, 0, 0), -4), (datetime.datetime(2018, 1, 1, 0, 0), 4), (datetime.datetime(2018, 1, 1, 0, 0), 12), (datetime.datetime(2018, 1, 1, 0, 0), -23), (datetime.datetime(2018, 1, 1, 0, 0), 16), (datetime.datetime(2018, 1, 1, 0, 0), 48), (datetime.datetime(2018, 1, 1, 0, 0), 5), (datetime.datetime(2018, 1, 1, 0, 0), -1), (datetime.datetime(2018, 1, 1, 0, 0), 1), (datetime.datetime(2018, 1, 1, 0, 0), 3), (datetime.datetime(2018, 1, 1, 0, 0), 17), (datetime.datetime(2018, 1, 1, 0, 0), -7), (datetime.datetime(2018, 1, 1, 0, 0), 11), (datetime.datetime(2018, 1, 1, 0, 0), -6), (datetime.datetime(2018, 1, 1, 0, 0), 7), (datetime.datetime(2018, 1, 1, 0, 0), 10), (datetime.datetime(2018, 1, 1, 0, 0), 8), (datetime.datetime(2018, 1, 1, 0, 0), -13), (datetime.datetime(2018, 1, 1, 0, 0), -9), (datetime.datetime(2018, 1, 1, 0, 0), -3), (datetime.datetime(2018, 1, 1, 0, 0), -2), (datetime.datetime(2018, 1, 1, 0, 0), 32), (datetime.datetime(2018, 1, 1, 0, 0), 6), (datetime.datetime(2018, 1, 1, 0, 0), 2), (datetime.datetime(2018, 2, 1, 0, 0), -7), (datetime.datetime(2018, 2, 1, 0, 0), 12), (datetime.datetime(2018, 2, 1, 0, 0), 32), 
(datetime.datetime(2018, 2, 1, 0, 0), 3), (datetime.datetime(2018, 2, 1, 0, 0), 11), (datetime.datetime(2018, 2, 1, 0, 0), 1), (datetime.datetime(2018, 2, 1, 0, 0), -3), (datetime.datetime(2018, 2, 1, 0, 0), -2), (datetime.datetime(2018, 2, 1, 0, 0), -1), (datetime.datetime(2018, 2, 1, 0, 0), -4), (datetime.datetime(2018, 2, 1, 0, 0), 48), (datetime.datetime(2018, 2, 1, 0, 0), 4), (datetime.datetime(2018, 2, 1, 0, 0), 16), (datetime.datetime(2018, 2, 1, 0, 0), 24), (datetime.datetime(2018, 2, 1, 0, 0), -5), (datetime.datetime(2018, 2, 1, 0, 0), 72), (datetime.datetime(2018, 2, 1, 0, 0), 2), (datetime.datetime(2018, 2, 1, 0, 0), 6), (datetime.datetime(2018, 3, 1, 0, 0), -3), (datetime.datetime(2018, 3, 1, 0, 0), 8), (datetime.datetime(2018, 3, 1, 0, 0), 24), (datetime.datetime(2018, 3, 1, 0, 0), 3), (datetime.datetime(2018, 3, 1, 0, 0), 16), (datetime.datetime(2018, 3, 1, 0, 0), 150), (datetime.datetime(2018, 3, 1, 0, 0), -23), (datetime.datetime(2018, 3, 1, 0, 0), -2), (datetime.datetime(2018, 3, 1, 0, 0), 27), (datetime.datetime(2018, 3, 1, 0, 0), -9), (datetime.datetime(2018, 3, 1, 0, 0), -5), (datetime.datetime(2018, 3, 1, 0, 0), 14), (datetime.datetime(2018, 3, 1, 0, 0), 15), (datetime.datetime(2018, 3, 1, 0, 0), 48), (datetime.datetime(2018, 3, 1, 0, 0), 4), (datetime.datetime(2018, 3, 1, 0, 0), 13), (datetime.datetime(2018, 3, 1, 0, 0), 7), (datetime.datetime(2018, 3, 1, 0, 0), -7), (datetime.datetime(2018, 3, 1, 0, 0), -6), (datetime.datetime(2018, 3, 1, 0, 0), 20), (datetime.datetime(2018, 3, 1, 0, 0), 6), (datetime.datetime(2018, 3, 1, 0, 0), 10), (datetime.datetime(2018, 3, 1, 0, 0), 12), (datetime.datetime(2018, 3, 1, 0, 0), 1), (datetime.datetime(2018, 3, 1, 0, 0), 32), (datetime.datetime(2018, 3, 1, 0, 0), -1), (datetime.datetime(2018, 3, 1, 0, 0), 2), (datetime.datetime(2018, 3, 1, 0, 0), -48), (datetime.datetime(2018, 3, 1, 0, 0), -8), (datetime.datetime(2018, 3, 1, 0, 0), 5), (datetime.datetime(2018, 3, 1, 0, 0), -10), (datetime.datetime(2018, 3, 1, 
0, 0), 17), (datetime.datetime(2018, 4, 1, 0, 0), 36), (datetime.datetime(2018, 4, 1, 0, 0), 4), (datetime.datetime(2018, 4, 1, 0, 0), 11), (datetime.datetime(2018, 4, 1, 0, 0), 60), (datetime.datetime(2018, 4, 1, 0, 0), 2), (datetime.datetime(2018, 4, 1, 0, 0), -3), (datetime.datetime(2018, 4, 1, 0, 0), -2), (datetime.datetime(2018, 4, 1, 0, 0), -8), (datetime.datetime(2018, 4, 1, 0, 0), 6), (datetime.datetime(2018, 4, 1, 0, 0), 8), (datetime.datetime(2018, 4, 1, 0, 0), 1), (datetime.datetime(2018, 4, 1, 0, 0), 22), (datetime.datetime(2018, 4, 1, 0, 0), -11), (datetime.datetime(2018, 4, 1, 0, 0), 150), (datetime.datetime(2018, 4, 1, 0, 0), -1), (datetime.datetime(2018, 4, 1, 0, 0), 5), (datetime.datetime(2018, 4, 1, 0, 0), 3), (datetime.datetime(2018, 4, 1, 0, 0), 7), (datetime.datetime(2018, 4, 1, 0, 0), 10), (datetime.datetime(2018, 4, 1, 0, 0), 32), (datetime.datetime(2018, 4, 1, 0, 0), 14), (datetime.datetime(2018, 4, 1, 0, 0), 16), (datetime.datetime(2018, 4, 1, 0, 0), 48), (datetime.datetime(2018, 4, 1, 0, 0), 12), (datetime.datetime(2018, 4, 1, 0, 0), 24), (datetime.datetime(2018, 5, 1, 0, 0), -1), (datetime.datetime(2018, 5, 1, 0, 0), 20), (datetime.datetime(2018, 5, 1, 0, 0), 16), (datetime.datetime(2018, 5, 1, 0, 0), 32), (datetime.datetime(2018, 5, 1, 0, 0), 5), (datetime.datetime(2018, 5, 1, 0, 0), 6), (datetime.datetime(2018, 5, 1, 0, 0), 120), (datetime.datetime(2018, 5, 1, 0, 0), 3), (datetime.datetime(2018, 5, 1, 0, 0), 8), (datetime.datetime(2018, 5, 1, 0, 0), -3), (datetime.datetime(2018, 5, 1, 0, 0), 36), (datetime.datetime(2018, 5, 1, 0, 0), -2), (datetime.datetime(2018, 5, 1, 0, 0), 24), (datetime.datetime(2018, 5, 1, 0, 0), 4), (datetime.datetime(2018, 5, 1, 0, 0), 1), (datetime.datetime(2018, 5, 1, 0, 0), 2), (datetime.datetime(2018, 5, 1, 0, 0), 10), (datetime.datetime(2018, 5, 1, 0, 0), -14), (datetime.datetime(2018, 5, 1, 0, 0), 14), (datetime.datetime(2018, 5, 1, 0, 0), 12), (datetime.datetime(2018, 5, 1, 0, 0), -9), 
(datetime.datetime(2018, 6, 1, 0, 0), 3), (datetime.datetime(2018, 6, 1, 0, 0), -1), (datetime.datetime(2018, 6, 1, 0, 0), 39), (datetime.datetime(2018, 6, 1, 0, 0), 5), (datetime.datetime(2018, 6, 1, 0, 0), 17), (datetime.datetime(2018, 6, 1, 0, 0), 11), (datetime.datetime(2018, 6, 1, 0, 0), 16), (datetime.datetime(2018, 6, 1, 0, 0), 10), (datetime.datetime(2018, 6, 1, 0, 0), 2), (datetime.datetime(2018, 6, 1, 0, 0), -4), (datetime.datetime(2018, 6, 1, 0, 0), 4), (datetime.datetime(2018, 6, 1, 0, 0), 32), (datetime.datetime(2018, 6, 1, 0, 0), 7), (datetime.datetime(2018, 6, 1, 0, 0), 120), (datetime.datetime(2018, 6, 1, 0, 0), 1), (datetime.datetime(2018, 6, 1, 0, 0), 12), (datetime.datetime(2018, 6, 1, 0, 0), -2), (datetime.datetime(2018, 6, 1, 0, 0), 6), (datetime.datetime(2018, 7, 1, 0, 0), -6), (datetime.datetime(2018, 7, 1, 0, 0), 7), (datetime.datetime(2018, 7, 1, 0, 0), 72), (datetime.datetime(2018, 7, 1, 0, 0), 6), (datetime.datetime(2018, 7, 1, 0, 0), 192), (datetime.datetime(2018, 7, 1, 0, 0), 10), (datetime.datetime(2018, 7, 1, 0, 0), 12), (datetime.datetime(2018, 7, 1, 0, 0), 32), (datetime.datetime(2018, 7, 1, 0, 0), 112), (datetime.datetime(2018, 7, 1, 0, 0), 3), (datetime.datetime(2018, 7, 1, 0, 0), -2), (datetime.datetime(2018, 7, 1, 0, 0), 5), (datetime.datetime(2018, 7, 1, 0, 0), 13), (datetime.datetime(2018, 7, 1, 0, 0), 22), (datetime.datetime(2018, 7, 1, 0, 0), -1), (datetime.datetime(2018, 7, 1, 0, 0), 1), (datetime.datetime(2018, 7, 1, 0, 0), 4), (datetime.datetime(2018, 7, 1, 0, 0), 15), (datetime.datetime(2018, 7, 1, 0, 0), 16), (datetime.datetime(2018, 7, 1, 0, 0), 8), (datetime.datetime(2018, 7, 1, 0, 0), 2), (datetime.datetime(2018, 8, 1, 0, 0), 7), (datetime.datetime(2018, 8, 1, 0, 0), 30), (datetime.datetime(2018, 8, 1, 0, 0), 20), (datetime.datetime(2018, 8, 1, 0, 0), 2), (datetime.datetime(2018, 8, 1, 0, 0), 6), (datetime.datetime(2018, 8, 1, 0, 0), 8), (datetime.datetime(2018, 8, 1, 0, 0), -3), (datetime.datetime(2018, 8, 1, 0, 0), 
16), (datetime.datetime(2018, 8, 1, 0, 0), 9), (datetime.datetime(2018, 8, 1, 0, 0), 5), (datetime.datetime(2018, 8, 1, 0, 0), -2), (datetime.datetime(2018, 8, 1, 0, 0), -150), (datetime.datetime(2018, 8, 1, 0, 0), 1), (datetime.datetime(2018, 8, 1, 0, 0), -1), (datetime.datetime(2018, 8, 1, 0, 0), 11), (datetime.datetime(2018, 8, 1, 0, 0), 3), (datetime.datetime(2018, 8, 1, 0, 0), 64), (datetime.datetime(2018, 8, 1, 0, 0), 10), (datetime.datetime(2018, 8, 1, 0, 0), 12), (datetime.datetime(2018, 8, 1, 0, 0), 32), (datetime.datetime(2018, 8, 1, 0, 0), 4), (datetime.datetime(2018, 9, 1, 0, 0), 2), (datetime.datetime(2018, 9, 1, 0, 0), 40), (datetime.datetime(2018, 9, 1, 0, 0), 16), (datetime.datetime(2018, 9, 1, 0, 0), -3), (datetime.datetime(2018, 9, 1, 0, 0), 5), (datetime.datetime(2018, 9, 1, 0, 0), 4), (datetime.datetime(2018, 9, 1, 0, 0), 1), (datetime.datetime(2018, 9, 1, 0, 0), -7), (datetime.datetime(2018, 9, 1, 0, 0), 3), (datetime.datetime(2018, 9, 1, 0, 0), 6), (datetime.datetime(2018, 9, 1, 0, 0), -2), (datetime.datetime(2018, 9, 1, 0, 0), -1), (datetime.datetime(2018, 9, 1, 0, 0), 32), (datetime.datetime(2018, 10, 1, 0, 0), 2), (datetime.datetime(2018, 10, 1, 0, 0), 8), (datetime.datetime(2018, 10, 1, 0, 0), 17), (datetime.datetime(2018, 10, 1, 0, 0), 3), (datetime.datetime(2018, 10, 1, 0, 0), 5), (datetime.datetime(2018, 10, 1, 0, 0), 9), (datetime.datetime(2018, 10, 1, 0, 0), 120), (datetime.datetime(2018, 10, 1, 0, 0), -1), (datetime.datetime(2018, 10, 1, 0, 0), 6), (datetime.datetime(2018, 10, 1, 0, 0), -6), (datetime.datetime(2018, 10, 1, 0, 0), 40), (datetime.datetime(2018, 10, 1, 0, 0), 16), (datetime.datetime(2018, 10, 1, 0, 0), 20), (datetime.datetime(2018, 10, 1, 0, 0), -3), (datetime.datetime(2018, 10, 1, 0, 0), 1), (datetime.datetime(2018, 10, 1, 0, 0), 4), (datetime.datetime(2018, 10, 1, 0, 0), 32), (datetime.datetime(2018, 10, 1, 0, 0), 7), (datetime.datetime(2018, 11, 1, 0, 0), 48), (datetime.datetime(2018, 11, 1, 0, 0), 4), 
(datetime.datetime(2018, 11, 1, 0, 0), 16), (datetime.datetime(2018, 11, 1, 0, 0), 80), (datetime.datetime(2018, 11, 1, 0, 0), 32), (datetime.datetime(2018, 11, 1, 0, 0), 12), (datetime.datetime(2018, 11, 1, 0, 0), 10), (datetime.datetime(2018, 11, 1, 0, 0), 5), (datetime.datetime(2018, 11, 1, 0, 0), -24), (datetime.datetime(2018, 11, 1, 0, 0), 6), (datetime.datetime(2018, 11, 1, 0, 0), 72), (datetime.datetime(2018, 11, 1, 0, 0), 2), (datetime.datetime(2018, 11, 1, 0, 0), -3), (datetime.datetime(2018, 11, 1, 0, 0), 13), (datetime.datetime(2018, 11, 1, 0, 0), -12), (datetime.datetime(2018, 11, 1, 0, 0), 3), (datetime.datetime(2018, 11, 1, 0, 0), 17), (datetime.datetime(2018, 11, 1, 0, 0), -1), (datetime.datetime(2018, 11, 1, 0, 0), 1), (datetime.datetime(2018, 11, 1, 0, 0), -5), (datetime.datetime(2018, 12, 1, 0, 0), -6), (datetime.datetime(2018, 12, 1, 0, 0), 5), (datetime.datetime(2018, 12, 1, 0, 0), 3), (datetime.datetime(2018, 12, 1, 0, 0), 12), (datetime.datetime(2018, 12, 1, 0, 0), 16), (datetime.datetime(2018, 12, 1, 0, 0), 8), (datetime.datetime(2018, 12, 1, 0, 0), 4), (datetime.datetime(2018, 12, 1, 0, 0), 128), (datetime.datetime(2018, 12, 1, 0, 0), 10), (datetime.datetime(2018, 12, 1, 0, 0), 6), (datetime.datetime(2018, 12, 1, 0, 0), 2), (datetime.datetime(2018, 12, 1, 0, 0), -1), (datetime.datetime(2018, 12, 1, 0, 0), 13), (datetime.datetime(2018, 12, 1, 0, 0), 1)], [(datetime.datetime(2017, 12, 1, 0, 0), 12.72), (datetime.datetime(2017, 12, 1, 0, 0), 25.49), (datetime.datetime(2017, 12, 1, 0, 0), 20.38), (datetime.datetime(2017, 12, 1, 0, 0), 10.95), (datetime.datetime(2017, 12, 1, 0, 0), 9.95), (datetime.datetime(2017, 12, 1, 0, 0), 12.75), (datetime.datetime(2017, 12, 1, 0, 0), 8.5), (datetime.datetime(2018, 1, 1, 0, 0), 8.5), (datetime.datetime(2018, 1, 1, 0, 0), 25.49), (datetime.datetime(2018, 1, 1, 0, 0), 12.75), (datetime.datetime(2018, 1, 1, 0, 0), 24.96), (datetime.datetime(2018, 1, 1, 0, 0), 9.95), (datetime.datetime(2018, 1, 1, 0, 0), 10.95), 
(datetime.datetime(2018, 1, 1, 0, 0), 19.96), (datetime.datetime(2018, 1, 1, 0, 0), 0.0), (datetime.datetime(2018, 2, 1, 0, 0), 12.75), (datetime.datetime(2018, 2, 1, 0, 0), 24.96), (datetime.datetime(2018, 2, 1, 0, 0), 10.95), (datetime.datetime(2018, 2, 1, 0, 0), 8.5), (datetime.datetime(2018, 2, 1, 0, 0), 19.96), (datetime.datetime(2018, 2, 1, 0, 0), 9.95), (datetime.datetime(2018, 3, 1, 0, 0), 24.96), (datetime.datetime(2018, 3, 1, 0, 0), 9.95), (datetime.datetime(2018, 3, 1, 0, 0), 10.95), (datetime.datetime(2018, 3, 1, 0, 0), 9.86), (datetime.datetime(2018, 3, 1, 0, 0), 4.0), (datetime.datetime(2018, 3, 1, 0, 0), 12.75), (datetime.datetime(2018, 3, 1, 0, 0), 19.96), (datetime.datetime(2018, 3, 1, 0, 0), 8.5), (datetime.datetime(2018, 4, 1, 0, 0), 19.96), (datetime.datetime(2018, 4, 1, 0, 0), 8.5), (datetime.datetime(2018, 4, 1, 0, 0), 9.95), (datetime.datetime(2018, 4, 1, 0, 0), 12.75), (datetime.datetime(2018, 4, 1, 0, 0), 24.96), (datetime.datetime(2018, 4, 1, 0, 0), 10.95), (datetime.datetime(2018, 5, 1, 0, 0), 24.96), (datetime.datetime(2018, 5, 1, 0, 0), 19.96), (datetime.datetime(2018, 5, 1, 0, 0), 9.95), (datetime.datetime(2018, 5, 1, 0, 0), 12.75), (datetime.datetime(2018, 5, 1, 0, 0), 10.95), (datetime.datetime(2018, 5, 1, 0, 0), 5.0), (datetime.datetime(2018, 6, 1, 0, 0), 12.75), (datetime.datetime(2018, 6, 1, 0, 0), 4.0), (datetime.datetime(2018, 6, 1, 0, 0), 8.5), (datetime.datetime(2018, 6, 1, 0, 0), 10.95), (datetime.datetime(2018, 6, 1, 0, 0), 19.96), (datetime.datetime(2018, 6, 1, 0, 0), 9.95), (datetime.datetime(2018, 6, 1, 0, 0), 19.95), (datetime.datetime(2018, 6, 1, 0, 0), 24.96), (datetime.datetime(2018, 7, 1, 0, 0), 19.96), (datetime.datetime(2018, 7, 1, 0, 0), 8.5), (datetime.datetime(2018, 7, 1, 0, 0), 24.96), (datetime.datetime(2018, 7, 1, 0, 0), 10.95), (datetime.datetime(2018, 7, 1, 0, 0), 9.95), (datetime.datetime(2018, 7, 1, 0, 0), 12.75), (datetime.datetime(2018, 8, 1, 0, 0), 10.95), (datetime.datetime(2018, 8, 1, 0, 0), 24.96), 
(datetime.datetime(2018, 8, 1, 0, 0), 19.96), (datetime.datetime(2018, 8, 1, 0, 0), 9.95), (datetime.datetime(2018, 8, 1, 0, 0), 8.5), (datetime.datetime(2018, 8, 1, 0, 0), 12.75), (datetime.datetime(2018, 9, 1, 0, 0), 10.95), (datetime.datetime(2018, 9, 1, 0, 0), 24.96), (datetime.datetime(2018, 9, 1, 0, 0), 9.95), (datetime.datetime(2018, 9, 1, 0, 0), 12.75), (datetime.datetime(2018, 10, 1, 0, 0), 12.75), (datetime.datetime(2018, 10, 1, 0, 0), 24.96), (datetime.datetime(2018, 10, 1, 0, 0), 9.95), (datetime.datetime(2018, 10, 1, 0, 0), 10.95), (datetime.datetime(2018, 11, 1, 0, 0), 12.75), (datetime.datetime(2018, 11, 1, 0, 0), 32.04), (datetime.datetime(2018, 11, 1, 0, 0), 24.96), (datetime.datetime(2018, 11, 1, 0, 0), 10.95), (datetime.datetime(2018, 12, 1, 0, 0), 32.04), (datetime.datetime(2018, 12, 1, 0, 0), 12.75), (datetime.datetime(2018, 12, 1, 0, 0), 10.95), (datetime.datetime(2018, 12, 1, 0, 0), 24.96)]]

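I did the same thing: loaded the data into pandas, extracted the metric column from each of the two DataFrames, and computed the correlation between them.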

Now, without aggregation, I get a pearson_coeff of 0.0189 and a spearman_coeff of 0.0395.
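One thing worth checking at this step: the two un-aggregated lists above are not the same length, and Series.corr aligns its inputs on index labels before computing anything. The sketch below uses the first few values from each list as toy inputs to show what that alignment does. It is an illustration of pandas behaviour under those assumptions, not a diagnosis of the full dataset.

```python
import pandas as pd

# First few values from the un-aggregated quantity and unitprice lists above.
quantity = pd.Series([272, -16, 80, 38, -2, 79], dtype="float64")
unitprice = pd.Series([12.72, 25.49, 20.38, 10.95], dtype="float64")

# Series.corr keeps only index labels present in both series, so rows are
# paired purely by position and the extra quantity rows are dropped.
aligned_q, aligned_p = quantity.align(unitprice, join="inner")
print(len(aligned_q))

by_position = quantity.corr(unitprice)
truncated = quantity.iloc[:4].corr(unitprice)
print(by_position, truncated)
```

If the rows of the two queries do not describe the same transactions in the same order, a coefficient computed this way pairs unrelated values, and a result near zero would be expected regardless of any real relationship.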

However, it strikes me as odd that the values dropped so sharply: the Pearson coefficient fell from 0.34 to about 0.02, and the Spearman coefficient from 0.28 to about 0.04.

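I'm not sure why there is such a sharp drop. Looking at the chart, the two metrics do seem to track each other to some extent, so I expected the correlation to be larger.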

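How do I decide which one to use to determine the correlation: the correlation between the aggregated metrics or the correlation between the non-aggregated metrics? And how can I verify that the results I get are valid?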
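One possible sanity check for the validity question is a permutation test: shuffle one metric many times and see how often a correlation at least as strong as the observed one appears by chance. The function below is a minimal sketch using only NumPy. The name permutation_pvalue and its parameters are illustrative. The test treats the observations as exchangeable, which monthly time series with a trend may not satisfy, so the result is indicative at best.

```python
import numpy as np


def permutation_pvalue(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the Pearson correlation of x and y."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = np.corrcoef(x, y)[0, 1]
    count = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any real pairing between x and y.
        shuffled = rng.permutation(y)
        if abs(np.corrcoef(x, shuffled)[0, 1]) >= abs(observed):
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (count + 1) / (n_permutations + 1)


# Monthly averages copied from the aggregated output above.
quantity = [7.9618320610687023, 3.8426966292134831, 4.4876543209876543,
            4.7269372693726937, 5.3849765258215962, 4.0217391304347826,
            4.1186440677966102, 6.21875, 3.2826086956521739,
            4.4661654135338346, 4.9191176470588235, 4.0491803278688525,
            5.3090909090909091]
unitprice = [14.2151145038168, 12.9982584269663, 13.46, 13.0539852398524,
             12.9493896713615, 13.115652173913, 12.8800564971751, 13.318125,
             13.6523913043478, 14.0972180451128, 14.6723529411765,
             14.936393442623, 15.9845454545455]

r, p_value = permutation_pvalue(quantity, unitprice)
print(r, p_value)
```

A large p-value here would suggest that a coefficient of this size is easy to obtain from 13 unrelated points. The same check is only meaningful on row-level data if each quantity is paired with the unit price from the same record.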

0 answers:

No answers