Question

I have a DataFrame with 500 columns, indexed by date, covering four years of data.

| Date | A | AAL | AAP | AAPL | ABC ......
| 1/2/2004 | 18.442521 | 25.954398 | 1.38449 | 11.528444 ......
| 1/5/2004 | 18.922795 | 25.718507 | 1.442394 | 11.919131 ...
| 1/6/2004 | 19.518334 | 26.177538 | 1.437189 | 11.870028 ....

... and so on.

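I want to compute a Pearson correlation matrix for each day, that is, for each row. I would like to save the matrices by date in the most space-efficient format that R can read. (Right now I am aiming for separate Excel worksheets, one per index date. I am open to suggestions.)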

I have tried several approaches, but this one seems the most promising, since I could not apply corr() directly to df.groupby.

However, this approach returns an empty DataFrame, and now I am stuck! I am looking for a method that does not involve iteration.

def do_Corr(df_group):
    """Apply the function to each group in the data and return one result."""
    X = df_group.corr()
    return X

df.groupby([df.index.year, df.index.month, df.index.day]).apply(do_Corr).dropna()
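A small reproduction of the empty result, as a sketch on made-up data (the three dates, four tickers, and random values are assumptions for illustration, not the real data set). Each daily group contains a single row, and a Pearson correlation over one observation is undefined, so every entry comes back as NaN and dropna() then removes every row.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 3 trading days, 4 tickers, random values.
rng = np.random.default_rng(0)
dates = pd.to_datetime(["2004-01-02", "2004-01-05", "2004-01-06"])
df = pd.DataFrame(
    rng.normal(size=(3, 4)),
    index=dates,
    columns=["A", "AAL", "AAP", "AAPL"],
)


def do_corr(df_group):
    """Return the column-by-column correlation matrix of one group."""
    return df_group.corr()


# One group per calendar day, so each group holds exactly one row.
per_day = df.groupby([df.index.year, df.index.month, df.index.day]).apply(do_corr)

# With a single observation per group, every correlation is NaN.
all_nan = per_day.isna().all().all()

# dropna() drops rows containing NaN, which here is every row.
after_dropna = per_day.dropna()

print(per_day.shape, all_nan, after_dropna.empty)
```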

Answer 1

You probably want df.T.corr(). .T transposes the DataFrame so that rows become columns, and you can then apply the .corr() method.
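A minimal sketch of what the transposed call produces, again on made-up data (the dates, tickers, and values are assumptions). Note that the result is a date-by-date matrix: each entry is the correlation between two rows (two days) measured across the tickers. Whether that matches the "matrix per day" the question has in mind is not certain.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 3 trading days, 5 tickers, random values.
rng = np.random.default_rng(0)
dates = pd.to_datetime(["2004-01-02", "2004-01-05", "2004-01-06"])
tickers = ["A", "AAL", "AAP", "AAPL", "ABC"]
df = pd.DataFrame(rng.normal(size=(3, 5)), index=dates, columns=tickers)

# After transposing, each date is a column, so corr() compares dates pairwise.
row_corr = df.T.corr()
print(row_corr.shape)

# Cross-check against NumPy, which treats each row as a variable by default.
same = np.allclose(row_corr.to_numpy(), np.corrcoef(df.to_numpy()))
print(same)
```

For the storage part of the question, one option is `row_corr.to_csv(...)`, which writes a plain CSV file that R can load with `read.csv`.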

Calculating a correlation DataFrame for each row vector by index in Python
