Python SciPy Spearman correlation

Time: 2017-11-25 11:02:22

Tags: python pandas scipy sklearn-pandas pearson-correlation

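I am trying to take the column names from a DataFrame (df) and associate them with the result arrays produced by the spearmanr correlation function. I need to associate the column names (a-j) with the correlation values (spearman) and the p-values (spearman_pvalue). Is there an intuitive way to do this?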

1 answer:

Answer 0 (score: 3)

It seems you need:

import numpy as np
import pandas as pd
from scipy.stats import spearmanr

# Sample data: 100 rows, 10 integer columns named a-j
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 10)), columns=list('abcdefghij'))
#print (df)

# Binary target from column a (a fast way to build a binary column)
df['target'] = (df['a'] >= 50).astype(int)
#print (df)

# With a 2-D first argument, spearmanr appends target as an extra column
# and returns 11x11 matrices of correlations and p-values
spearman, spearman_pvalue = spearmanr(df.drop(['target'], axis=1), df.target)

df1 = pd.DataFrame(spearman.reshape(-1, 11), columns=df.columns)
#print (df1)

df2 = pd.DataFrame(spearman_pvalue.reshape(-1, 11), columns=df.columns)
#print (df2)

# Assign the column names as the index of the full matrix:
df2 = df2.set_index(df.columns)
df1 = df1.set_index(df.columns)

Or:

df1 = pd.DataFrame(spearman.reshape(-1, 11),
                   columns=df.columns,
                   index=df.columns)
df2 = pd.DataFrame(spearman_pvalue.reshape(-1, 11),
                   columns=df.columns,
                   index=df.columns)
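
If only the relationship between each column and target is of interest, the last column of each matrix can be pulled out into one small table indexed by column name. The following is a minimal sketch, not part of the original answer: the names `features` and `summary` and the seeded random generator are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

# Seeded generator (an assumption, used only to make the sketch reproducible)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 10)), columns=list('abcdefghij'))
df['target'] = (df['a'] >= 50).astype(int)

features = df.drop(columns='target')

# 11x11 matrices: columns a-j first, target last
rho, pval = spearmanr(features, df['target'])

# The last column holds every variable against target; drop the
# final row, which is target against itself
summary = pd.DataFrame(
    {'spearman': rho[:-1, -1], 'spearman_pvalue': pval[:-1, -1]},
    index=features.columns,
)
print(summary.sort_values('spearman_pvalue'))
```

Each row of `summary` is labeled with a column name (a-j) and carries that column's correlation with target and the matching p-value, so a lookup such as `summary.loc['b']` returns both values at once.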