I have been trying to build a Spearman correlation matrix with a pandas DataFrame. Even though I put 200+ NumPy arrays (nd.array) into the DataFrame, every result I get is a 190x190 matrix.
import pandas as pd
# vectors: a list of 200+ NumPy arrays (nd.array), all of the same size
df = pd.DataFrame(vectors)
mat = df.corr(method="spearman")
print(len(mat))
That line prints 190, whereas I expected to see 200. Does the corr function support at most 190 elements?
Answer 0 (score: 0)
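pandas.DataFrame.corr computes the correlations between columns, not rows. Each of your vectors becomes one row of the DataFrame, so if the vectors have length 190 the DataFrame has 190 columns, which would explain the 190x190 result.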
For example:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(200, 10))
df.head()
0 1 2 ... 7 8 9
0 0.635328 0.801989 0.928792 ... 0.409299 0.500572 0.096176
1 0.849282 0.947163 0.474262 ... 0.693389 0.260647 0.021113
2 0.057634 0.032594 0.576879 ... 0.954159 0.106100 0.265630
3 0.031656 0.114908 0.902100 ... 0.541662 0.560113 0.148045
4 0.904117 0.878109 0.982493 ... 0.416373 0.038374 0.329338
[5 rows x 10 columns]
df.corr()
0 1 2 ... 7 8 9
0 1.000000 0.158221 -0.027012 ... -0.060187 -0.146682 -0.046548
1 0.158221 1.000000 -0.059599 ... 0.038815 -0.052355 -0.037781
2 -0.027012 -0.059599 1.000000 ... 0.075159 0.015261 0.035054
3 -0.047710 -0.002200 -0.122131 ... -0.060431 0.017568 0.000935
4 -0.029542 -0.036819 0.108064 ... -0.002548 -0.097453 -0.047605
5 -0.007827 0.026060 0.078727 ... 0.009858 0.076280 0.031792
6 0.107026 0.109253 -0.094501 ... -0.069312 -0.045408 -0.103385
7 -0.060187 0.038815 0.075159 ... 1.000000 0.036394 0.028998
8 -0.146682 -0.052355 0.015261 ... 0.036394 1.000000 0.157691
9 -0.046548 -0.037781 0.035054 ... 0.028998 0.157691 1.000000
[10 rows x 10 columns]
# Note 10 x 10 output as we calculated correlation between columns
df.T.corr()
0 1 2 ... 197 198 199
0 1.000000 0.498877 -0.365548 ... 0.332607 -0.334024 -0.167504
1 0.498877 1.000000 -0.206066 ... -0.202328 0.212138 -0.265470
2 -0.365548 -0.206066 1.000000 ... 0.220807 -0.363419 0.006970
3 0.161274 -0.030202 0.350527 ... 0.242371 -0.124993 -0.419660
4 0.413043 0.360164 0.085686 ... 0.299763 -0.057072 -0.354378
5 -0.178747 0.398373 0.122139 ... -0.364762 0.298731 0.096835
6 0.428633 0.332952 0.206733 ... 0.504151 -0.489508 -0.530162
7 0.031738 0.178856 0.671928 ... 0.140511 -0.077814 -0.450641
8 -0.082514 0.217348 -0.263754 ... 0.037616 0.382671 0.078891
9 0.003956 -0.037507 -0.071405 ... 0.416418 -0.364057 -0.422265
10 0.477660 -0.062398 -0.223941 ... 0.052904 -0.005123 0.232310
11 0.064510 0.322448 0.112256 ... 0.037528 -0.122080 -0.149634
12 -0.097806 0.338672 -0.546019 ... -0.220268 0.306534 0.068144
13 -0.031138 0.213664 -0.397962 ... 0.188932 0.180403 -0.378188
14 0.038888 -0.024346 0.238451 ... 0.425915 -0.181613 -0.562234
15 0.465428 0.260079 -0.572854 ... -0.080576 0.082298 -0.261606
16 -0.535678 -0.257890 0.514151 ... -0.259198 -0.053499 0.268481
17 0.308875 -0.186421 -0.391446 ... 0.360044 -0.158955 0.268429
18 -0.275178 0.155606 -0.347488 ... -0.047829 0.533397 -0.412796
19 0.080700 0.391280 0.418628 ... 0.388697 -0.248641 -0.417406
20 -0.423705 -0.664545 0.227255 ... 0.245131 -0.463002 0.577954
21 0.853115 0.356033 -0.445827 ... 0.471394 -0.404416 -0.273944
22 0.307765 -0.038106 0.022683 ... 0.470156 -0.174377 -0.516724
23 0.618258 0.708506 -0.450709 ... -0.002726 0.290722 -0.410717
24 0.469414 0.757931 -0.550096 ... -0.370019 0.310175 0.131973
25 -0.308645 0.494687 -0.153664 ... -0.533062 0.496228 -0.040558
26 0.242789 0.687153 -0.506575 ... -0.448844 0.358878 0.133408
27 -0.069875 -0.473316 0.138600 ... 0.489610 -0.095459 0.139735
28 0.155511 0.238704 0.102230 ... 0.338030 -0.073523 0.182840
29 -0.112128 -0.138975 -0.060637 ... -0.028259 -0.349001 0.490552
.. ... ... ... ... ... ... ...
170 0.226813 0.031162 -0.205408 ... -0.613621 0.117084 0.357222
171 0.451682 0.491021 -0.474939 ... 0.372531 -0.272856 0.008240
172 0.158155 -0.228592 -0.352837 ... -0.118192 0.223844 0.010163
173 0.450049 0.319076 0.158477 ... 0.245050 -0.308017 -0.241448
174 0.190380 0.228906 0.417048 ... 0.143381 0.104080 -0.405662
175 -0.357893 -0.159158 -0.289425 ... -0.371118 0.579670 -0.413831
176 0.078708 0.165983 0.324744 ... 0.180002 -0.146310 -0.329281
177 -0.450046 -0.180979 -0.414211 ... -0.289672 0.616857 -0.062768
178 0.549715 0.408558 -0.055886 ... -0.170083 -0.294178 0.081205
179 -0.111201 0.472122 -0.075174 ... -0.518823 0.470020 0.047267
180 0.098447 -0.201907 0.119621 ... 0.019750 -0.268528 -0.056763
181 0.234285 0.144266 0.631389 ... 0.329869 -0.395606 0.139690
182 0.394321 0.239157 -0.043077 ... 0.423926 -0.695812 0.273296
183 0.634644 0.097258 -0.171962 ... -0.047281 -0.256005 0.162903
184 0.103204 -0.192399 -0.777966 ... 0.056135 -0.034793 0.034549
185 0.339759 0.239913 0.582344 ... 0.467792 -0.378750 -0.201036
186 -0.548353 -0.769797 0.067698 ... 0.085241 0.141845 0.210096
187 -0.167574 0.437010 0.178703 ... 0.036527 0.076168 -0.510175
188 0.048067 -0.586393 0.489235 ... 0.390122 -0.686588 0.234376
189 0.217725 0.049377 -0.477302 ... 0.089802 0.266892 -0.336807
190 -0.767265 -0.684753 0.213544 ... -0.010338 0.242734 0.254413
191 -0.428090 -0.898531 0.180227 ... 0.286884 -0.255007 0.141773
192 0.163057 -0.110622 -0.067697 ... 0.047667 -0.149770 0.577512
193 -0.174333 -0.612754 0.272226 ... 0.238010 -0.688222 0.725651
194 0.282385 0.153783 -0.218055 ... -0.519361 -0.136575 0.517739
195 0.533939 0.235234 -0.050181 ... 0.186320 -0.342353 -0.430962
196 -0.186149 -0.546681 0.344461 ... 0.130606 -0.057928 -0.451576
197 0.332607 -0.202328 0.220807 ... 1.000000 -0.718022 -0.250786
198 -0.334024 0.212138 -0.363419 ... -0.718022 1.000000 -0.193411
199 -0.167504 -0.265470 0.006970 ... -0.250786 -0.193411 1.000000
[200 rows x 200 columns]
# Transposing the dataframe calculates the correlations for each vector by making them the columns
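To tie this back to the question, here is a minimal sketch using the Spearman method. The vector length of 190 is only a guess based on the output reported in the question, and the random data and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Assumption: 200 vectors of length 190 each, standing in for the
# question's real data (the length is a guess, not known from the post).
vectors = [rng.random(190) for _ in range(200)]

# Each vector becomes one row, so the frame is 200 rows x 190 columns.
df = pd.DataFrame(vectors)

# corr() correlates columns with each other: 190 columns -> 190 x 190.
by_column = df.corr(method="spearman")

# Transposing makes each vector a column: 200 columns -> 200 x 200.
by_vector = df.T.corr(method="spearman")

print(df.shape, by_column.shape, by_vector.shape)
```

If the shapes match what you see with your own data, transposing before calling corr should give the matrix you were expecting.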
It is also worth pointing out that scipy.stats.spearmanr behaves the same way by default: it computes the correlations between column vectors.
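A small sketch of the SciPy behaviour, assuming a 200 x 10 array of random values like the one in the example above. The axis argument selects whether columns or rows are treated as the variables.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Illustrative data: 200 rows (observations) x 10 columns.
data = rng.random((200, 10))

# Default axis=0: each column is a variable -> 10 x 10 matrix.
rho_cols, _ = spearmanr(data)

# axis=1: each row is a variable -> 200 x 200 matrix.
rho_rows, _ = spearmanr(data, axis=1)

cols_shape = np.asarray(rho_cols).shape
rows_shape = np.asarray(rho_rows).shape
print(cols_shape, rows_shape)
```

So with SciPy you can pass axis=1 instead of transposing, whereas with pandas you transpose the DataFrame first.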