I can't figure out why I get a different covariance matrix when I use np.cov:
import numpy as np
from tabulate import tabulate

np.random.seed(1)
a = np.random.random((10,3))
# Center each column by subtracting its mean
b = a - np.mean(a,axis=0,keepdims=True)

print("np.cov(a.T):")
print(tabulate(np.cov(a.T),floatfmt='.2f',tablefmt='grid'))
print("\nnp.dot(b.T,b):")
print(tabulate(np.dot(b.T,b),floatfmt='.2f',tablefmt='grid'))
Output:
np.cov(a.T):
+-------+-------+-------+
| 0.10 | 0.03 | -0.01 |
+-------+-------+-------+
| 0.03 | 0.08 | -0.07 |
+-------+-------+-------+
| -0.01 | -0.07 | 0.12 |
+-------+-------+-------+
np.dot(b.T,b):
+-------+-------+-------+
| 0.92 | 0.26 | -0.09 |
+-------+-------+-------+
| 0.26 | 0.72 | -0.60 |
+-------+-------+-------+
| -0.09 | -0.60 | 1.07 |
+-------+-------+-------+
Any idea why the results differ? What formula does numpy use to compute the covariance?
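For reference, here is a minimal sketch of the comparison, assuming np.cov is called with its default settings (which give the unbiased estimate, dividing by N-1). The names n, scatter and cov_manual are introduced only for illustration. The printed tables above are consistent with this: 0.92 / 9 is roughly 0.10 for N = 10.

```python
import numpy as np

np.random.seed(1)
a = np.random.random((10, 3))              # 10 observations (rows), 3 variables (columns)
b = a - np.mean(a, axis=0, keepdims=True)  # center each column

n = a.shape[0]                             # number of observations
scatter = np.dot(b.T, b)                   # unnormalized sum of products
cov_manual = scatter / (n - 1)             # divide by N-1, the np.cov default (ddof=1)

# Compare against np.cov with the default and with the biased (divide by N) variant
print(np.allclose(np.cov(a.T), cov_manual))
print(np.allclose(np.cov(a.T, ddof=0), scatter / n))
```

Both comparisons should print True if the assumption about the default normalization holds.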
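Update: I was not normalizing the covariance expression above. Dividing by (N-1), as suggested in the comments, fixed that. However, on my real data I still get results that differ slightly from np.cov. It looks like I get different results depending on whether the input data was saved with np.savetxt or with np.save: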
import numpy as np
from tabulate import tabulate

# a1 is read from the binary .npy file (written with np.save)
a1 = np.load('d1.npy')[:,0:3]
print("np.loadtxt:")
print(tabulate(a1,floatfmt='.20f',tablefmt='grid'))
# a2 is read from the text file (written with np.savetxt)
a2 = np.loadtxt('/d1.txt')[:,0:3]
print("\nload:")
print(tabulate(a2,floatfmt='.20f',tablefmt='grid'))

# Center each column by subtracting its mean
b1 = a1 - np.mean(a1,axis=0,keepdims=True)
b2 = a2 - np.mean(a2,axis=0,keepdims=True)

print("\nnp.cov(a1.T):")
print(tabulate(np.cov(a1.T),floatfmt='.14f',tablefmt='grid'))
print("\nnp.dot(b1.T,b):")
print(tabulate(np.dot(b1.T,b1)/float(len(b1)-1),floatfmt='.14f',tablefmt='grid'))
print("\nnp.cov(a2.T):")
print(tabulate(np.cov(a2.T),floatfmt='.14f',tablefmt='grid'))
print("\nnp.dot(b2.T,b):")
print(tabulate(np.dot(b2.T,b2)/float(len(b2)-1),floatfmt='.14f',tablefmt='grid'))
Output:
np.loadtxt:
+------------------------ +------------------------ +------------------------+
| 0.12400000542402267456 | 0.29300001263618469238 | 0.08000000566244125366 |
+------------------------ +------------------------ +------------------------+
| 0.12300000339746475220 | 0.28900000452995300293 | 0.06900000572204589844 |
+------------------------ +------------------------ +------------------------+
| 0.13800001144409179688 | 0.28100001811981201172 | 0.07400000095367431641 |
+------------------------ +------------------------ +------------------------+
| 0.13300000131130218506 | 0.26200002431869506836 | 0.07700000703334808350 |
+------------------------ +------------------------ +------------------------+
| 0.13800001144409179688 | 0.27900001406669616699 | 0.06700000166893005371 |
+------------------------ +------------------------ +------------------------+
| 0.12900000810623168945 | 0.26100000739097595215 | 0.07700000703334808350 |
+------------------------ +------------------------ +------------------------+
| 0.12700000405311584473 | 0.25600001215934753418 | 0.07700000703334808350 |
+------------------------ +------------------------ +------------------------+
| 0.12700000405311584473 | 0.26200002431869506836 | 0.08200000226497650146 |
+------------------------ +------------------------ +------------------------+
| 0.12400000542402267456 | 0.28000000119209289551 | 0.07300000637769699097 |
+------------------------ +------------------------ +------------------------+
| 0.12600000202655792236 | 0.30200001597404479980 | 0.06599999964237213135 |
+------------------------ +------------------------ +------------------------+
np.load:
+------------------------ +------------------------ +------------------------+
| 0.12400000542402267456 | 0.29300001263618469238 | 0.08000000566244125366 |
+------------------------ +------------------------ +------------------------+
| 0.12300000339746475220 | 0.28900000452995300293 | 0.06900000572204589844 |
+------------------------ +------------------------ +------------------------+
| 0.13800001144409179688 | 0.28100001811981201172 | 0.07400000095367431641 |
+------------------------ +------------------------ +------------------------+
| 0.13300000131130218506 | 0.26200002431869506836 | 0.07700000703334808350 |
+------------------------ +------------------------ +------------------------+
| 0.13800001144409179688 | 0.27900001406669616699 | 0.06700000166893005371 |
+------------------------ +------------------------ +------------------------+
| 0.12900000810623168945 | 0.26100000739097595215 | 0.07700000703334808350 |
+------------------------ +------------------------ +------------------------+
| 0.12700000405311584473 | 0.25600001215934753418 | 0.07700000703334808350 |
+------------------------ +------------------------ +------------------------+
| 0.12700000405311584473 | 0.26200002431869506836 | 0.08200000226497650146 |
+------------------------ +------------------------ +------------------------+
| 0.12400000542402267456 | 0.28000000119209289551 | 0.07300000637769699097 |
+------------------------ +------------------------ +------------------------+
| 0.12600000202655792236 | 0.30200001597404479980 | 0.06599999964237213135 |
+------------------------ +------------------------ +------------------------+
np.cov(a1.T):
+------------------- +------------------- +-------------------+
| 0.00003121113778 | -0.00001961109117 | -0.00000486667563 |
+------------------- +------------------- +-------------------+
| -0.00001961109117 | 0.00024427771650 | -0.00005066667516 |
+------------------- +------------------- +-------------------+
| -0.00000486667563 | -0.00005066667516 | 0.00002951112509 |
+------------------- +------------------- +-------------------+
np.dot(b1.T,b):
+------------------- +------------------- +-------------------+
| 0.00003121113696 | -0.00001961109228 | -0.00000486667523 |
+------------------- +------------------- +-------------------+
| -0.00001961109228 | 0.00024427770404 | -0.00005066667654 |
+------------------- +------------------- +-------------------+
| -0.00000486667523 | -0.00005066667654 | 0.00002951112219 |
+------------------- +------------------- +-------------------+
np.cov(a2.T):
+------------------- +------------------- +-------------------+
| 0.00003121113778 | -0.00001961109117 | -0.00000486667563 |
+------------------- +------------------- +-------------------+
| -0.00001961109117 | 0.00024427771650 | -0.00005066667516 |
+------------------- +------------------- +-------------------+
| -0.00000486667563 | -0.00005066667516 | 0.00002951112509 |
+------------------- +------------------- +-------------------+
np.dot(b2.T,b):
+------------------- +------------------- +-------------------+
| 0.00003121113778 | -0.00001961109117 | -0.00000486667563 |
+------------------- +------------------- +-------------------+
| -0.00001961109117 | 0.00024427771650 | -0.00005066667516 |
+------------------- +------------------- +-------------------+
| -0.00000486667563 | -0.00005066667516 | 0.00002951112509 |
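+------------------- +------------------- +-------------------+
As you can see, the result labeled np.dot(b1.T,b), computed from the array read with np.load, is slightly different from the other three.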
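One possible explanation, offered as an assumption because the files themselves are not shown: the 20-digit values printed above (for example 0.12400000542402267456) look like float32 numbers, so d1.npy may hold float32 data. np.load keeps the stored dtype, np.loadtxt returns float64 by default, and np.cov promotes its input to at least float64, so only the manual np.dot on the float32 array would be computed in single precision. The sketch below uses random stand-in data (a32, a64) and a hypothetical helper manual_cov, not the original files.

```python
import numpy as np

np.random.seed(1)
# Stand-ins for the two loaded arrays (assumption: the .npy file holds float32)
a32 = np.random.random((10, 3)).astype(np.float32)  # like np.load('d1.npy')
a64 = a32.astype(np.float64)                        # like np.loadtxt output

def manual_cov(a):
    """Covariance via centered dot product, normalized by N-1."""
    b = a - np.mean(a, axis=0, keepdims=True)
    return np.dot(b.T, b) / float(len(b) - 1)

cov32 = manual_cov(a32)   # stays in float32 throughout
cov64 = manual_cov(a64)   # computed in float64

print(cov32.dtype, cov64.dtype, np.cov(a32.T).dtype)
print(np.allclose(np.cov(a32.T), cov64))   # np.cov works in float64 either way
print(np.abs(cov32 - cov64).max())         # size of the single-precision discrepancy
```

If this is the cause, converting the loaded array with a1.astype(np.float64) before centering should bring the manual result in line with np.cov.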