How to handle a MemoryError when fitting GaussianMixture in sklearn (Python)?

Asked: 2017-09-25 06:29:02

Tags: python memory scikit-learn gmm

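I am trying to fit a GaussianMixture to a set of cat and dog pictures using sklearn. I pass in an array of shape (50, 30000): 50 data points (25 cat and 25 dog pictures) and 30000 features, which is what I get after converting each picture to a numpy array and resizing it to (100, 100, 3). The fit throws a MemoryError. I have 4 GB of RAM, with about 70% already in use before running this code. Can anyone suggest how to debug how much memory GaussianMixture's fit method uses? Or can anyone provide some code to do the fit in batches?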

Here is the code:

print(img_coll_cat_dog.shape)
print(img_coll_cat_dog.nbytes)
print(img_coll_cat_dog.itemsize)

Output:

(50, 30000)
12000000 bytes
8 

from sklearn import mixture

gmix = mixture.GaussianMixture(n_components=2, covariance_type='full')
gmix.fit(img_coll_cat_dog)

Here is the error I get:

MemoryError                               Traceback (most recent call last)
<ipython-input-32-c0370476a619> in <module>()
      1 gmix = mixture.GaussianMixture(n_components=2, covariance_type='full')
----> 2 gmix.fit(img_coll_cat_dog)

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/base.py in fit(self, X, y)
    205 
    206             if do_init:
--> 207                 self._initialize_parameters(X, random_state)
    208                 self.lower_bound_ = -np.infty
    209 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/base.py in _initialize_parameters(self, X, random_state)
    155                              % self.init_params)
    156 
--> 157         self._initialize(X, resp)
    158 
    159     @abstractmethod

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _initialize(self, X, resp)
    629 
    630         weights, means, covariances = _estimate_gaussian_parameters(
--> 631             X, resp, self.reg_covar, self.covariance_type)
    632         weights /= n_samples
    633 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _estimate_gaussian_parameters(X, resp, reg_covar, covariance_type)
    283                    "diag": _estimate_gaussian_covariances_diag,
    284                    "spherical": _estimate_gaussian_covariances_spherical
--> 285                    }[covariance_type](resp, X, nk, means, reg_covar)
    286     return nk, means, covariances
    287 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _estimate_gaussian_covariances_full(resp, X, nk, means, reg_covar)
    162     """
    163     n_components, n_features = means.shape
--> 164     covariances = np.empty((n_components, n_features, n_features))
    165     for k in range(n_components):
    166         diff = X - means[k]

MemoryError: 

Any help is much appreciated.

1 answer:

Answer 0 (score: 2)

Try setting covariance_type='diag'.
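To see why this probably helps, it can be useful to size the array that fails in the traceback, np.empty((n_components, n_features, n_features)). With 2 components, 30000 features and 8-byte floats, that single allocation is 2 × 30000 × 30000 × 8 bytes, about 14.4 GB, far more than 4 GB of RAM. The sketch below is only an illustration: covariance_bytes is a hypothetical helper (not part of sklearn) that counts the covariance array alone, not the total memory used by fit, and X is random synthetic data standing in for img_coll_cat_dog.

```python
import numpy as np
from sklearn import mixture


def covariance_bytes(n_components, n_features, covariance_type, itemsize=8):
    """Hypothetical helper: bytes needed for the covariance array alone."""
    # Shapes of GaussianMixture.covariances_ for each covariance_type.
    shapes = {
        "full": (n_components, n_features, n_features),
        "tied": (n_features, n_features),
        "diag": (n_components, n_features),
        "spherical": (n_components,),
    }
    return int(np.prod(shapes[covariance_type], dtype=np.int64)) * itemsize


# Compare the covariance storage for the shapes in the question.
for cov_type in ("full", "tied", "diag", "spherical"):
    print(cov_type, covariance_bytes(2, 30000, cov_type) / 1e9, "GB")

# Synthetic stand-in for img_coll_cat_dog (random values, not real images).
rng = np.random.RandomState(0)
X = rng.rand(50, 30000)

# 'diag' stores one variance per feature per component: shape (2, 30000).
gmix = mixture.GaussianMixture(n_components=2, covariance_type="diag",
                               random_state=0)
gmix.fit(X)
print(gmix.covariances_.shape)
```

Note that 'diag' assumes the features are uncorrelated within each component, so it is a different (simpler) model than 'full', not just a cheaper way to compute the same thing.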