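Scikit-learn: RandomizedLogisticRegression gives ValueError: output array is read-only

Time: 2015-01-02 10:38:34

Tags: python numpy scikit-learn

I am trying to fit a randomized logistic regression to my data, but I cannot get it to run. Here is the code: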

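import numpy as np
from sklearn.linear_model import RandomizedLogisticRegression

# Load the feature matrix and the labels
X = np.load("X.npy")
y = np.load("y.npy")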

randomized_LR = RandomizedLogisticRegression(C=0.1, verbose=True, n_jobs=3)
randomized_LR.fit(X, y)

This raises an error:

    344     if issparse(X):
    345         size = len(weights)
    346         weight_dia = sparse.dia_matrix((1 - weights, 0), (size, size))
    347         X = X * weight_dia
    348     else:
--> 349         X *= (1 - weights)
    350
    351     C = np.atleast_1d(np.asarray(C, dtype=np.float))
    352     scores = np.zeros((X.shape[1], len(C)), dtype=np.bool)
    353

ValueError: output array is read-only

Could anyone point out what I should do to proceed?

Thank you very much

Hendra

Full traceback, as requested:

Traceback (most recent call last):
  File "temp.py", line 88, in <module>
    train_randomized_logistic_regression()
  File "temp.py", line 82, in train_randomized_logistic_regression
    randomized_LR.fit(X, y)
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.py", line 110, in fit
    sample_fraction=self.sample_fraction, **params)
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/memory.py", line 281, in __call__
    return self.func(*args, **kwargs)
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.py", line 52, in _resample_model
    for _ in range(n_resampling)):
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 660, in __call__
    self.retrieve()
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 543, in retrieve
    raise exception_type(report)
sklearn.externals.joblib.my_exceptions.JoblibValueError: JoblibValueError
___________________________________________________________________________
Multiprocessing exception:
...........................................................................
/zfs/ilps-plexest/homedirs/hbunyam1/social_graph/temp.py in <module>()
     83
     84
     85
     86 if __name__ == '__main__':
     87
---> 88     train_randomized_logistic_regression()
     89
     90
     91
     92

...........................................................................
/zfs/ilps-plexest/homedirs/hbunyam1/social_graph/temp.py in train_randomized_logistic_regression()
     77     X = np.load( 'data/issuemakers/features/new_X.npy')
     78     y = np.load( 'data/issuemakers/features/new_y.npy')
     79
     80     randomized_LR = RandomizedLogisticRegression(C=0.1, n_jobs=32)
     81
---> 82     randomized_LR.fit(X, y)
    randomized_LR.fit = <bound method RandomizedLogisticRegression.fit o...d=0.25,
           tol=0.001, verbose=False)>
    X = array([[  1.01014900e+06,   7.29970000e+04,   2....460000e+04,   3.11428571e+01,   1.88100000e+03]])
    y = array([1, 1, 1, ..., 0, 1, 1])
     83
     84
     85
     86 if __name__ == '__main__':

...........................................................................
/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.py in  fit(self=RandomizedLogisticRegression(C=0.1, fit_intercep...ld=0.25,
           tol=0.001, verbose=False), X=array([[  6.93135506e-04,   8.93676615e-04,    -1....234095e-04,  -1.19037488e-04,   4.20921021e-04]]), y=array([1, 1, 1, ..., 0, 1, 1]))
    105         )(
    106             estimator_func, X, y,
    107             scaling=self.scaling, n_resampling=self.n_resampling,
    108             n_jobs=self.n_jobs, verbose=self.verbose,
    109             pre_dispatch=self.pre_dispatch, random_state=self.random_state,
--> 110             sample_fraction=self.sample_fraction, **params)
    self.sample_fraction = 0.75
    params = {'C': 0.1, 'fit_intercept': True, 'tol': 0.001}
    111
    112         if scores_.ndim == 1:
    113             scores_ = scores_[:, np.newaxis]
    114         self.all_scores_ = scores_

 ...........................................................................
 /home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/memory.py in __call__(self=NotMemorizedFunc(func=<function _resample_model at 0x7fb5d7d12b18>), *args=(<function _randomized_logistic>, array([[  6.93135506e-04,   8.93676615e-04,  -1....234095e-04,  -1.19037488e-04,   4.20921021e-04]]), array([1, 1, 1, ..., 0, 1, 1])), **kwargs={'C': 0.1, 'fit_intercept': True, 'n_jobs': 32, 'n_resampling': 200, 'pre_dispatch': '3*n_jobs', 'random_state': None, 'sample_fraction': 0.75, 'scaling': 0.5, 'tol': 0.001, 'verbose': False})
    276     # Should be a light as possible (for speed)
    277     def __init__(self, func):
    278         self.func = func
    279
    280     def __call__(self, *args, **kwargs):
--> 281         return self.func(*args, **kwargs)
    self.func = <function _resample_model>
    args = (<function _randomized_logistic>, array([[  6.93135506e-04,   8.93676615e-04,  -1....234095e-04,  -1.19037488e-04,   4.20921021e-04]]), array([1, 1, 1, ..., 0, 1, 1]))
    kwargs = {'C': 0.1, 'fit_intercept': True, 'n_jobs': 32, 'n_resampling': 200, 'pre_dispatch': '3*n_jobs', 'random_state': None, 'sample_fraction': 0.75, 'scaling': 0.5, 'tol': 0.001, 'verbose': False}
    282
    283     def call_and_shelve(self, *args, **kwargs):
    284         return NotMemorizedResult(self.func(*args, **kwargs))
    285

...........................................................................
/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.py in _resample_model(estimator_func=<function _randomized_logistic>, X=array([[  6.93135506e-04,   8.93676615e-04,  -1....234095e-04,  -1.19037488e-04,   4.20921021e-04]]), y=array([1, 1, 1, ..., 0, 1, 1]), scaling=0.5, n_resampling=200, n_jobs=32, verbose=False, pre_dispatch='3*n_jobs', random_state=<mtrand.RandomState object>, sample_fraction=0.75, **params={'C': 0.1, 'fit_intercept': True, 'tol': 0.001})
     47                 X, y, weights=scaling * random_state.random_integers(
     48                     0, 1, size=(n_features,)),
     49                 mask=(random_state.rand(n_samples) < sample_fraction),
     50                 verbose=max(0, verbose - 1),
     51                 **params)
---> 52             for _ in range(n_resampling)):
    n_resampling = 200
     53         scores_ += active_set
     54
     55     scores_ /= n_resampling
     56     return scores_

 ...........................................................................
 /home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py in __call__(self=Parallel(n_jobs=32), iterable=<itertools.islice object>)
    655             if pre_dispatch == "all" or n_jobs == 1:
    656                 # The iterable was consumed all at once by the above for loop.
    657                 # No need to wait for async callbacks to trigger to
    658                 # consumption.
    659                 self._iterating = False
--> 660             self.retrieve()
    self.retrieve = <bound method Parallel.retrieve of Parallel(n_jobs=32)>
    661             # Make sure that we get a last message telling us we are done
    662             elapsed_time = time.time() - self._start_time
    663             self._print('Done %3i out of %3i | elapsed: %s finished',
    664                         (len(self._output),

---------------------------------------------------------------------------
Sub-process traceback:
---------------------------------------------------------------------------
ValueError                                         Fri Jan  2 12:13:54 2015
PID: 126664                Python 2.7.8: /home/hbunyam1/anaconda/bin/python
...........................................................................
/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.pyc in _randomized_logistic(X=memmap([[  6.93135506e-04,   8.93676615e-04,  -1...234095e-04,  -1.19037488e-04,   4.20921021e-04]]), y=array([1, 1, 1, ..., 0, 1, 1]), weights=array([ 0.5,  0. ,  0. ,  0.5,  0. ,  0.5,  0. ,...  0. ,  0. ,  0.5,  0. ,  0. ,  0. ,  0. ,  0.5]), mask=array([ True,  True,  True, ...,  True,  True,  True], dtype=bool), C=0.1, verbose=0, fit_intercept=True, tol=0.001)
    344     if issparse(X):
    345         size = len(weights)
    346         weight_dia = sparse.dia_matrix((1 - weights, 0), (size, size))
    347         X = X * weight_dia
    348     else:
--> 349         X *= (1 - weights)
    350
    351     C = np.atleast_1d(np.asarray(C, dtype=np.float))
    352     scores = np.zeros((X.shape[1], len(C)), dtype=np.bool)
    353

ValueError: output array is read-only
___________________________________________________________________________







[hbunyam1@zookst20 social_graph]$ python temp.py
Traceback (most recent call last):
  File "temp.py", line 88, in <module>
    train_randomized_logistic_regression()
  File "temp.py", line 82, in train_randomized_logistic_regression
    randomized_LR.fit(X, y)
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.py", line 110, in fit
    sample_fraction=self.sample_fraction, **params)
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/memory.py", line 281, in __call__
    return self.func(*args, **kwargs)
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.py", line 52, in _resample_model
    for _ in range(n_resampling)):
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 660, in __call__
    self.retrieve()
  File "/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 543, in retrieve
    raise exception_type(report)
sklearn.externals.joblib.my_exceptions.JoblibValueError: JoblibValueError
___________________________________________________________________________
Multiprocessing exception:
    ...........................................................................
/zfs/ilps-plexest/homedirs/hbunyam1/social_graph/temp.py in <module>()
     83
     84
     85
     86 if __name__ == '__main__':
     87
---> 88     train_randomized_logistic_regression()
     89
     90
     91
     92

...........................................................................
/zfs/ilps-plexest/homedirs/hbunyam1/social_graph/temp.py in train_randomized_logistic_regression()
     77     X = np.load( 'data/issuemakers/features/new_X.npy', mmap_mode='r+')
     78     y = np.load( 'data/issuemakers/features/new_y.npy', mmap_mode='r+')
     79
     80     randomized_LR = RandomizedLogisticRegression(C=0.1, n_jobs=32)
     81
---> 82     randomized_LR.fit(X, y)
        randomized_LR.fit = <bound method RandomizedLogisticRegression.fit o...d=0.25,
               tol=0.001, verbose=False)>
        X = memmap([[  1.01014900e+06,   7.29970000e+04,   2...460000e+04,   3.11428571e+01,   1.88100000e+03]])
        y = memmap([1, 1, 1, ..., 0, 1, 1])
     83
     84
     85
     86 if __name__ == '__main__':

...........................................................................
/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.py in fit(self=RandomizedLogisticRegression(C=0.1, fit_intercep...ld=0.25,
               tol=0.001, verbose=False), X=array([[  6.93135506e-04,   8.93676615e-04,  -1....234095e-04,  -1.19037488e-04,   4.20921021e-04]]), y=array([1, 1, 1, ..., 0, 1, 1]))
    105         )(
    106             estimator_func, X, y,
    107             scaling=self.scaling, n_resampling=self.n_resampling,
    108             n_jobs=self.n_jobs, verbose=self.verbose,
    109             pre_dispatch=self.pre_dispatch, random_state=self.random_state,
--> 110             sample_fraction=self.sample_fraction, **params)
        self.sample_fraction = 0.75
        params = {'C': 0.1, 'fit_intercept': True, 'tol': 0.001}
    111
    112         if scores_.ndim == 1:
    113             scores_ = scores_[:, np.newaxis]
    114         self.all_scores_ = scores_

...........................................................................
/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/memory.py in __call__(self=NotMemorizedFunc(func=<function _resample_model at 0x7f192c829b18>), *args=(<function _randomized_logistic>, array([[  6.93135506e-04,   8.93676615e-04,  -1....234095e-04,  -1.19037488e-04,   4.20921021e-04]]), array([1, 1, 1, ..., 0, 1, 1])), **kwargs={'C': 0.1, 'fit_intercept': True, 'n_jobs': 32, 'n_resampling': 200, 'pre_dispatch': '3*n_jobs', 'random_state': None, 'sample_fraction': 0.75, 'scaling': 0.5, 'tol': 0.001, 'verbose': False})
    276     # Should be a light as possible (for speed)
    277     def __init__(self, func):
    278         self.func = func
    279
    280     def __call__(self, *args, **kwargs):
--> 281         return self.func(*args, **kwargs)
        self.func = <function _resample_model>
        args = (<function _randomized_logistic>, array([[  6.93135506e-04,   8.93676615e-04,  -1....234095e-04,  -1.19037488e-04,   4.20921021e-04]]), array([1, 1, 1, ..., 0, 1, 1]))
        kwargs = {'C': 0.1, 'fit_intercept': True, 'n_jobs': 32, 'n_resampling': 200, 'pre_dispatch': '3*n_jobs', 'random_state': None, 'sample_fraction': 0.75, 'scaling': 0.5, 'tol': 0.001, 'verbose': False}
    282
    283     def call_and_shelve(self, *args, **kwargs):
    284         return NotMemorizedResult(self.func(*args, **kwargs))
    285

...........................................................................
/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.py in _resample_model(estimator_func=<function _randomized_logistic>, X=array([[  6.93135506e-04,   8.93676615e-04,  -1....234095e-04,  -1.19037488e-04,   4.20921021e-04]]), y=array([1, 1, 1, ..., 0, 1, 1]), scaling=0.5, n_resampling=200, n_jobs=32, verbose=False, pre_dispatch='3*n_jobs', random_state=<mtrand.RandomState object>, sample_fraction=0.75, **params={'C': 0.1, 'fit_intercept': True, 'tol': 0.001})
     47                 X, y, weights=scaling * random_state.random_integers(
     48                     0, 1, size=(n_features,)),
     49                 mask=(random_state.rand(n_samples) < sample_fraction),
     50                 verbose=max(0, verbose - 1),
     51                 **params)
---> 52             for _ in range(n_resampling)):
        n_resampling = 200
     53         scores_ += active_set
     54
     55     scores_ /= n_resampling
     56     return scores_

...........................................................................
/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py in __call__(self=Parallel(n_jobs=32), iterable=<itertools.islice object>)
    655             if pre_dispatch == "all" or n_jobs == 1:
    656                 # The iterable was consumed all at once by the above for loop.
    657                 # No need to wait for async callbacks to trigger to
    658                 # consumption.
    659                 self._iterating = False
--> 660             self.retrieve()
        self.retrieve = <bound method Parallel.retrieve of Parallel(n_jobs=32)>
    661             # Make sure that we get a last message telling us we are done
    662             elapsed_time = time.time() - self._start_time
    663             self._print('Done %3i out of %3i | elapsed: %s finished',
    664                         (len(self._output),

---------------------------------------------------------------------------
Sub-process traceback:
---------------------------------------------------------------------------
ValueError                                         Fri Jan  2 12:57:25 2015
PID: 127177                Python 2.7.8: /home/hbunyam1/anaconda/bin/python
...........................................................................
/home/hbunyam1/anaconda/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.pyc in _randomized_logistic(X=memmap([[  6.93135506e-04,   8.93676615e-04,  -1...234095e-04,  -1.19037488e-04,   4.20921021e-04]]), y=memmap([1, 1, 1, ..., 0, 0, 1]), weights=array([ 0.5,  0.5,  0. ,  0.5,  0.5,  0.5,  0.5,...  0. ,  0.5,  0. ,  0. ,  0.5,  0.5,  0.5,  0.5]), mask=array([ True,  True,  True, ..., False, False,  True], dtype=bool), C=0.1, verbose=0, fit_intercept=True, tol=0.001)
    344     if issparse(X):
    345         size = len(weights)
    346         weight_dia = sparse.dia_matrix((1 - weights, 0), (size, size))
    347         X = X * weight_dia
    348     else:
--> 349         X *= (1 - weights)
    350
    351     C = np.atleast_1d(np.asarray(C, dtype=np.float))
    352     scores = np.zeros((X.shape[1], len(C)), dtype=np.bool)
    353

ValueError: output array is read-only
___________________________________________________________________________

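4 answers:

Answer 0: (score: 3)

I got the same error when running this on a 32-processor Ubuntu server. The problem persists for any n_jobs value greater than 1, but it goes away when n_jobs is left at its default of 1 [as benbo describes].

This is a bug in RandomizedLogisticRegression, in which multiple accesses to the same block of memory prevent each other from using it.

See the scikit-learn GitHub page, where this issue and possible fixes are discussed in depth: https://github.com/scikit-learn/scikit-learn/issues/4597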
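The line that fails in the traceback, X *= (1 - weights), is an in-place multiplication, and NumPy refuses to write into a read-only array. The sketch below reproduces only that behavior. It uses a small hand-made array that is marked read-only by hand (an assumption for illustration, not joblib's actual memory map), and contrasts it with an out-of-place multiplication, which allocates a fresh array.

```python
import numpy as np

# Hypothetical stand-in for the array a worker receives; marked read-only by hand.
X = np.arange(6, dtype=float).reshape(2, 3)
X.setflags(write=False)
weights = np.array([0.5, 0.0, 0.5])

try:
    X *= (1 - weights)  # in-place: needs a writable output array
    in_place_error = None
except ValueError as exc:
    in_place_error = str(exc)

# Out-of-place multiplication allocates a new, writable array instead.
X_scaled = X * (1 - weights)

print(in_place_error)
print(X_scaled.flags.writeable)
```

The out-of-place form leaves the read-only input untouched, which is why it does not trigger the error.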

Answer 1: (score: 0)

You may need to load the data with np.load('X.npy', mmap_mode='r+'), as described in the numpy.load documentation.
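For reference, here is a small sketch of what mmap_mode changes when loading a .npy file. The file name and contents are made up for illustration. Note that the last traceback in the question already loads the data with mmap_mode='r+' and still ends in the same error, so this on its own may not be enough when n_jobs is greater than 1.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "X.npy")  # hypothetical file name
    np.save(path, np.ones((4, 3)))

    X_read_only = np.load(path, mmap_mode="r")    # memory-mapped, read-only
    X_read_write = np.load(path, mmap_mode="r+")  # memory-mapped, writable

    read_only_flag = X_read_only.flags.writeable
    read_write_flag = X_read_write.flags.writeable

    # Release the memory maps before the temporary directory is removed.
    del X_read_only, X_read_write

print(read_only_flag, read_write_flag)
```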

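Answer 2: (score: 0)

Try changing the number of jobs, perhaps starting with 1. I ran into the same error when running RandomizedLogisticRegression with n_jobs=20 (on a powerful machine). With n_jobs set to its default of 1, however, the code ran without any problems.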

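Answer 3: (score: 0)

The cause is the max_nbytes parameter of the Parallel call in the Joblib library, which scikit-learn uses internally when you set n_jobs > 1. Its default value is 1M. The parameter is defined as:

  Threshold on the size of arrays passed to the workers that triggers automated memory mapping in temp_folder.

More details can be found here: https://joblib.readthedocs.io/en/latest/generated/joblib.Parallel.html#

So once an array exceeds 1M, joblib memory-maps it for the workers, and the run fails with ValueError: assignment destination is read-only. This error is easy to reproduce. Consider the following code: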

import numpy as np
from sklearn.linear_model import RandomizedLogisticRegression
# Create some random data
samples = 2621
X = np.random.randint(1,100, size=(samples, 50))
y = np.random.randint(100,200, size=(samples))

randomized_LR = RandomizedLogisticRegression(C=0.1, verbose=True, n_jobs=3)
randomized_LR.fit(X, y)

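This runs without any problem. If we check the size of X with print(X.nbytes/1024**2), it shows that the X array is 0.9998321533203125 megabytes, so it is not too large.

If we run the same code again, but change the number of samples to 2622: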

import numpy as np
from sklearn.linear_model import RandomizedLogisticRegression

samples = 2622
X = np.random.randint(1,100, size=(samples, 50))
print(X.nbytes/1024**2)
y = np.random.randint(100,200, size=(samples))

randomized_LR = RandomizedLogisticRegression(C=0.1, verbose=True, n_jobs=3)
randomized_LR.fit(X, y)

Python crashes with ValueError: output array is read-only, and checking the size of the X array tells us that it is 1.000213623046875 megabytes, so it is too large.
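The two sizes quoted above can be checked by hand. The sketch below assumes that np.random.randint produces 8-byte integers (int64, as it typically does on Linux) and that the 1M threshold means 1024**2 bytes. Both are assumptions, chosen because they match the numbers in this answer.

```python
import numpy as np

ONE_MIB = 1024 ** 2  # assumption: the "1M" threshold read as 2**20 bytes


def size_in_mib(n_samples, n_features=50, dtype=np.int64):
    """Size in MiB of an (n_samples, n_features) array of the given dtype."""
    return n_samples * n_features * np.dtype(dtype).itemsize / ONE_MIB


for n_samples in (2621, 2622):
    size = size_in_mib(n_samples)
    print(n_samples, size, "over threshold" if size > 1 else "under threshold")
```

With 50 columns of 8 bytes each, every row adds 400 bytes, so row 2622 is the one that pushes the array past the threshold.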