Question

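I'm trying to create a dataset to check my logistic regression algorithm, but I can't create a pandas DataFrame from a dictionary. I get a "Data must be 1-dimensional" exception.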

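    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # class 0: both features drawn from [0, 2)
    x1 = np.random.random(size=(10,1))*2
    x2 = np.random.random(size=(10,1))*2

    # class 1: both features drawn from [2, 4)
    x3 = np.random.random(size=(10,1))*2 + 2
    x4 = np.random.random(size=(10,1))*2 + 2

    # labels for the two classes
    y0 = np.zeros(shape=(10,1))
    y1 = np.ones(shape=(10,1))

    plt.scatter(x1,x2, color='g', marker='o')
    plt.scatter(x3,x4, color='r', marker='o')

    # every value here is an array of shape (20, 1)
    dict_data = { 'X1':np.concatenate((x1,x3)),
                  'X2':np.concatenate((x2,x4)),
                  'Y':np.concatenate((y0,y1))}

    # this is the line that raises the exception
    data = pd.DataFrame(dict_data, index=np.arange(20))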

This is the output I get, with the error "Data must be 1-dimensional":

    --------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-49-fe81f079ebc6> in <module>
         13 dict_data = { 'X1':np.concatenate((x1,x3)), 'X2':np.concatenate((x2,x4)),'Y':np.concatenate((y0,y1))}
         14 #print(dict_data.shape)
    ---> 15 data = pd.DataFrame(dict_data, index=np.arange(20).reshape(20))

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
        328                                  dtype=dtype, copy=copy)
        329         elif isinstance(data, dict):
    --> 330             mgr = self._init_dict(data, index, columns, dtype=dtype)
        331         elif isinstance(data, ma.MaskedArray):
        332             import numpy.ma.mrecords as mrecords

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _init_dict(self, data, index, columns, dtype)
        459             arrays = [data[k] for k in keys]
        460 
    --> 461         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
        462 
        463     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
       6166 
       6167     # don't force copy because getting jammed in an ndarray anyway
    -> 6168     arrays = _homogenize(arrays, index, dtype)
       6169 
       6170     # from BlockManager perspective

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _homogenize(data, index, dtype)
       6475                 v = lib.fast_multiget(v, oindex.values, default=np.nan)
       6476             v = _sanitize_array(v, index, dtype=dtype, copy=False,
    -> 6477                                 raise_cast_failure=False)
       6478 
       6479         homogenized.append(v)

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/series.py in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
       3273     elif subarr.ndim > 1:
       3274         if isinstance(data, np.ndarray):
    -> 3275             raise Exception('Data must be 1-dimensional')
       3276         else:
       3277             subarr = _asarray_tuplesafe(data, dtype=dtype)

    Exception: Data must be 1-dimensional

Answer 1

np.random.random(size=(10,1)) produces a 2-D array of shape (10, 1), but pandas builds a DataFrame as a collection of 1-D arrays, one per column.
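A quick way to confirm this is to inspect the arrays before handing them to pandas. The sketch below reuses the variable names from the question (col is a name introduced here for illustration) and only prints shapes; it does not build the DataFrame.

```python
import numpy as np

# Same construction as in the question: each array has shape (10, 1).
x1 = np.random.random(size=(10, 1)) * 2
x3 = np.random.random(size=(10, 1)) * 2 + 2

# Concatenating along the default axis 0 stacks the rows,
# so the result is still 2-D, with shape (20, 1).
col = np.concatenate((x1, x3))

print(x1.shape, x1.ndim)    # (10, 1) 2
print(col.shape, col.ndim)  # (20, 1) 2
```

Any dictionary value with ndim equal to 2 is what pandas rejects here.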

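So use np.random.random(size=(10)) to make 1-D arrays, and then use those to build the DataFrame. The same applies to the labels: np.zeros(shape=(10,1)) and np.ones(shape=(10,1)) are also 2-D, so create them with shape=10 as well.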
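Here is a minimal sketch of that fix, keeping the question's variable names and leaving out the plotting. The last two lines show an alternative that is not part of the answer above: flattening an existing (10, 1) array with ravel() (col2d and flat are names introduced here for illustration).

```python
import numpy as np
import pandas as pd

# 1-D arrays of length 10 instead of (10, 1) columns.
x1 = np.random.random(size=10) * 2
x2 = np.random.random(size=10) * 2
x3 = np.random.random(size=10) * 2 + 2
x4 = np.random.random(size=10) * 2 + 2

# The labels must be 1-D as well.
y0 = np.zeros(10)
y1 = np.ones(10)

# Each dictionary value is now a 1-D array of length 20.
dict_data = {
    'X1': np.concatenate((x1, x3)),
    'X2': np.concatenate((x2, x4)),
    'Y': np.concatenate((y0, y1)),
}

data = pd.DataFrame(dict_data, index=np.arange(20))
print(data.shape)  # (20, 3)

# Alternative when the arrays are already 2-D: flatten them first.
col2d = np.random.random(size=(10, 1))
flat = col2d.ravel()  # shape (10,)
```

The index argument is optional here; without it pandas assigns the same 0 to 19 range by default.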
