apply() - ValueError: Shape of passed values is X, indices imply Y

Time: 2017-11-10 12:40:35

Tags: pandas

I have this DataFrame:

import pandas as pd

df = pd.DataFrame([[1,2,3], [2,3,4]], columns=['a', 'b', 'c'])

I can transform it like this:

pd.DataFrame(df.apply(lambda x: [x.a, x.b], 1).tolist())

or like this:

pd.DataFrame(df.apply(lambda x: [1, 2, 3, 4], 1).tolist())
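For reference, a minimal sketch of what the first of these calls produces, next to a possible alternative. It assumes pandas 0.23 or later, where `apply` accepts `result_type='expand'`; the variable names `pairs` and `expanded` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 3, 4]], columns=['a', 'b', 'c'])

# Each row returns a list; .tolist() yields a list of lists, which the
# DataFrame constructor turns into integer-labelled columns.
pairs = pd.DataFrame(df.apply(lambda x: [x.a, x.b], axis=1).tolist())

# Assumes pandas >= 0.23: result_type='expand' spreads each returned
# list across columns without the round trip through .tolist().
expanded = df.apply(lambda x: [x.a, x.b], axis=1, result_type='expand')

print(pairs)
print(expanded)
```

Both frames should hold the same values; the second form just avoids rebuilding the DataFrame by hand.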

Now let's look at df_train:

df_train.dtypes

gives:

actionDate              datetime64[ns]
actionType                      object
clientSubmitDate        datetime64[ns]
content                         object
conversationId                  object
creationDate            datetime64[ns]
folder                          object
hasForwarded                      bool
hasReplied                        bool
lastModified            datetime64[ns]
messageDelivery         datetime64[ns]
originalConversation            object
recipients                      object
sender                          object
subject                         object
dtype: object

But when I do this:

df_train.apply(lambda x: [1, 2], 1)

I get:

ValueError: Shape of passed values is (10, 2), indices imply (10, 15)
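The message reads as a complaint from the DataFrame constructor: the data has one width, while the column labels it was given imply another. A minimal sketch that triggers the same kind of error outside `apply`, using made-up data (a 10 x 2 array and 15 hypothetical column labels); the exact wording of the message may differ between pandas versions.

```python
import numpy as np
import pandas as pd

values = np.zeros((10, 2))                    # 10 rows, 2 columns of data
labels = ['col%d' % i for i in range(15)]     # 15 column labels (hypothetical)

try:
    pd.DataFrame(values, columns=labels)
    message = None
except ValueError as exc:
    # The constructor refuses data whose shape disagrees with the labels.
    message = str(exc)

print(message)
```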

I have no idea why, because I swear this worked yesterday.

My actual goal is to transform the DataFrame like this:

df_train.apply(lambda x: [x.actionType == 'DELETED', x.hasForwarded, x.hasReplied], 1).tolist()

But for some reason I get this ValueError, and I can't figure out what the problem is.
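Since each of the three values is either a column or a simple comparison on a column, the same result can probably be built without a row-wise `apply`. A sketch on a small stand-in frame copied from the sample rows in this question; the name `sample` and the output column name `isDeleted` are made up for illustration.

```python
import pandas as pd

sample = pd.DataFrame(
    {
        'actionType': ['DELETED', 'DELETED', 'FORWARD', 'DELETED', 'DELETED'],
        'hasReplied': [False] * 5,
        'hasForwarded': [False] * 5,
    },
    index=pd.Index([129931, 123345, 136596, 123344, 123343], name='id'),
)

# Compare the whole column at once, then place the three columns side by side.
# The explicit columns= argument pins the column order.
features = pd.DataFrame(
    {
        'isDeleted': sample['actionType'] == 'DELETED',
        'hasForwarded': sample['hasForwarded'],
        'hasReplied': sample['hasReplied'],
    },
    columns=['isDeleted', 'hasForwarded', 'hasReplied'],
)

# One inner list per row, in the same order as the lambda's return value.
rows = features.values.tolist()
print(rows)
```

Column-wise operations like this never ask pandas to infer a result shape from per-row return values, so the shape check that raises here does not come into play.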

df_train[['actionType', 'hasReplied', 'hasForwarded']].head()

shows:

       actionType  hasReplied  hasForwarded
id                                         
129931    DELETED       False         False
123345    DELETED       False         False
136596    FORWARD       False         False
123344    DELETED       False         False
123343    DELETED       False         False

Full stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/internals.py in create_block_manager_from_arrays(arrays, names, axes)
   4632         blocks = form_blocks(arrays, names, axes)
-> 4633         mgr = BlockManager(blocks, axes)
   4634         mgr._consolidate_inplace()

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
   3027         if do_integrity_check:
-> 3028             self._verify_integrity()
   3029 

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/internals.py in _verify_integrity(self)
   3238             if block._verify_integrity and block.shape[1:] != mgr_shape[1:]:
-> 3239                 construction_error(tot_items, block.shape[1:], self.axes)
   3240         if len(self.items) != tot_items:

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/internals.py in construction_error(tot_items, block_shape, axes, e)
   4602     raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 4603         passed, implied))
   4604 

ValueError: Shape of passed values is (130182, 3), indices imply (130182, 15)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-114-59888fe52c27> in <module>()
----> 1 df_train.apply(lambda x: [x.actionType == 'DELETED', x.hasForwarded, x.hasReplied], 1)

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4852                         f, axis,
   4853                         reduce=reduce,
-> 4854                         ignore_failures=ignore_failures)
   4855             else:
   4856                 return self._apply_broadcast(f, axis)

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4965                 index = None
   4966 
-> 4967             result = self._constructor(data=results, index=index)
   4968             result.columns = res_index
   4969 

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    328                                  dtype=dtype, copy=copy)
    329         elif isinstance(data, dict):
--> 330             mgr = self._init_dict(data, index, columns, dtype=dtype)
    331         elif isinstance(data, ma.MaskedArray):
    332             import numpy.ma.mrecords as mrecords

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/frame.py in _init_dict(self, data, index, columns, dtype)
    459             arrays = [data[k] for k in keys]
    460 
--> 461         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    462 
    463     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   6138     axes = [_ensure_index(columns), _ensure_index(index)]
   6139 
-> 6140     return create_block_manager_from_arrays(arrays, arr_names, axes)
   6141 
   6142 

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/internals.py in create_block_manager_from_arrays(arrays, names, axes)
   4635         return mgr
   4636     except ValueError as e:
-> 4637         construction_error(len(arrays), arrays[0].shape, axes, e)
   4638 
   4639 

~/miniconda3/envs/daimler/lib/python3.5/site-packages/pandas/core/internals.py in construction_error(tot_items, block_shape, axes, e)
   4601         raise ValueError("Empty data passed with indices specified.")
   4602     raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 4603         passed, implied))
   4604 
   4605 

ValueError: Shape of passed values is (130182, 3), indices imply (130182, 15)
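The trace suggests that `apply` is trying to assemble the per-row results into a DataFrame that reuses the original 15 columns. One possible workaround, offered as a sketch rather than a confirmed fix for this pandas version, is to return a labelled `pd.Series` from the function so that the result's columns are stated explicitly. The frame `mixed`, the helper `to_flags`, and the label `isDeleted` are hypothetical stand-ins; `df_train` itself is not reproduced here.

```python
import pandas as pd

# Small stand-in with mixed dtypes, including a datetime column.
mixed = pd.DataFrame({
    'actionDate': pd.to_datetime(['2017-11-01', '2017-11-02', '2017-11-03']),
    'actionType': ['DELETED', 'FORWARD', 'DELETED'],
    'hasForwarded': [False, True, False],
    'hasReplied': [False, False, True],
})

def to_flags(row):
    # Returning a Series with its own index tells apply which columns to build.
    return pd.Series(
        [row.actionType == 'DELETED', row.hasForwarded, row.hasReplied],
        index=['isDeleted', 'hasForwarded', 'hasReplied'],
    )

flags = mixed.apply(to_flags, axis=1)
flag_rows = flags.values.tolist()
print(flags)
```

On pandas 0.23 and later, passing `result_type='expand'` or `result_type='reduce'` to `apply` may be another option worth trying.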

0 answers:

No answers