Applying a function in Python

Time: 2018-02-20 03:46:48

Tags: python pandas

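I'm trying to create a new column that stores all the date information as a list. It works fine on a single row, but raises an error when the function is applied to the whole data table. Can anyone help? Thanks.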

The function:

def res(dr):
    return [dr["Current Date"],dr["End Date"],dr["Begin Date"]]

The data table:

Listed Code  Current Date  Frequency  Price   Residual  Coupon      End Date    Begin Date
696          1997-06-30    1          113.49  100       112.558174  2006-06-13  1996-06-14
696          1997-05-31    1          113.49  100       112.558174  2006-06-13  1996-06-14

On a single row it returns a list:

res(bond_info.iloc[0,:])
[Timestamp('1997-06-30 00:00:00'),Timestamp('2006-06-13 00:00:00'),Timestamp('1996-06-14 00:00:00')]

On the whole data table it raises an error:

bond_info.apply(res,axis=1)
ValueError                                Traceback (most recent call last)
F:\Anaconda3\lib\site-packages\pandas\core\internals.py in create_block_manager_from_arrays(arrays, names, axes)
   4309         blocks = form_blocks(arrays, names, axes)
-> 4310         mgr = BlockManager(blocks, axes)
   4311         mgr._consolidate_inplace()

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
   2794         if do_integrity_check:
-> 2795             self._verify_integrity()
   2796 

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in _verify_integrity(self)
   3005             if block._verify_integrity and block.shape[1:] != mgr_shape[1:]:
-> 3006                 construction_error(tot_items, block.shape[1:], self.axes)
   3007         if len(self.items) != tot_items:

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in construction_error(tot_items, block_shape, axes, e)
   4279     raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 4280         passed, implied))
   4281 

ValueError: Shape of passed values is (2, 3), indices imply (2, 8)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-104-e9d749798573> in <module>()
----> 1 bond_info.apply(res,axis=1)

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4358                         f, axis,
   4359                         reduce=reduce,
-> 4360                         ignore_failures=ignore_failures)
   4361             else:
   4362                 return self._apply_broadcast(f, axis)

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4471                 index = None
   4472 
-> 4473             result = self._constructor(data=results, index=index)
   4474             result.columns = res_index
   4475 

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    273                                  dtype=dtype, copy=copy)
    274         elif isinstance(data, dict):
--> 275             mgr = self._init_dict(data, index, columns, dtype=dtype)
    276         elif isinstance(data, ma.MaskedArray):
    277             import numpy.ma.mrecords as mrecords

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in _init_dict(self, data, index, columns, dtype)
    409             arrays = [data[k] for k in keys]
    410 
--> 411         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    412 
    413     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   5602     axes = [_ensure_index(columns), _ensure_index(index)]
   5603 
-> 5604     return create_block_manager_from_arrays(arrays, arr_names, axes)
   5605 
   5606 

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in create_block_manager_from_arrays(arrays, names, axes)
   4312         return mgr
   4313     except ValueError as e:
-> 4314         construction_error(len(arrays), arrays[0].shape, axes, e)
   4315 
   4316 

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in construction_error(tot_items, block_shape, axes, e)
   4278         raise ValueError("Empty data passed with indices specified.")
   4279     raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 4280         passed, implied))
   4281 
   4282 

ValueError: Shape of passed values is (2, 3), indices imply (2, 8)

1 answer:

Answer 0 (score: 2)

Option 1
Use filter + tolist. You don't need apply here.

df.filter(regex='.*Date$').values.tolist()

[['1997-06-30', '2006-06-13', '1996-06-14'],
 ['1997-05-31', '2006-06-13', '1996-06-14']]
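
To see what the regex in Option 1 actually selects, here is a minimal sketch. It assumes the date columns hold plain strings, as the output above suggests; the sample frame is rebuilt by hand from the question and is not the asker's real data. Since filter matches labels with re.search, the shorter pattern 'Date$' should behave the same as '.*Date$'.

```python
import pandas as pd

# Hand-built sample mirroring the question's table (dates kept as strings,
# which is an assumption based on the answer's printed output).
df = pd.DataFrame({
    "Listed Code": [696, 696],
    "Current Date": ["1997-06-30", "1997-05-31"],
    "Frequency": [1, 1],
    "Price": [113.49, 113.49],
    "Residual": [100, 100],
    "Coupon": [112.558174, 112.558174],
    "End Date": ["2006-06-13", "2006-06-13"],
    "Begin Date": ["1996-06-14", "1996-06-14"],
})

# Keep only the columns whose label ends in "Date"; original order is preserved.
selected = df.filter(regex="Date$")
selected_columns = list(selected.columns)

# One inner list per row.
rows = selected.values.tolist()
print(selected_columns)
print(rows)
```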

Option 2
Alternatively, use str.endswith + loc

df.loc[:, df.columns.str.endswith('Date')].values.tolist()


[['1997-06-30', '2006-06-13', '1996-06-14'],
 ['1997-05-31', '2006-06-13', '1996-06-14']]
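
Option 2 works because str.endswith on the column index gives a boolean mask that loc then uses to pick columns. A small sketch of just that mask, using the column labels from the question (the variable names are illustrative only):

```python
import pandas as pd

# Column labels copied from the question's table.
columns = pd.Index([
    "Listed Code", "Current Date", "Frequency", "Price",
    "Residual", "Coupon", "End Date", "Begin Date",
])

# True where the label ends with "Date", False elsewhere.
mask = columns.str.endswith("Date")
mask_list = mask.tolist()

# The labels that loc would keep when given this mask.
kept = columns[mask].tolist()
print(mask_list)
print(kept)
```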

Option 3
Column indexing

df[['Current Date', 'End Date', 'Begin Date']].values.tolist()

[['1997-06-30', '2006-06-13', '1996-06-14'],
 ['1997-05-31', '2006-06-13', '1996-06-14']]
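
Since the question asks for a new column, here is one possible way to store the per-row lists back on the frame. This is a sketch under assumptions, not the answerer's method: the sample frame is rebuilt by hand, the dates are parsed to Timestamps as in the question's single-row output, and the column name "Dates" is made up. It uses zip instead of .values.tolist() because, depending on the pandas and NumPy versions, calling tolist() on datetime64 values may return raw integers or datetime objects rather than Timestamps.

```python
import pandas as pd

# Hand-built sample; `bond_info` follows the question's variable name.
bond_info = pd.DataFrame({
    "Listed Code": [696, 696],
    "Current Date": pd.to_datetime(["1997-06-30", "1997-05-31"]),
    "Frequency": [1, 1],
    "Price": [113.49, 113.49],
    "Residual": [100, 100],
    "Coupon": [112.558174, 112.558174],
    "End Date": pd.to_datetime(["2006-06-13", "2006-06-13"]),
    "Begin Date": pd.to_datetime(["1996-06-14", "1996-06-14"]),
})

date_cols = ["Current Date", "End Date", "Begin Date"]

# zip walks the three columns in parallel; iterating a datetime Series
# yields Timestamp objects, so each row becomes a list of three Timestamps.
row_lists = [list(row) for row in zip(*(bond_info[c] for c in date_cols))]

# "Dates" is a hypothetical column name; wrapping in a Series keeps each
# inner list as a single cell and aligns on the existing index.
bond_info["Dates"] = pd.Series(row_lists, index=bond_info.index)
print(bond_info["Dates"].iloc[0])
```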