I have the following DataFrame, my_df:
Person event time
---------------------------------
John A 2017-10-11
John B 2017-10-12
John C 2017-10-14
John D 2017-10-15
Ann X 2017-09-01
Ann Y 2017-09-02
Dave M 2017-10-05
Dave N 2017-10-07
Dave Q 2017-10-20
I want to create a new column holding the (event, time) pair for each row. It should look like this:
Person event time event_time
------------------------------------------------------
John A 2017-10-11 (A, 2017-10-11)
John B 2017-10-12 (B, 2017-10-12)
John C 2017-10-14 (C, 2017-10-14)
John D 2017-10-15 (D, 2017-10-15)
Ann X 2017-09-01 (X, 2017-09-01)
Ann Y 2017-09-02 (Y, 2017-09-02)
Dave M 2017-10-05 (M, 2017-10-05)
Dave N 2017-10-07 (N, 2017-10-07)
Dave Q 2017-10-20 (Q, 2017-10-20)
Here is my code:
my_df['event_time'] = my_df.apply(lambda row: (row['event'] , row['time']), axis=1)
But I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in create_block_manager_from_arrays(arrays, names, axes)
4309 blocks = form_blocks(arrays, names, axes)
-> 4310 mgr = BlockManager(blocks, axes)
4311 mgr._consolidate_inplace()
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
2794 if do_integrity_check:
-> 2795 self._verify_integrity()
2796
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in _verify_integrity(self)
3005 if block._verify_integrity and block.shape[1:] != mgr_shape[1:]:
-> 3006 construction_error(tot_items, block.shape[1:], self.axes)
3007 if len(self.items) != tot_items:
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in construction_error(tot_items, block_shape, axes, e)
4279 raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 4280 passed, implied))
4281
ValueError: Shape of passed values is (128, 2), indices imply (128, 3)
Any idea what I am doing wrong in my code? Thanks!
Answer 0 (score: 2)
You can use:
my_df['event_time'] = my_df[['event','time']].apply(tuple, axis=1)
Or:
my_df['event_time'] = tuple(zip(my_df['event'], my_df['time']))
Or:
my_df['event_time'] = [tuple(x) for x in my_df[['event','time']].values.tolist()]
All three produce the same result:
print (my_df)
Person event time event_time
0 John A 2017-10-11 (A, 2017-10-11)
1 John B 2017-10-12 (B, 2017-10-12)
2 John C 2017-10-14 (C, 2017-10-14)
3 John D 2017-10-15 (D, 2017-10-15)
4 Ann X 2017-09-01 (X, 2017-09-01)
5 Ann Y 2017-09-02 (Y, 2017-09-02)
6 Dave M 2017-10-05 (M, 2017-10-05)
7 Dave N 2017-10-07 (N, 2017-10-07)
8 Dave Q 2017-10-20 (Q, 2017-10-20)
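For anyone who wants to try this without the original data, here is a minimal, self-contained sketch that checks the three approaches agree. The sample frame is a shortened copy of the one in the question, and storing time as plain strings is an assumption, since the question does not state the column's dtype.

```python
import pandas as pd

# Shortened sample of the question's data. Keeping "time" as strings is an
# assumption; the original dtype is not stated in the question.
my_df = pd.DataFrame({
    "Person": ["John", "John", "Ann"],
    "event": ["A", "B", "X"],
    "time": ["2017-10-11", "2017-10-12", "2017-09-01"],
})

# Approach 1: apply tuple() to each row of the two-column selection.
pairs_apply = my_df[["event", "time"]].apply(tuple, axis=1).tolist()

# Approach 2: zip the two columns together.
pairs_zip = list(zip(my_df["event"], my_df["time"]))

# Approach 3: convert the underlying values to nested lists, then to tuples.
pairs_values = [tuple(x) for x in my_df[["event", "time"]].values.tolist()]

# All three build the same list of (event, time) tuples.
assert pairs_apply == pairs_zip == pairs_values

# Assign the pairs as a new column of tuples.
my_df["event_time"] = pairs_zip
print(my_df)
```

The zip version avoids a Python-level function call per row, so it is usually the one to reach for on larger frames.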
Answer 1 (score: 1)
Without apply:
df.assign(event_time=list(zip(df.event,df.time)))
Out[1011]:
Person event time event_time
0 John A 2017-10-11 (A, 2017-10-11)
1 John B 2017-10-12 (B, 2017-10-12)
2 John C 2017-10-14 (C, 2017-10-14)
3 John D 2017-10-15 (D, 2017-10-15)
4 Ann X 2017-09-01 (X, 2017-09-01)
5 Ann Y 2017-09-02 (Y, 2017-09-02)
6 Dave M 2017-10-05 (M, 2017-10-05)
7 Dave N 2017-10-07 (N, 2017-10-07)
8 Dave Q 2017-10-20 (Q, 2017-10-20)
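One thing worth noting about the assign approach: it returns a new DataFrame and leaves the original untouched, so the result has to be kept. A small sketch, using a shortened sample frame with time stored as strings (both are assumptions for illustration):

```python
import pandas as pd

# Shortened sample of the question's data; string dates are an assumption.
df = pd.DataFrame({
    "Person": ["Dave", "Dave"],
    "event": ["M", "N"],
    "time": ["2017-10-05", "2017-10-07"],
})

# assign() builds and returns a new DataFrame with the extra column.
result = df.assign(event_time=list(zip(df["event"], df["time"])))

# The original frame is unchanged; only the returned frame has the column.
has_column_in_original = "event_time" in df.columns
has_column_in_result = "event_time" in result.columns
print(has_column_in_original, has_column_in_result)
print(result["event_time"].tolist())
```

To keep the new column on the same name, rebind it: df = df.assign(...).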