Question

I have a DataFrame of trip data in which each row holds the data for one point/location.

trip_id, sequence, location, start_time
101, 1, point_a, 2020-05-01 00:00:01
101, 2, point_b, 2020-05-01 00:04:01
101, 3, point_c, 2020-05-01 00:14:01
102, 1, point_x, 2020-05-11 00:13:21
102, 2, point_y, 2020-05-11 00:14:01
103, 1, point_z, 2020-05-11 00:14:01
103, 3, point_za, 2020-05-11 00:20:01

I am trying to create a new DataFrame in which each row holds the data for two consecutive points/locations of the same trip, as shown below:

trip_id, sequence, start_location, start_time, sequence, end_location, end_time
101, 1, point_a, 2020-05-01 00:00:01, 2, point_b, 2020-05-01 00:04:01
101, 2, point_b, 2020-05-01 00:04:01, 3, point_c, 2020-05-01 00:14:01
102, 1, point_x, 2020-05-11 00:13:21, 2, point_y, 2020-05-11 00:14:01
103, 1, point_z, 2020-05-11 00:14:01, 3, point_za, 2020-05-11 00:20:01

Answer 1

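You can drop the first row and the last row of each trip, respectively, and concatenate the two results side by side: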

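import pandas as pd

# assumes df is sorted by trip_id, then sequence
bottoms = df[df.trip_id.duplicated()].reset_index(drop=True)  # all but the first row of each trip
tops = df[df.trip_id.duplicated(keep='last')].reset_index(drop=True)  # all but the last row of each trip
# rename the columns to match the desired output
tops.columns = ['trip_id', 'sequence', 'start_location', 'start_time']
bottoms.columns = ['trip_id', 'sequence', 'end_location', 'end_time']

# drop bottoms' trip_id so it is not duplicated in the result
pd.concat((tops, bottoms.drop(columns='trip_id')), axis=1)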
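
If you would rather not rely on the two frames lining up by position, the same pairing can probably also be written with a grouped shift. The following is only a sketch under a few assumptions: the sample data is retyped from the question, the function name pair_consecutive_points is made up for illustration, and the second sequence column is named end_sequence (the question shows it as a second "sequence" column) to avoid duplicate column names.

```python
import pandas as pd

# Sample data retyped from the question.
df = pd.DataFrame({
    "trip_id": [101, 101, 101, 102, 102, 103, 103],
    "sequence": [1, 2, 3, 1, 2, 1, 3],
    "location": ["point_a", "point_b", "point_c",
                 "point_x", "point_y", "point_z", "point_za"],
    "start_time": pd.to_datetime([
        "2020-05-01 00:00:01", "2020-05-01 00:04:01", "2020-05-01 00:14:01",
        "2020-05-11 00:13:21", "2020-05-11 00:14:01",
        "2020-05-11 00:14:01", "2020-05-11 00:20:01",
    ]),
})


def pair_consecutive_points(points):
    """Return one row per pair of consecutive points within each trip.

    Hypothetical helper, not part of pandas.
    """
    # Sort explicitly so the result does not depend on the input order.
    ordered = points.sort_values(["trip_id", "sequence"]).reset_index(drop=True)

    # Within each trip, shift the rows up by one so that every row
    # sees the values of the point that follows it.
    nxt = ordered.groupby("trip_id")[["sequence", "location", "start_time"]].shift(-1)
    nxt.columns = ["end_sequence", "end_location", "end_time"]

    out = pd.concat(
        [ordered.rename(columns={"location": "start_location"}), nxt], axis=1
    )

    # The last point of each trip has no successor, so its row is dropped.
    out = out.dropna(subset=["end_location"]).reset_index(drop=True)

    # The shift introduced NaN, which turned the column into floats.
    out["end_sequence"] = out["end_sequence"].astype(int)
    return out


segments = pair_consecutive_points(df)
print(segments)
```

Because the shift is done per trip_id, a trip with a single point simply produces no rows, and gaps in sequence (as in trip 103) are paired with the next available point.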
