I have a data frame with two columns, Trajectory and Direction (shown further down together with the desired result).
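My goal is to change direction whenever three consecutive identical turn commands (LEFT or RIGHT) are encountered. In this example, my new column would be Resulting_Direction (assume it is not in df initially).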
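Currently I do this with row-by-row if statements, which is very slow and inefficient. I would like to use a mask to set the resulting direction only on the rows where it changes, and then fill the remaining rows with fillna(method="ffill"). The expected result is: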
Trajectory  Direction  Resulting_Direction
STRAIGHT    NORTH      NORTH
STRAIGHT    NaN        NORTH
LEFT        NaN        WEST
LEFT        NaN        WEST
LEFT        NaN        WEST
STRAIGHT    NaN        WEST
STRAIGHT    NaN        WEST
RIGHT       NaN        NORTH
RIGHT       NaN        NORTH
RIGHT       NaN        NORTH
I think the problem in my attempt lies in df['direction'].dropna().shift(). How can I find the previous value in the same column that is not NaN?
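On the narrower point of the previous non-NaN value: forward-filling and then shifting by one row gives, for each row, the last non-missing value seen strictly before it. By contrast, dropna() removes the NaN rows before the shift, so the shifted values exist only at the surviving index labels. A minimal sketch on a made-up Series (the name s and its values are illustrative only; recent pandas versions prefer ffill() over fillna(method="ffill")):

```python
import pandas as pd

# Illustrative column: a known heading, two gaps, another known heading, a gap.
s = pd.Series(["NORTH", None, None, "WEST", None])

# ffill() carries the last non-missing value forward; shift() then moves
# everything down one row, so row i sees the last value from rows 0..i-1.
prev_known = s.ffill().shift()
print(prev_known.tolist())
```

The first row has no predecessor, so its entry stays missing.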
Answer 0 (score: 1)
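IIUC, the task is to detect where the direction changes, assuming a change happens at the start of each run of 3 consecutive turn commands: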
thresh = 3
# mark the consecutive direction commands
blocks = df.Trajectory.ne(df.Trajectory.shift()).cumsum()
# group by blocks
groups = df.groupby(blocks)
# enumerate each block
df['mask'] = groups.cumcount()
# shift each block up by thresh - 1 rows and take the result mod thresh:
# only the first row of every complete group of thresh commands ends up with thresh - 1
df['mask'] = groups['mask'].shift(1-thresh) % thresh
# map each turn command to a rotation step (clockwise is positive)
changes = {'LEFT': -1, 'RIGHT': 1}
# all the directions
directions = ['NORTH', 'EAST', 'SOUTH', 'WEST']
# rotate the lookup so that offset 0 is the starting direction
start = df['Direction'].iloc[0]
start_idx = directions.index(start)
# e.g. a start of EAST gives {3: 'NORTH', 0: 'EAST', 1: 'SOUTH', 2: 'WEST'}
directions = {(k - start_idx) % 4: v for k, v in enumerate(directions)}
# update direction changes
direction_changes = (df.Trajectory
                       .where(df['mask'].eq(thresh - 1))  # rows where a change happens
                       .map(changes)                      # replace each turn command with its step
                       .fillna(0)                         # no direction change elsewhere
                    )
# mod 4 for the 4 direction
# and map
df['Resulting_Direction'] = (direction_changes.cumsum() % 4).map(directions)
Output:
   Trajectory  Direction  Resulting_Direction  mask
0    STRAIGHT      NORTH                NORTH   NaN
1    STRAIGHT        NaN                NORTH   NaN
2        LEFT        NaN                 WEST   2.0
3        LEFT        NaN                 WEST   NaN
4        LEFT        NaN                 WEST   NaN
5    STRAIGHT        NaN                 WEST   NaN
6    STRAIGHT        NaN                 WEST   NaN
7       RIGHT        NaN                NORTH   2.0
8       RIGHT        NaN                NORTH   NaN
9       RIGHT        NaN                NORTH   NaN
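For experimenting with other inputs, the same idea can be packaged as a function that leaves no helper column behind. This is only a sketch: resulting_direction, COMPASS and TURNS are hypothetical names, and the shifted cumcount is replaced by an equivalent test (a row's position within its run is a multiple of thresh and at least thresh rows of the run remain).

```python
import pandas as pd

COMPASS = ["NORTH", "EAST", "SOUTH", "WEST"]  # clockwise order
TURNS = {"LEFT": -1, "RIGHT": 1}              # RIGHT is one step clockwise


def resulting_direction(trajectory, start, thresh=3):
    """Hypothetical helper: heading per row, one turn per `thresh` identical turn commands."""
    # Label runs of identical consecutive commands.
    blocks = trajectory.ne(trajectory.shift()).cumsum()
    # Position of each row inside its run (0, 1, 2, ...) and the run's length.
    pos = trajectory.groupby(blocks).cumcount()
    size = blocks.map(blocks.value_counts())
    # True on the first row of every complete group of `thresh` commands.
    starts_group = (pos % thresh == 0) & (size - pos >= thresh)
    # -1 / +1 where a LEFT / RIGHT group starts, 0 everywhere else.
    steps = trajectory.map(TURNS).where(starts_group).fillna(0).astype(int)
    # Accumulate the turns from the starting heading, wrapping around the compass.
    idx = (COMPASS.index(start) + steps.cumsum()) % 4
    return idx.map(dict(enumerate(COMPASS)))


# The sample data from the question.
df = pd.DataFrame({
    "Trajectory": ["STRAIGHT", "STRAIGHT", "LEFT", "LEFT", "LEFT",
                   "STRAIGHT", "STRAIGHT", "RIGHT", "RIGHT", "RIGHT"],
    "Direction": ["NORTH"] + [None] * 9,
})
df["Resulting_Direction"] = resulting_direction(df["Trajectory"], df["Direction"].iloc[0])
print(df)
```

Passing the starting heading explicitly makes it easy to check starts other than NORTH, for example resulting_direction(df["Trajectory"], "EAST").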