Skipping stops in a pandas timetable

Time: 2018-09-20 14:04:09

Tags: python pandas dataframe timetable

I am trying to skip some stops in a pandas timetable:

    departure   arrival     in  out
0   a           b           1   0
1   b           '#delete'   2   0
2   '#delete'   d           0   3
3   d           e           1   1

I want to skip the #delete values in the timetable and combine the in and out values of the affected rows:

    departure   arrival     in  out
0   a           b           1   0
1   b           d           2   3
2   d           e           1   1

Does anyone know how to achieve this?

Edit: A slightly modified version of Wen's solution worked for me:

df = df.mask(df == "#delete")
df.departure = df.departure.ffill()
df.arrival = df.arrival.bfill()
df = df.groupby(['departure', 'arrival']).sum()

2 answers:

Answer 0: (score: 2)

This is more like a custom fillna problem.

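# Turn the placeholder into NaN (the cell text here includes the quotes, as in the table above)
df = df.mask(df == "'#delete'")
# A deleted departure takes the previous stop; a deleted arrival takes the next one
df.departure = df.departure.ffill()
df.arrival = df.arrival.bfill()
# Rows that now describe the same leg are merged and their in/out values summed
df.groupby(['departure', 'arrival'], as_index=False).sum()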
Out[761]: 
  departure arrival  in  out
0         a       b   1    0
1         b       d   2    3
2         d       e   1    1
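
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same mask / ffill / bfill / groupby idea applied to the sample timetable from the question. The function name `skip_stops` and its `marker` parameter are made up for illustration, and the sketch assumes the cells hold the plain string `#delete` without the surrounding quotes.

```python
import pandas as pd


def skip_stops(timetable: pd.DataFrame, marker: str = "#delete") -> pd.DataFrame:
    """Collapse legs that pass through stops flagged with `marker` (hypothetical helper)."""
    stops = timetable[["departure", "arrival"]]
    # Replace the marker with NaN in the two stop columns only
    stops = stops.mask(stops == marker)
    out = timetable.copy()
    # A deleted departure inherits the previous stop, a deleted arrival the next one
    out["departure"] = stops["departure"].ffill()
    out["arrival"] = stops["arrival"].bfill()
    # Rows that now describe the same leg are merged; sort=False keeps timetable order
    grouped = out.groupby(["departure", "arrival"], as_index=False, sort=False)
    return grouped[["in", "out"]].sum()


df = pd.DataFrame({
    "departure": ["a", "b", "#delete", "d"],
    "arrival": ["b", "#delete", "d", "e"],
    "in": [1, 2, 0, 1],
    "out": [0, 0, 3, 1],
})
result = skip_stops(df)
print(result)
```

The fill directions matter: forward-filling arrival and backward-filling departure instead would produce self-loops such as b to b on this sample. Also note that the groupby merges every pair of rows with the same departure and arrival, so a leg that legitimately appears twice in the timetable would be summed as well.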

Answer 1: (score: 1)

Something like this (untested):

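import numpy as np

# Positions of rows whose arrival was deleted; the next row carries the real arrival
skipfrom = np.where(df.arrival == '#delete')[0]
skipto = skipfrom + 1
# Copy arrival and out from the following row (assumes a default 0..n-1 index)
df.loc[skipfrom, 'arrival'] = df.loc[skipto, 'arrival'].values
df.loc[skipfrom, 'out'] = df.loc[skipto, 'out'].values
# Drop the rows that only existed to carry the deleted departure
df = df[df.departure != '#delete']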
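
The positional idea above copies only `out` from the following row. Since the question asks to combine both `in` and `out`, here is a rough variant that adds the two rows together instead. It is only a sketch and rests on assumptions not stated in the thread: every row with a deleted arrival is immediately followed by the row with the matching deleted departure, there are no two deleted stops in a row, and the last row never ends in `#delete`. The name `merged` is just for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "departure": ["a", "b", "#delete", "d"],
    "arrival": ["b", "#delete", "d", "e"],
    "in": [1, 2, 0, 1],
    "out": [0, 0, 3, 1],
})

# Work on a copy with a clean 0..n-1 index so positions and labels coincide
merged = df.reset_index(drop=True)

# Positions of legs that end at a deleted stop, and of the rows that continue them
skipfrom = np.flatnonzero((merged["arrival"] == "#delete").to_numpy())
skipto = skipfrom + 1

# Take the real arrival from the continuing row
merged.loc[skipfrom, "arrival"] = merged.loc[skipto, "arrival"].to_numpy()
# Add the passenger counts of both halves of the leg
counts = ["in", "out"]
merged.loc[skipfrom, counts] = (
    merged.loc[skipfrom, counts].to_numpy() + merged.loc[skipto, counts].to_numpy()
)
# Drop the continuing rows, which are now folded into the previous ones
merged = merged[merged["departure"] != "#delete"].reset_index(drop=True)
print(merged)
```

Unlike the groupby approach, this never merges two rows just because they share the same departure and arrival, but it breaks as soon as the pairing assumption does not hold.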