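I am analysing a table like the one below. Wherever the event column shows transaction, the offer_id column contains None. I want to fill None with the forward-filled value only when the previous event is offer viewed; otherwise, fill None with 0 or leave it as None.
The data frame is: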
import pandas as pd

df = pd.DataFrame({
    'event': ['offer_received', 'offer_viewed', 'transaction', 'transaction', 'offer_received', 'transaction'],
    'user': ['A', 'A', 'A', 'A', 'A', 'A'],
    'value': [0, 0, 1.09, 2.55, 0, 3.02],
    'offer_id': ['0b1e1539f2cc45b7b9fa7c272da2e1d7', '0b1e1539f2cc45b7b9fa7c272da2e1d7', 'None', 'None', '3f207df678b143eea3cee63160fa8bed', 'None'],
    'days': [0, 0.25, 9.75, 11, 0, 9.75],
})
event           user  value  offer_id                           days
offer_received  A     0.00   0b1e1539f2cc45b7b9fa7c272da2e1d7    0.00
offer_viewed    A     0.00   0b1e1539f2cc45b7b9fa7c272da2e1d7    0.25
transaction     A     1.09   None                                9.75
transaction     A     2.55   None                               11.00
offer_received  A     0.00   3f207df678b143eea3cee63160fa8bed    0.00
transaction     A     3.02   None                                9.75
I tried df.offer_id.fillna(method = 'ffill'), but how do I put a condition on the event column so that None in offer_id is forward-filled only when the previous event is offer_viewed?
My expected result is that only a transaction row directly preceded by offer_viewed receives the forward-filled offer_id.
Answer 0 (score: 0)
I think you can get there with shift(), ffill() and where():
import numpy as np
import pandas as pd

df = pd.DataFrame({'e': ['r', 'v', 't', 'r', 't'], 'oid': [1, 1, np.nan, 2, np.nan]})
df
# e oid
# 0 r 1.0
# 1 v 1.0
# 2 t NaN
# 3 r 2.0
# 4 t NaN
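# Forward-fill oid, but take the filled value only where the previous event is 'v';
# everywhere else keep the original oid (including NaN).
df.oid = df.oid.ffill().where(df.e.shift() == 'v', df.oid)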
df
# e oid
# 0 r 1.0
# 1 v 1.0
# 2 t 1.0
# 3 r 2.0
# 4 t NaN
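The same idea can be applied to the data frame from the question. This is only a sketch under one assumption: the question's constructor stores missing ids as the string 'None' rather than a real missing value, so they are converted to NaN first so that ffill() treats them as missing. The names offer_id and prev_viewed are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'event': ['offer_received', 'offer_viewed', 'transaction',
              'transaction', 'offer_received', 'transaction'],
    'user': ['A'] * 6,
    'value': [0, 0, 1.09, 2.55, 0, 3.02],
    'offer_id': ['0b1e1539f2cc45b7b9fa7c272da2e1d7',
                 '0b1e1539f2cc45b7b9fa7c272da2e1d7',
                 'None', 'None',
                 '3f207df678b143eea3cee63160fa8bed', 'None'],
    'days': [0, 0.25, 9.75, 11, 0, 9.75],
})

# Assumption: 'None' is a literal string in this frame, so turn it into NaN.
offer_id = df['offer_id'].mask(df['offer_id'] == 'None')

# True only on rows whose immediately preceding row is an offer_viewed event.
prev_viewed = df['event'].shift() == 'offer_viewed'

# Use the forward-filled id where prev_viewed is True, otherwise keep the original.
df['offer_id'] = offer_id.ffill().where(prev_viewed, offer_id)
```

Only the first transaction (index 2) picks up an offer id here; the transactions at index 3 and 5 stay missing because the row before each of them is not offer_viewed.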
You can even skip ffill() and use shift() twice:
df = pd.DataFrame({'e': ['r', 'v', 't', 'r', 't'], 'oid': [1, 1, np.nan, 2, np.nan]})
# Take the previous row's oid where the previous event is 'v'; otherwise keep the original oid.
df.oid = df.oid.shift().where(df.e.shift() == 'v', df.oid)
df
# e oid
# 0 r 1.0
# 1 v 1.0
# 2 t 1.0
# 3 r 2.0
# 4 t NaN
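If the real data holds more than one user, a plain shift() would look across user boundaries. The sketch below is one possible extension, not part of the answer above: it assumes rows are already in chronological order within each user, adds a hypothetical second user 'B' to the toy frame, shifts within each user via groupby, and then fills what is left with 0, which the question mentions as an acceptable fallback.

```python
import numpy as np
import pandas as pd

# Toy frame; user 'B' is hypothetical and only illustrates the group boundary.
df = pd.DataFrame({
    'user': ['A', 'A', 'B', 'B', 'B'],
    'e':    ['r', 'v', 't', 'v', 't'],
    'oid':  [1, 1, np.nan, 2, np.nan],
})

# Shift within each user so B's first row never sees A's last row.
prev_e = df.groupby('user')['e'].shift()
prev_oid = df.groupby('user')['oid'].shift()

# Copy the previous oid only when the previous event (same user) was 'v'.
df['oid'] = prev_oid.where(prev_e == 'v', df['oid'])

# Optional fallback from the question: replace the remaining NaN with 0.
df['oid'] = df['oid'].fillna(0)
```

With an ungrouped shift(), B's first transaction would inherit oid 1 from A's viewed offer; with the grouped shift it has no previous event and ends up as 0.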