I have a data frame that looks like this:
LastName Date ObjectCol1 ObjectCol2 NumCol1 NumCol2 CurrentState ExpectedState
ABC March A1 A2
ABC June A1 A2
XYZ March A2 A2
XYZ June A2 A2
XYZ July A2 A2
AAA March D3 D2
AAA June D2 D1
DEF March C1 C1
DEF June C2 C3
DEF July C3 C3
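I want to create a new column with the following rule, applied per LastName. If the row's Date is not the maximum date for that LastName, compare Intermediate 2 (apparently the ExpectedState column) with Intermediate 1 (apparently the CurrentState column) of the next subsequent Date for that LastName: if they are equal, the new column should be, say, "Hit", otherwise "Miss". If the row's Date is the maximum date for that LastName, the new column should be "Yet to be seen".
So the result would look like: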
LastName Date ObjectCol1 ObjectCol2 NumCol1 NumCol2 CurrentState ExpectedState Result
ABC March A1 A2 Miss (because A2 here != Intermediate 1 value in the next row)
ABC June A1 A2 Yet to be seen
XYZ March A2 A2 Hit
XYZ June A2 A2 Hit
XYZ July A2 A2 Yet to be seen
AAA March D3 D2 Hit
AAA June D2 D1 Yet to be seen
DEF March C1 C1 Miss
DEF June C2 C3 Hit
DEF July C3 C3 Yet to be seen
Answer 0 (score: 0)
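import numpy as np
# Next row's CurrentState within each LastName (rows must already be in date order)
df['Pre-result'] = df.groupby('LastName')['CurrentState'].shift(-1)
# "Hit" when the next CurrentState equals this row's ExpectedState, otherwise "Miss"
df['Result'] = np.where(df['Pre-result'] == df['ExpectedState'], "Hit", "Miss")
# The last row of each LastName has no next row, so shift(-1) leaves NaN there
df['Result'] = np.where(df['Pre-result'].isna(), "Yet to be seen", df['Result'])
del df['Pre-result']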
Result:
LastName Date ObjectCol1 ObjectCol2 NumCol1 NumCol2 CurrentState ExpectedState Result
ABC March A1 A2 Miss (because A2 here != Intermediate 1 value in the next row)
ABC June A1 A2 Yet to be seen
XYZ March A2 A2 Hit
XYZ June A2 A2 Hit
XYZ July A2 A2 Yet to be seen
AAA March D3 D2 Hit
AAA June D2 D1 Yet to be seen
DEF March C1 C1 Miss
DEF June C2 C3 Hit
DEF July C3 C3 Yet to be seen
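One caveat with the snippet above: shift(-1) follows row order, so it only gives the right answer if each LastName's rows are already in chronological order. Below is a minimal, self-contained sketch that does not rely on that. It assumes Date holds full English month names, and it omits the empty ObjectCol/NumCol columns. The helper name add_result and the month lookup are my own illustration, not part of the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question (the empty Object/Num columns are left out).
df = pd.DataFrame({
    "LastName": ["ABC", "ABC", "XYZ", "XYZ", "XYZ",
                 "AAA", "AAA", "DEF", "DEF", "DEF"],
    "Date": ["March", "June", "March", "June", "July",
             "March", "June", "March", "June", "July"],
    "CurrentState": ["A1", "A1", "A2", "A2", "A2",
                     "D3", "D2", "C1", "C2", "C3"],
    "ExpectedState": ["A2", "A2", "A2", "A2", "A2",
                      "D2", "D1", "C1", "C3", "C3"],
})

# Assumption: dates are month names within a single year.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
MONTH_NUMBER = {name: i for i, name in enumerate(MONTHS, start=1)}


def add_result(frame):
    """Hypothetical helper: return a copy of frame with a Result column."""
    out = frame.copy()
    # Order rows chronologically without touching the caller's row order.
    month = out["Date"].map(MONTH_NUMBER)
    ordered = out.loc[month.sort_values(kind="stable").index]
    # Next CurrentState within each LastName; NaN on each group's last date.
    next_state = ordered.groupby("LastName")["CurrentState"].shift(-1)
    # Realign to the original row order before comparing.
    next_state = next_state.reindex(out.index)
    is_last = next_state.isna()
    hit = out["ExpectedState"] == next_state
    out["Result"] = np.where(is_last, "Yet to be seen",
                             np.where(hit, "Hit", "Miss"))
    return out


result = add_result(df)
print(result)
```

Note that a genuinely missing CurrentState in the following row would also be reported as "Yet to be seen" here, same as in the original snippet. If that matters, the last row per group could be detected from the dates instead.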