I have a DataFrame (df) whose head looks like this:
BB NEW_DATE PICKED
1123 03/10/2018 03/10/2018
1123 04/10/2018 04/10/2018
1123 05/10/2018 05/10/2018
1123 09/10/2018 09/10/2018
1123 04/01/2013 01/04/2013
1123 07/01/2013 07/01/2013
1123 08/01/2013 08/01/2013
I am trying to add a new column named FINAL whose values depend in part on the previous row's value of FINAL.
if df['PICKED'] < df['FINAL'].shift(-1):
    if df['NEW_DATE'].isnumeric():
        df['FINAL'] = df['NEW_DATE']
    else:
        df['FINAL'] = df['PICKED']
else:
    df['FINAL'] = df['PICKED']
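For each row, if PICKED is less than the previous row's value of FINAL, then the current FINAL equals the current row's NEW_DATE if NEW_DATE is a valid date, and otherwise equals PICKED. If PICKED is greater than or equal to the previous row's value of FINAL, then FINAL equals PICKED.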
So for the DataFrame above, the FINAL column would look like this:
BB NEW_DATE PICKED FINAL
1123 03/10/2018 03/10/2018 03/10/2018
1123 04/10/2018 04/10/2018 04/10/2018
1123 05/10/2018 05/10/2018 05/10/2018
1123 09/10/2018 09/10/2018 09/10/2018
1123 04/01/2013 01/04/2013 04/01/2013
1123 07/01/2013 07/01/2013 07/01/2013
1123 08/01/2013 08/01/2013 08/01/2013
I tried the following code, without any success:
df['FINAL'] = np.where(df['PICKED'] < df['FINAL'].shift(-1), df.NEW_DATE.fillna(df.DATE), df['PICKED'])
I also tried:
for row in df.iterrows:
    if index == 0 :
        row['FINAL'] = row['NEW_DATE']
    else:
        if row['PICKED'] < row['FINAL'].shift(-1):
            if isinstance(row['NEW_DATE'], pd.DatetimeIndex):
                row['FINAL'] = row['NEW_DATE']
            else:
                row['FINAL'] = row['PICKED']
        else:
            row['FINAL'] = row['PICKED']
But this raises an error: TypeError: 'method' object is not iterable
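The TypeError most likely comes from `df.iterrows` being referenced without parentheses, so the `for` statement receives the method object itself rather than an iterator. The sketch below shows the calling convention on a small hypothetical two-row frame (not the data from the question): `iterrows()` yields `(index, row)` pairs, and because `row` may be a copy, values are written back through `df.at` instead of through `row`.

```python
import pandas as pd

# Hypothetical toy frame, only used to illustrate the iteration pattern
df = pd.DataFrame({"PICKED": pd.to_datetime(["2018-03-10", "2018-04-10"])})

# Create the target column up front so it has a datetime dtype
df["FINAL"] = pd.NaT

# iterrows is a method: it must be called, and it yields (index, row) pairs
for index, row in df.iterrows():
    # assigning to row would not reliably change df, so write to the frame directly
    df.at[index, "FINAL"] = row["PICKED"]
```

This only addresses the error message; it does not implement the FINAL rule itself.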
Answer 0: (score: 1)
I can't think of a way to do this without a loop, so here is one approach.
import pandas as pd

# Initialise the first value of FINAL; it acts as the previous value
# in the first iteration of the loop
prev_final = df.loc[0, 'PICKED']
# list collecting the values that will become the FINAL column
list_final = [prev_final]
# loop over the rows with itertuples, skipping the first row, which is already handled
for new_date, picked in df.loc[1:, ['NEW_DATE', 'PICKED']].itertuples(index=False):
    # check both conditions at once: if either fails, FINAL takes the value from PICKED
    # (pd.Timestamp is used here because newer pandas versions no longer provide pd.datetime)
    if picked < prev_final and isinstance(new_date, pd.Timestamp):
        # add the value from NEW_DATE
        list_final.append(new_date)
        # and update prev_final for the next iteration of the loop
        prev_final = new_date
    else:  # same idea if the conditions are not met
        list_final.append(picked)
        prev_final = picked
# outside the loop, create the column from the list
df['FINAL'] = list_final
print(df)
BB NEW_DATE PICKED FINAL
0 1123 2018-03-10 2018-03-10 2018-03-10
1 1123 2018-04-10 2018-04-10 2018-04-10
2 1123 2018-05-10 2018-05-10 2018-05-10
3 1123 2018-09-10 2018-09-10 2018-09-10
4 1123 2013-04-01 2013-01-04 2013-04-01
5 1123 2013-07-01 2013-07-01 2013-07-01
6 1123 2013-08-01 2013-08-01 2013-08-01
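For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same loop. A few things in it are assumptions and not part of the original posts: the helper name `add_final` is hypothetical, the date strings are parsed month-first (`%m/%d/%Y`) because that matches the printed output above, and "valid date" is tested with `pd.notna` in place of an `isinstance` check.

```python
import pandas as pd


def add_final(df):
    """Return a copy of df with a FINAL column built row by row.

    Hypothetical helper; assumes NEW_DATE and PICKED are datetime columns.
    """
    finals = []
    prev_final = None  # no previous FINAL exists for the first row
    for new_date, picked in df[["NEW_DATE", "PICKED"]].itertuples(index=False):
        # take NEW_DATE only if PICKED is before the previous FINAL
        # and NEW_DATE is not missing; otherwise fall back to PICKED
        if prev_final is not None and picked < prev_final and pd.notna(new_date):
            current = new_date
        else:
            current = picked
        finals.append(current)
        prev_final = current
    out = df.copy()
    out["FINAL"] = finals
    return out


# Sample data copied from the question
raw = pd.DataFrame({
    "BB": [1123] * 7,
    "NEW_DATE": ["03/10/2018", "04/10/2018", "05/10/2018", "09/10/2018",
                 "04/01/2013", "07/01/2013", "08/01/2013"],
    "PICKED": ["03/10/2018", "04/10/2018", "05/10/2018", "09/10/2018",
               "01/04/2013", "07/01/2013", "08/01/2013"],
})
# Assumption: month-first parsing, as in the printed output above
for col in ("NEW_DATE", "PICKED"):
    raw[col] = pd.to_datetime(raw[col], format="%m/%d/%Y")

result = add_final(raw)
print(result)
```

If the dates are actually day-first, changing the format string to `%d/%m/%Y` would alter the comparisons, so it is worth confirming which convention the data uses before relying on the result.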