Question

I am analyzing customer return behavior and am working with the following dataframe df:

Customer_ID | Order | Store_ID | Date     | Item_ID | Count_of_units | Event_Return_Flag
ABC         | 123   | 1        | 23052016 | A       | -1             | Y
ABC         | 345   | 1        | 23052016 | B       | 1              | 0
ABC         | 567   | 1        | 24052016 | C       | -1             | 0

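I need to add another column that identifies customers who returned an item during the event (Event_Return_Flag = Y) and also purchased an item on the same day at the same store.

In other words, I want to add a flag df['target'] with the following logic: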

The same Customer_ID, Store_ID and Date as a record with Event_Return_Flag = Y
But a different Item_ID than the record with Event_Return_Flag = Y
Count_of_units > 0
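The three conditions above can be sketched as a self-merge. This is only one possible reading of the logic, assuming the sample rows from the table, that Date is stored as a string, and that the frame has a default integer index; the column name Returned_Item_ID and the variables keys, returns and pairs are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question (Date kept as a string).
df = pd.DataFrame({
    "Customer_ID": ["ABC", "ABC", "ABC"],
    "Order": [123, 345, 567],
    "Store_ID": [1, 1, 1],
    "Date": ["23052016", "23052016", "24052016"],
    "Item_ID": ["A", "B", "C"],
    "Count_of_units": [-1, 1, -1],
    "Event_Return_Flag": ["Y", "0", "0"],
})

keys = ["Customer_ID", "Store_ID", "Date"]

# Rows returned during the event; keep only the key columns and the item.
returns = (
    df.loc[df["Event_Return_Flag"] == "Y", keys + ["Item_ID"]]
      .rename(columns={"Item_ID": "Returned_Item_ID"})
)

# Pair every row with the event returns of the same customer, store and date.
# reset_index() keeps the original row labels in a column named "index".
pairs = df.reset_index().merge(returns, on=keys, how="inner")

# Keep pairs where the item differs from the returned one and units are positive.
pairs = pairs[
    (pairs["Item_ID"] != pairs["Returned_Item_ID"])
    & (pairs["Count_of_units"] > 0)
]

# Flag the original rows that survived the filter.
df["target"] = np.where(df.index.isin(pairs["index"]), "Y", "0")
print(df[["Order", "Item_ID", "target"]])
```

With the sample rows, only order 345 is flagged: it shares customer, store and date with the event return of item A, has a different Item_ID, and a positive unit count.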

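I don't know how to accomplish this in Python pandas.

I thought of creating a key by concatenating Customer_ID, Store_ID and Date, then splitting the frame by Event_Return_Flag and using an isin statement, like this: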

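import numpy as np

# Store_ID and Date are cast to str so the concatenation also works for numeric columns
df['key'] = df['Customer_ID'] + '_' + df['Store_ID'].astype(str) + '_' + df['Date'].astype(str)
df_1 = df.loc[df['Event_Return_Flag'] == 'Y']
df_2 = df.loc[df['Event_Return_Flag'] == '0']
df_3 = df_2.loc[df_2['Count_of_units'] > 0]
df_3['target'] = np.where(df_3['key'].isin(df_1['key']), 'Y', 0)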

This approach seems somewhat wrong, but I can't come up with anything better. I get this error message on the last line, the one with np.where:

C:\Users\xxx\AppData\Local\Continuum\Anaconda2\lib\site-packages\ipykernel\__main__.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if __name__ == '__main__':
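This warning typically appears when a column is assigned on a frame that was itself produced by filtering another frame, so pandas cannot tell whether the write is meant to reach the original. A minimal sketch of one common way around it, assuming the sample rows from the table: take an explicit .copy() of the filtered rows before adding the column. The names returns and purchases are made up for the illustration, and, like the key-based attempt, this does not yet check that the Item_ID differs.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question (Date kept as a string).
df = pd.DataFrame({
    "Customer_ID": ["ABC", "ABC", "ABC"],
    "Order": [123, 345, 567],
    "Store_ID": [1, 1, 1],
    "Date": ["23052016", "23052016", "24052016"],
    "Item_ID": ["A", "B", "C"],
    "Count_of_units": [-1, 1, -1],
    "Event_Return_Flag": ["Y", "0", "0"],
})

# Composite key; astype(str) makes the concatenation work for numeric columns.
df["key"] = (
    df["Customer_ID"] + "_" + df["Store_ID"].astype(str) + "_" + df["Date"].astype(str)
)

# Event returns are only read from, so no copy is needed here.
returns = df.loc[df["Event_Return_Flag"] == "Y"]

# .copy() makes purchases an independent frame, so adding a column is unambiguous.
purchases = df.loc[
    (df["Event_Return_Flag"] == "0") & (df["Count_of_units"] > 0)
].copy()

purchases["target"] = np.where(purchases["key"].isin(returns["key"]), "Y", "0")
print(purchases[["Order", "key", "target"]])
```

The flag lives only on purchases here; df itself is left without a target column.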

I also tried the following line, but couldn't figure out how to match rows based on the Event_Return_Flag column:

df['target'] = np.where((df.Count_of_units > 0) & (df.groupby(['key', 'Item_ID']).Event_Return_Flag.transform('nunique') > 1), 'Y', '')
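Grouping by the key together with Item_ID puts each item in its own group, so a row can never be compared with a return of a different item. A possible groupby/transform variant, sketched under the assumption of the sample rows from the table, groups by customer, store and date only and counts the distinct returned items per group. The helper column _returned_item and the variable names are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question (Date kept as a string).
df = pd.DataFrame({
    "Customer_ID": ["ABC", "ABC", "ABC"],
    "Order": [123, 345, 567],
    "Store_ID": [1, 1, 1],
    "Date": ["23052016", "23052016", "24052016"],
    "Item_ID": ["A", "B", "C"],
    "Count_of_units": [-1, 1, -1],
    "Event_Return_Flag": ["Y", "0", "0"],
})

keys = ["Customer_ID", "Store_ID", "Date"]
is_return = df["Event_Return_Flag"] == "Y"

# Item_ID on event-return rows, missing everywhere else.
df["_returned_item"] = df["Item_ID"].where(is_return)

# Distinct returned items per customer/store/date (missing values are ignored).
n_returned = df.groupby(keys)["_returned_item"].transform("nunique")

# True when the row's own Item_ID is among the returned items of its group.
own_item_returned = is_return.groupby(
    [df[k] for k in keys + ["Item_ID"]]
).transform("max")

# At least one returned item in the group other than the row's own item.
other_return = (n_returned - own_item_returned.astype(int)) > 0

df["target"] = np.where(other_return & (df["Count_of_units"] > 0), "Y", "0")
df = df.drop(columns="_returned_item")
print(df[["Order", "Item_ID", "target"]])
```

Because everything is computed on df itself, no filtered sub-frame is written to, and the copy warning does not come up.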

Python Pandas: add a column based on a complex if statement

0 answers: