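I am creating a new DataFrame named dfLostBusiness to hold the orders in the original DataFrame, df, that meet certain conditions and are therefore considered "lost business". I use boolean indexing on df and then append the result to dfLostBusiness. I expect the masked selections to be appended one after another, producing a dfLostBusiness of roughly 1500 rows, as in my SQL output. Instead, each masking command seems to replace every value other than the O,X rows, for whatever reason. I have also tried changing the order of the masking commands. I am using an IPython environment, which I have restarted several times with no change in the results, so something must be going on that I don't understand.
Using append: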
dfLostBusiness = pd.DataFrame()
m = (df['OrderType'].str.lower() == 'o') & (df['OrderStatus'].str.lower() == 'x')
dfLostBusiness = df[m].reset_index(drop=True)
dfLostBusiness[['OrderType', 'OrderStatus']].shape: (421, 2)
dfLostBusiness Preview:
OrderType OrderStatus
0 O X
1 O X
2 O X
3 O X
4 O X
m = (df['OrderType'].str.lower() == 'c')
dfLostBusiness.append(df[m], ignore_index=True)
dfLostBusiness[['OrderType', 'OrderStatus']].shape: (594, 2)
dfLostBusiness Preview:
OrderType OrderStatus
0 O X
1 O X
2 C S
3 C S
4 C C
m = ((df['OrderType'].str.lower() == 'q') & ((datetime.datetime.now() - df['OrderDate']) > pd.Timedelta(30, 'D')))
dfLostBusiness.append(df[m], ignore_index=True)
dfLostBusiness[['OrderType', 'OrderStatus']].shape: (1442, 2)
At this point, dfLostBusiness[dfLostBusiness['OrderType'].str.lower() == 'c'] returns an empty DataFrame
dfLostBusiness Preview:
OrderType OrderStatus
0 O X
1 O X
2 Q X
3 Q X
4 Q Q
m = ((df['OrderType'].str.lower() == 'q') & (df['OrderStatus'].str.lower() == 'r'))
dfLostBusiness.append(df[m], ignore_index=True)
dfLostBusiness[['OrderType', 'OrderStatus']].shape: (425, 2)
Here the rows drop to 425 from 1442, and there are only O,X and Q,R
dfLostBusiness Preview:
OrderType OrderStatus
0 O X
1 O X
2 O X
3 O X
4 Q R
Using concat, I get similarly unexpected results:
dfLostBusiness = pd.DataFrame()
m = (df['OrderType'].str.lower() == 'o') & (df['OrderStatus'].str.lower() == 'x')
dfLostBusiness = df[m].reset_index(drop = True)
m = (df['OrderType'].str.lower() == 'c')
pd.concat([dfLostBusiness, df[m]], ignore_index = True)
m = ((df['OrderType'].str.lower() == 'q') & ((datetime.datetime.now() - df['OrderDate']) > pd.Timedelta(30, 'D')))
pd.concat([dfLostBusiness, df[m]], ignore_index = True)
m = ((df['OrderType'].str.lower() == 'q') & (df['OrderStatus'].str.lower() == 'r'))
pd.concat([dfLostBusiness, df[m]], ignore_index = True)
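One possible explanation, sketched on toy data: both DataFrame.append and pd.concat return a new DataFrame and leave their inputs untouched, so a result that is never assigned back to dfLostBusiness is discarded. That would be consistent with every shape shown above being the 421 O,X rows plus only the rows of the most recent mask. The column values below are made up for illustration, the date-based mask is left out for brevity, and pd.concat is used throughout because DataFrame.append was removed in pandas 2.0.

```python
import pandas as pd

# Toy stand-in for df; these values are invented for illustration only.
df = pd.DataFrame({
    "OrderType":   ["O", "O", "C", "C", "Q", "Q"],
    "OrderStatus": ["X", "S", "S", "C", "R", "Q"],
})

order_type = df["OrderType"].str.lower()
order_status = df["OrderStatus"].str.lower()

# Starting point: the O,X rows, as in the question.
base = df[(order_type == "o") & (order_status == "x")].reset_index(drop=True)

# pd.concat returns a NEW frame; `base` itself is not modified.
discarded = pd.concat([base, df[order_type == "c"]], ignore_index=True)
rows_in_base_after_concat = len(base)  # same as before the call

# Assigning the result back accumulates the rows across steps.
lost = base
lost = pd.concat([lost, df[order_type == "c"]], ignore_index=True)
lost = pd.concat(
    [lost, df[(order_type == "q") & (order_status == "r")]],
    ignore_index=True,
)

# Alternative: OR the masks together and select once.
# A row matching several conditions is then included only once.
combined_mask = (
    ((order_type == "o") & (order_status == "x"))
    | (order_type == "c")
    | ((order_type == "q") & (order_status == "r"))
)
lost_single_pass = df[combined_mask].reset_index(drop=True)
```

If the conditions can overlap (for example a Q,R order that is also older than 30 days), the concatenation approach would include that row twice, while the combined mask would not.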