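Each customer can have several plans, so customers are repeated across rows. I want to set a status per customer:
If every one of the customer's products has canceled_at filled in, the status is "canceled". If not every product has canceled_at filled in, but at least one does, the status is "downgrade", because the customer has lost a product.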
customer | canceled_at | status
x        | 3/27/2018   |
x        |             |
y        | 2/2/2018    |
y        | 2/2/2018    |
z        | 1/1/2018    |
a        |             |
I already have the "canceled" status working; now I only need "downgrade":
df['status'] = (df.groupby('customer')['canceled_at']
                  .transform(lambda x: x.notna().all())
                  .map({True: 'canceled'})
                  .fillna(df.status))
Expected output:
customer | canceled_at | status
x        | 3/27/2018   | downgrade
x        |             | downgrade
y        | 2/2/2018    | canceled
y        | 2/2/2018    | canceled
z        | 1/1/2018    | canceled
a        |             |
Answer 0 (score: 1)
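Here you can test the column for non-missing values, group the resulting boolean Series by customer, and use GroupBy.transform with GroupBy.all and GroupBy.any to check whether all values are True (nothing missing) or at least one value is True (at least one non-missing value). Then pass both masks to numpy.select: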
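import numpy as np
# True where canceled_at is filled, grouped per customer
g = df['canceled_at'].notna().groupby(df['customer'])
m1 = g.transform('all')   # every product canceled
m2 = g.transform('any')   # at least one product canceled
# The first matching condition wins, so m1 must come before m2.
# Note: recent NumPy versions may raise a TypeError when string choices are mixed
# with a float default such as np.nan; the output below shows it coerced to 'nan'.
df['status'] = np.select([m1, m2], ['canceled', 'downgrade'], np.nan)
print(df)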
  customer canceled_at     status
0        x   3/27/2018  downgrade
1        x         NaN  downgrade
2        y    2/2/2018   canceled
3        y    2/2/2018   canceled
4        z    1/1/2018   canceled
5        a         NaN        nan
Or, with an empty string as the default:
df['status'] = np.select([m1, m2], ['canceled', 'downgrade'], '')
print(df)
  customer canceled_at     status
0        x   3/27/2018  downgrade
1        x         NaN  downgrade
2        y    2/2/2018   canceled
3        y    2/2/2018   canceled
4        z    1/1/2018   canceled
5        a         NaN
If groups that contain only NaN should also be set to downgrade (that is, every customer that is not fully canceled):
mask = df['canceled_at'].notna().groupby(df['customer']).transform('all')
df['status'] = np.where(mask, 'canceled', 'downgrade')
print(df)
  customer canceled_at     status
0        x   3/27/2018  downgrade
1        x         NaN  downgrade
2        y    2/2/2018   canceled
3        y    2/2/2018   canceled
4        z    1/1/2018   canceled
5        a         NaN  downgrade
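For anyone who wants to run the numpy.select approach end to end, here is a minimal self-contained sketch. The sample DataFrame is rebuilt from the question's table, the variable names has_date, all_canceled and any_canceled are my own, and I am assuming that canceled_at holds strings with None for missing values. It uses an empty-string default so that all choices share one dtype.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: dates stored as strings)
df = pd.DataFrame({
    'customer': ['x', 'x', 'y', 'y', 'z', 'a'],
    'canceled_at': ['3/27/2018', None, '2/2/2018', '2/2/2018', '1/1/2018', None],
})

# True where a cancellation date is present
has_date = df['canceled_at'].notna()
g = has_date.groupby(df['customer'])

all_canceled = g.transform('all')  # every row of the customer is canceled
any_canceled = g.transform('any')  # at least one row of the customer is canceled

# np.select takes the first condition that is True for each row, so the
# stricter "all" test has to be listed before the looser "any" test.
df['status'] = np.select(
    [all_canceled, any_canceled],
    ['canceled', 'downgrade'],
    default='',
)
print(df)
```

Swapping the order of the two conditions would label every fully canceled customer as downgrade, because "any" is also True for them.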
Answer 1 (score: 1)
Here is one way to do it:
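import pandas as pd

def select_status(canceled):
    # count() ignores missing values, so c is the number of canceled products
    c = canceled.count()
    if c == 0:
        status = ''
    elif c == len(canceled):
        status = 'canceled'
    else:
        status = 'downgrade'
    # Repeat the label for every row of the group
    return pd.Series(status, index=canceled.index)

df = pd.DataFrame({'customer': ['x', 'x', 'y', 'y', 'z', 'a'],
                   'canceled_at': ['3/27/2018', None, '2/2/2018', '2/2/2018', '1/1/2018', None]})
# group_keys=False keeps the original row index so the result aligns with df
# (newer pandas versions otherwise prepend the group key to the index)
df['status'] = df.groupby('customer', group_keys=False)['canceled_at'].apply(select_status)
print(df)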
Output:
  customer canceled_at     status
0        x   3/27/2018  downgrade
1        x        None  downgrade
2        y    2/2/2018   canceled
3        y    2/2/2018   canceled
4        z    1/1/2018   canceled
5        a        None
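A possible variation on the same counting idea, sketched here as an alternative rather than as this answer's method: compute one label per customer with an aggregation and map it back onto the rows. The helper name label_group is hypothetical, and the sample data is the same as above.

```python
import pandas as pd

def label_group(canceled):
    """Return one status label for a customer's canceled_at values."""
    filled = canceled.count()  # count() skips missing values
    if filled == 0:
        return ''
    if filled == len(canceled):
        return 'canceled'
    return 'downgrade'

df = pd.DataFrame({
    'customer': ['x', 'x', 'y', 'y', 'z', 'a'],
    'canceled_at': ['3/27/2018', None, '2/2/2018', '2/2/2018', '1/1/2018', None],
})

# One label per customer, indexed by customer name
labels = df.groupby('customer')['canceled_at'].agg(label_group)

# Look up each row's customer in the per-customer labels
df['status'] = df['customer'].map(labels)
print(df)
```

Because the lookup goes through the customer column instead of the row index, this version does not depend on how groupby arranges the index of its result.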