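I'm trying to label accounts as new, current, lost, or returning, but I'm having trouble with the logic. The row index is the account, the columns are years, and the values are 1 and 0, indicating whether the account was active in that year. Here is what I've come up with so far. I'm not sure whether this will work or whether I'm even close, and I'm also not sure what the logic for finding returning customers should look like.
df2 is the original DataFrame, and df3 = df2.shift(periods=1, axis=1)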
def differences():
    if df2 != df3 & df2 == 1:
        return "New"
    elif df2 != df3 & df2 == 0:
        return "Lost"
    elif df2 == df3 & df2 == 0:
        return ""
    else:
        return "Continuing"
differences()
When I run this code, I get the following error:
couldn't find matching opcode for 'and_bdl'
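Answer 0 (score: 1)
The following logic may work for your case.
Edit: Based on your comment, I modified the code so that it checks every column except the last one.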
import io
import pandas as pd

data = """account 2019 2018 2017 2016 2015
alex 1 0 0 0 0
joe 0 0 1 0 0
boss 1 1 1 1 1
smith 1 1 0 1 0"""
# read the whitespace-separated sample, using the account name as the index
df = pd.read_csv(io.StringIO(data), sep=r'\s+', index_col='account')
df
#Out[46]:
#         2019  2018  2017  2016  2015
#account
#alex        1     0     0     0     0
#joe         0     0     1     0     0
#boss        1     1     1     1     1
#smith       1     1     0     1     0
# find account status per-year
# (columns run from newest to oldest, so x.iloc[i+1] is the previous year)
def account_status(x):
    status = []
    n = x.size
    for i in range(n-1):
        if x.iloc[i] == 1:
            # if all earlier years are '0', this is the first active year
            if x.iloc[i+1:].eq(0).all():
                status.extend(['new'] + [None]*(n-i-2))
                break
            # if the previous year is '0' (but some earlier year was '1')
            elif x.iloc[i+1] == 0:
                status.append('returning')
            else:
                status.append('continuing')
        else:
            # at least one '1' in previous years
            if x.iloc[i+1:].eq(1).any():
                status.append('lost')
            else:
                # never active so far: nothing to report for the remaining years
                status.extend([None] * (n-i-1))
                break
    return status
s = df.apply(account_status, axis=1).apply(pd.Series)
s.columns = df.columns[:-1]
s
#Out[57]:
#               2019        2018        2017        2016
#account
#alex            new        None        None        None
#joe            lost        lost         new        None
#boss     continuing  continuing  continuing  continuing
#smith    continuing   returning        lost         new
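
For comparison, here is a minimal vectorized sketch of the same labelling rules. It is not part of the original answer. It assumes the columns are ordered newest to oldest as in the sample, and the names `active`, `prev`, `before`, and `status` are only illustrative. Each condition is wrapped in its own boolean array and combined with `&`, because `&` binds more tightly than `==` and `!=` in Python, so an unparenthesized expression such as `df2 != df3 & df2 == 1` is not evaluated as two comparisons joined by "and". Unlike the loop above, this sketch uses an empty string rather than None for years with nothing to report.

```python
import io

import numpy as np
import pandas as pd

data = """account 2019 2018 2017 2016 2015
alex 1 0 0 0 0
joe 0 0 1 0 0
boss 1 1 1 1 1
smith 1 1 0 1 0"""
df = pd.read_csv(io.StringIO(data), sep=r"\s+", index_col="account")

# Boolean array (accounts x years); columns run newest -> oldest.
active = df.to_numpy() == 1

# prev[i, j]: account i was active in the year just before column j
# (one column to the right); the oldest year has no previous year.
prev = np.zeros_like(active)
prev[:, :-1] = active[:, 1:]

# before[i, j]: account i was active in at least one strictly earlier year.
# A right-to-left running OR gives "active in this year or earlier";
# shifting it one column left makes it strictly earlier.
this_or_earlier = np.logical_or.accumulate(active[:, ::-1], axis=1)[:, ::-1]
before = np.zeros_like(active)
before[:, :-1] = this_or_earlier[:, 1:]

conditions = [
    active & ~before,         # first active year
    active & before & ~prev,  # active again after a gap
    active & prev,            # active in consecutive years
    ~active & before,         # inactive, but was active earlier
]
choices = ["new", "returning", "continuing", "lost"]
labels = np.select(conditions, choices, default="")

# Drop the oldest year, as the loop-based version does.
status = pd.DataFrame(labels[:, :-1], index=df.index, columns=df.columns[:-1])
print(status)
```

On the sample data this should give the same labels as the loop-based function, apart from the empty strings. The oldest year is dropped because there is no earlier column to compare it against.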