Optimizing pandas iteration

Time: 2020-04-02 20:29:31

Tags: python-3.x pandas optimization iteration


I created the "Customer Lost/Retained" column with iterrows(), using the following logic.

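If a customer appears again in the next consecutive year, the customer is retained for that row; otherwise the customer is lost.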

Customer  Year   Customer Lost/Retained
A         2009          Retained
A         2010          Retained   
A         2011          Lost
B         2008          Lost
C         2008          Retained
C         2009          lost

Can this code be optimized further?

3 answers:

Answer 0: (score: 2)

function[P,V,T]=vanderWall(Pressure,T1,T2,nT,V1,V2,nV,varargin)

%Volume and temperature generated by function
T = T1:nT:T2;
V = V1:nV:V2;

P=Pressure(V,T.')';

%mesh contour plot between T,V and P
meshc(T,V,P)

%plot of V and P (isotherms)
plot(V,P)
end

Answer 1: (score: 1)

You can merge the DataFrame with itself, with the year on one side shifted by one:

In [83]: df['retained'] = pd.notnull(df.merge(
    ...:     df,
    ...:     how="left",
    ...:     left_on=["Customer", "Year"],
    ...:     right_on=["Customer", df["Year"].sub(1)],
    ...:     suffixes=['', "_match"]
    ...: )["Year_match"]).map({True: 'Retained', False: 'Lost'})

In [84]: df
Out[84]:
  Customer  Year Customer Lost/Retained  retained
0        A  2009               Retained  Retained
1        A  2010               Retained  Retained
2        A  2011                   Lost      Lost
3        B  2008                   Lost      Lost
4        C  2008               Retained  Retained
5        C  2009                   lost      Lost
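
For readers who want to run the idea outside an IPython session, here is a minimal self-contained sketch of the same self-merge. It is a variation, not the answer's exact code: it shifts the year in an explicit helper frame (the name `next_year` is an assumption introduced here) and uses the `indicator=True` option of `merge` to detect matches, and the sample data is rebuilt from the table in the question.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "Customer": ["A", "A", "A", "B", "C", "C"],
    "Year": [2009, 2010, 2011, 2008, 2008, 2009],
})

# Helper frame (hypothetical name): every (Customer, Year) pair moved back one year,
# so a row for year Y+1 lines up with the same customer's row for year Y.
next_year = df[["Customer", "Year"]].drop_duplicates()
next_year["Year"] = next_year["Year"] - 1

# A left merge keeps every original row in order; "_merge" records whether a match was found.
merged = df.merge(next_year, on=["Customer", "Year"], how="left", indicator=True)

# "both" means the customer also has a row for the following year.
df["retained"] = (
    (merged["_merge"] == "both")
    .map({True: "Retained", False: "Lost"})
    .to_numpy()  # assign by position, independent of df's index
)

print(df)
```

Dropping duplicates in the helper frame keeps the left merge from multiplying rows if a customer appears more than once in the same year.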

Answer 2: (score: 0)

First, add the column with every row set to 'Retained':

df['Customer Lost/Retained'] = 'Retained'

Then the row holding each customer's highest year gets the value 'Lost':

mask = df.groupby('Customer')['Year'].idxmax()
df.loc[mask, 'Customer Lost/Retained'] = 'Lost'
  Customer  Year Customer Lost/Retained
0        A  2009               Retained
1        A  2010               Retained
2        A  2011                   Lost
3        B  2008                   Lost
4        C  2008               Retained
5        C  2009                   Lost

Alternatively, set 'Lost' first and then fill the remaining rows with .fillna():

df.loc[df.groupby('Customer')['Year'].idxmax(), 'Customer Lost/Retained'] = 'Lost'
df['Customer Lost/Retained'] = df['Customer Lost/Retained'].fillna('Retained')
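
Note that the idxmax() approach marks only each customer's latest year as 'Lost', so it matches the consecutive-year rule only when a customer has no gap years. The following is a minimal sketch of a different technique, groupby with shift, that compares each row with the same customer's next recorded year. It assumes one row per customer per year; the names `next_year` and `is_retained` are introduced here for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "Customer": ["A", "A", "A", "B", "C", "C"],
    "Year": [2009, 2010, 2011, 2008, 2008, 2009],
})

# Sort so that shift(-1) inside each group looks at that customer's next recorded year.
df = df.sort_values(["Customer", "Year"]).reset_index(drop=True)
next_year = df.groupby("Customer")["Year"].shift(-1)  # NaN for a customer's last row

# Retained only when the next recorded year is exactly Year + 1;
# a NaN (no later row) or a gap compares as False.
is_retained = next_year.eq(df["Year"] + 1)
df["Customer Lost/Retained"] = is_retained.map({True: "Retained", False: "Lost"})

print(df)
```

With a gap such as a customer seen in 2009 and 2011 only, this sketch labels both rows 'Lost', whereas the idxmax() version would label the 2009 row 'Retained'.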