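How to calculate quarter-wise churn and retention rates using Python

Time: 2018-03-19 04:40:31

Tags: python

How can I calculate the quarterly churn rate and retention rate from a date column using Python? I want to group the data by quarter.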

This computes the customer churn count grouped by quarter:

quarterly_churn_yes = out.loc[out['Churn'] == 'Yes'].groupby(out["Date"].dt.quarter).count()
print(quarterly_churn_yes["Churn"])

Date
1    1154
2     114
3      68
4      69
Name: Churn, dtype: int64

This computes the churn rate grouped by quarter:

total_churn = out['Churn'].count()
print(total_churn) 

quarterly_churn_rate = out.groupby(out["Date"].dt.quarter).apply(lambda x: quarterly_churn_yes["Churn"] / total_churn).sum()
print(quarterly_churn_rate)

Date
1    0.862159
2    0.085170
3    0.050803
4    0.051550
dtype: float64
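A note on the figures above: the printed rates add up to more than 1, which suggests that `apply(...).sum()` adds the same `count / total` series once per quarter group. A minimal sketch of two simpler ways to get a per-quarter rate follows. The sample rows are made up for illustration; only the column names `Date` and `Churn` come from the question.

```python
import pandas as pd

# Made-up sample rows; only the column names follow the question.
out = pd.DataFrame({
    "Date": pd.to_datetime([
        "2017-01-15", "2017-02-20", "2017-05-03", "2017-08-09",
        "2017-11-30", "2018-01-05", "2018-04-18", "2018-07-22",
    ]),
    "Churn": ["Yes", "No", "Yes", "No", "Yes", "Yes", "No", "No"],
})

# Boolean flag: True where the customer churned.
churned = out["Churn"].eq("Yes")
quarter = out["Date"].dt.quarter.rename("quarter")

# Churned rows per quarter divided by the total number of rows.
share_of_total = churned.groupby(quarter).sum() / len(out)

# Churned rows per quarter divided by the rows in that same quarter.
rate_within_quarter = churned.groupby(quarter).mean()

print(share_of_total)
print(rate_within_quarter)
```

Which denominator is appropriate depends on how churn rate is defined for the analysis; the question divides by the overall row count, which corresponds to `share_of_total`.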

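With the code above I am trying to find the churn rate per quarter from the date column. I only get quarters 1, 2, 3, 4, but I want the quarter-wise churn rate for each year.

For example, if the data frame has four years such as 2018, 2014, 2017, then: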

2008

1    1154
2     114
3      68
4      69

2014

1    1154
2     114
3      68
4      69

1 Answer:

Answer 0: (score: 1)

I think you need:

df = (out.loc[out['Churn'] == 'Yes']
         .groupby([out["Date"].dt.year,out["Date"].dt.quarter])["Churn"]
         .count()
         .rename_axis(('year','quarter'))
         .reset_index(name='count'))

print(df)
   year  quarter  count
0  2015        1      1
1  2015        2      2
2  2015        3      1
3  2015        4      2
4  2016        1      2
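The question also asks for churn and retention rates, not only counts. The same year and quarter grouping can be extended as sketched below. This assumes the churn rate is the share of rows in a period with `Churn == 'Yes'` and that retention is simply one minus that share; both definitions are assumptions, and the sample rows are illustrative.

```python
import pandas as pd

# Illustrative sample rows using the question's column names.
out = pd.DataFrame({
    "Churn": ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "No", "No"],
    "Date": pd.to_datetime([
        "2015-01-01", "2015-05-01", "2015-07-01", "2015-10-01", "2015-04-01",
        "2015-12-01", "2016-01-01", "2016-02-01", "2015-05-01", "2015-10-01",
    ]),
})

churned = out["Churn"].eq("Yes")
keys = [out["Date"].dt.year.rename("year"),
        out["Date"].dt.quarter.rename("quarter")]

grouped = churned.groupby(keys)
rates = pd.DataFrame({
    "total": grouped.size(),    # all rows in the year/quarter
    "churned": grouped.sum(),   # rows with Churn == 'Yes'
}).reset_index()

# Assumed definitions: churn rate within the period, retention = 1 - churn rate.
rates["churn_rate"] = rates["churned"] / rates["total"]
rates["retention_rate"] = 1 - rates["churn_rate"]

print(rates)
```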

For separate DataFrames it is possible to create a dictionary of DataFrames:

dfs = dict(tuple(out.groupby(out['Date'].dt.year)))
print (dfs)
{2016:   Churn       Date
6   Yes 2016-01-01
7   Yes 2016-02-01, 2015:   Churn       Date
0   Yes 2015-01-01
1   Yes 2015-05-01
2   Yes 2015-07-01
3   Yes 2015-10-01
4   Yes 2015-04-01
5   Yes 2015-12-01
8    No 2015-05-01
9    No 2015-10-01}

print (dfs.keys())
dict_keys([2016, 2015])

print (dfs[2015])
  Churn       Date
0   Yes 2015-01-01
1   Yes 2015-05-01
2   Yes 2015-07-01
3   Yes 2015-10-01
4   Yes 2015-04-01
5   Yes 2015-12-01
8    No 2015-05-01
9    No 2015-10-01

The tenure column looks like this:

out["tenure"].unique()
Out[14]:
array([ 8, 15, 32,  9, 48, 58, 10, 29,  1, 66, 24, 68,  4, 53,  6, 20, 52,
       49, 71,  2, 65, 67, 27, 18, 47, 45, 43, 59, 13, 17, 72, 61, 34, 11,
       35, 69, 63, 30, 19, 39,  3, 46, 54, 36, 12, 41, 50, 40, 28, 44, 51,
       33, 21, 70, 23, 16, 56, 14, 62,  7, 25, 31, 60,  5, 42, 22, 37, 64,
       57, 38, 26, 55])

For example:

 1 to 18 --> 1st range
19 to 36 --> 2nd range
37 to 54 --> 3rd range, and so on

It does not contain months as such; the values appear to run from 1 to 72.

I need to split the tenure column into "ranges".

For example, this column contains the numbers 1 to 72, and I need at most 4 ranges.

quarterly_churn_yes = (out.loc[out['Churn'] == 'Yes']
                          .groupby([out["Date"].dt.year, out["Date"].dt.quarter])
                          .count()
                          .rename_axis(('year', 'quarter')))
print(quarterly_churn_yes["Churn"])

# total_churn is the overall row count from earlier: out['Churn'].count()
quarterly_churn_rate = quarterly_churn_yes["Churn"] / total_churn
print(quarterly_churn_rate)

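Here I found the quarterly churn count, and from that count I then calculated the churn rate as the churn count divided by the total count.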

In the same way, I need to split tenure into 4 ranges and find the churn count for each range.
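One possible way to do this is `pd.cut`, sketched below. The bin edges follow the 1 to 18, 19 to 36, 37 to 54 pattern described above; the fourth range (55 to 72) and the sample rows are assumptions added for illustration.

```python
import pandas as pd

# Made-up sample rows; "tenure" and "Churn" follow the question's column names.
out = pd.DataFrame({
    "tenure": [1, 8, 18, 19, 30, 36, 37, 54, 55, 72],
    "Churn": ["Yes", "No", "Yes", "Yes", "No", "No", "Yes", "No", "Yes", "Yes"],
})

# Right-closed bins: (0, 18], (18, 36], (36, 54], (54, 72].
# The last bin is an assumption extending the pattern in the question.
bins = [0, 18, 36, 54, 72]
labels = ["1-18", "19-36", "37-54", "55-72"]
out["tenure_range"] = pd.cut(out["tenure"], bins=bins, labels=labels)

# Churn count per tenure range; observed=False keeps empty ranges in the result.
churn_by_range = (out.loc[out["Churn"] == "Yes"]
                     .groupby("tenure_range", observed=False)["Churn"]
                     .count())

print(churn_by_range)
```

If equally populated ranges are wanted instead of equal-width ones, `pd.qcut(out["tenure"], 4)` is an alternative worth trying.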