How to compute a column that depends on a period condition in pandas

Time: 2017-09-01 14:36:58

Tags: python pandas

I'm not sure what title this should have, but I'm clear about what I want to achieve.

I have the following dataframe:

import pandas as pd

period = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
final_renewal_percentage = [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 1]
first_renewals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
df = pd.DataFrame({'period': period, 'first_renewals': first_renewals, 'final_renewal_percentage': final_renewal_percentage})

I need to compute the following column, renewal_of_renewals:

0    0.0 # this is 0 since period < 4
1    0.0 # this is 0 since period < 4
2    0.0 # this is 0 since period < 4
3    0.0 # this is 0 since period < 4
4    0.5 # this is 1 * 0.5 (first_renewals corresponding to period=0)
5    1.0 # this is 2 * 0.5 (first_renewals corresponding to period=1)
6    1.5 # this is 3 * 0.5 (first_renewals corresponding to period=2)
7    2.0 # this is 4 * 0.5 (first_renewals corresponding to period=3)
8    2.5 # this is 5 * 0.5 (first_renewals corresponding to period=4)
9    6.0 # this is 6 * 1 (first_renewals corresponding to period=5)
Name: renewals_of_renewals, dtype: float64

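In short: if period is < 4, renewals_of_renewals is 0. Otherwise it is the product of first_renewals and final_renewal_percentage, where the first_renewals value is the one belonging to period - 4 (see the comments in the output above for details).

I was able to compute this with a for loop. However, I would like to avoid the for loop, and I don't know how to do that.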

2 Answers:

Answer 0: (score: 2)

I would just do the calculation on the whole dataframe and then set zeros where you want them:

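import numpy as np
# Look up first_renewals at position period - 4, then zero out the rows with period < 4
renewals_of_renewals = np.array(df['first_renewals'])[df['period'] - 4] * df['final_renewal_percentage']
renewals_of_renewals[np.where(df['period'] < 4)[0]] = 0.0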
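
The same idea can also be sketched with Series.shift and Series.where instead of NumPy indexing. This is only a sketch: it assumes the rows are sorted by period with exactly one row per consecutive period, so that shifting by 4 rows is the same as looking back 4 periods. The name shifted is introduced here for illustration.

```python
import pandas as pd

period = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
final_renewal_percentage = [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 1]
first_renewals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
df = pd.DataFrame({'period': period,
                   'first_renewals': first_renewals,
                   'final_renewal_percentage': final_renewal_percentage})

# first_renewals from 4 rows earlier; the first 4 rows become NaN
# (assumption: one row per consecutive period, sorted by period)
shifted = df['first_renewals'].shift(4)

# Multiply, then force 0.0 wherever period < 4 (this also replaces the NaN rows)
df['renewals_of_renewals'] = (
    shifted * df['final_renewal_percentage']
).where(df['period'] >= 4, 0.0)

print(df['renewals_of_renewals'])
```

With the sample data this should reproduce the column requested in the question.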

Answer 1: (score: 1)

You can build another df column from the other columns of each row, like this:

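def transform_function(row):
    # Periods before 4 have no renewals of renewals
    if row['period'] < 4:
        return 0.0
    # Use first_renewals from 4 periods earlier (assumes period equals the row position)
    earlier = df['first_renewals'].iloc[int(row['period']) - 4]
    return earlier * row['final_renewal_percentage']


df['renewal_of_renewals'] = df.apply(transform_function, axis=1)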
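
A row-wise apply is still effectively a loop over the rows. As a hedged alternative, here is a sketch of a vectorized lookup that pairs each row with the first_renewals of period - 4 by value rather than by row position. It assumes the period values are unique; the names by_period and looked_up are hypothetical and introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'period': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    'first_renewals': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'final_renewal_percentage': [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 1],
})

# Lookup table: period -> first_renewals (assumption: period values are unique)
by_period = df.set_index('period')['first_renewals']

# first_renewals belonging to (period - 4); periods without a match give NaN
looked_up = (df['period'] - 4).map(by_period)

# Multiply, then set 0.0 for every row with period < 4
df['renewal_of_renewals'] = (
    looked_up * df['final_renewal_percentage']
).where(df['period'] >= 4, 0.0)

print(df['renewal_of_renewals'])
```

Because the lookup goes through the period values, it does not depend on the rows being sorted. A row with period >= 4 whose period - 4 is missing from the data would stay NaN rather than silently picking up a neighbouring row.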