Create strings from DataFrame values

Time: 2020-10-08 04:49:56

Tags: python python-3.x pandas string dataframe

I have a DataFrame:

segment | percentage_change
segment1 | 25%
segment2 | 30%
segment3 | 40%
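For reproducibility, the table above could be built roughly as follows. This is only a sketch: it assumes the percentages are stored as strings with a trailing % sign, and the variable name df and the column names segment and percentage_change are assumptions.

```python
import pandas as pd

# Sample data mirroring the table above; `df` and the column names are assumptions.
df = pd.DataFrame({
    "segment": ["segment1", "segment2", "segment3"],
    "percentage_change": ["25%", "30%", "40%"],
})
print(df)
```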

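I need to create sentences for the top 3:

"Segment3 has the highest percentage change of 40%"

"Segment2 has the second highest percentage change of 30%"

"Segment1 has the third highest percentage change of 25%"

"Segment1 has 5% more change than segment2"

"Segment2 has 10% more change than segment3"

Each of these sentences should be added as a cell value in a new DataFrame. Thanks for your help!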

1 Answer:

Answer 0 (score: 0)

Use:

# convert the percentage column to numeric
df['num'] = df['percentage_change'].str.rstrip('%').astype(float)
# get the top 3 rows by the numeric column
df1 = df.nlargest(3, 'num')
# difference to the next row as a string; strip only a trailing ".0"
df1['diff'] = df1['num'].diff(-1).fillna(0).astype(str).str.replace(r'\.0*$', '', regex=True) + '%'
# shift the segment column up by one row
df1['diff_seg'] = df1['segment'].shift(-1)
# default index starting from 1
df1 = df1.reset_index(drop=True)
df1.index = df1.index + 1
print(df1)
    segment percentage_change   num diff  diff_seg
1  segment3               40%  40.0  10%  segment2
2  segment2               30%  30.0   5%  segment1
3  segment1               25%  25.0   0%       NaN
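To see what diff(-1) and shift(-1) contribute on their own, here is a small isolated sketch. The Series s and the names gap, labels and next_label are made up for illustration and are not part of the answer's code.

```python
import pandas as pd

# Values already sorted in descending order, as nlargest would return them.
s = pd.Series([40.0, 30.0, 25.0], index=["segment3", "segment2", "segment1"])

# diff(-1) subtracts the NEXT row from the current one;
# the last row has no successor, so its result is NaN.
gap = s.diff(-1)

# shift(-1) moves values up one row, so each row sees its successor's label.
labels = pd.Series(s.index, index=s.index)
next_label = labels.shift(-1)

print(gap)
print(next_label)
```

The last row is NaN in both results, which is why the answer fills the difference with 0 and later skips the last row when building the comparison sentences.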

Then build the new columns with f-strings:

# x.name is the row's index label, i.e. the rank starting from 1
f1 = lambda x: f'{x["segment"].title()} has {x.name}. highest percentage change of {x["percentage_change"]}'
f2 = lambda x: f'{x["diff_seg"].title()} has {x["diff"]} more change than {x["segment"]}'
df1['out'] = df1.apply(f1, axis=1)
# skip the last row, which has no following segment to compare with
df1['out1'] = df1.iloc[:-1].apply(f2, axis=1)
print(df1)
    segment percentage_change   num diff  diff_seg  \
1  segment3               40%  40.0  10%  segment2   
2  segment2               30%  30.0   5%  segment1   
3  segment1               25%  25.0   0%       NaN   

                                                out  \
1  Segment3 has 1. highest percentage change of 40%   
2  Segment2 has 2. highest percentage change of 30%   
3  Segment1 has 3. highest percentage change of 25%   

                                         out1  
1  Segment2 has 10% more change than segment3  
2   Segment1 has 5% more change than segment2  
3                                         NaN
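The question asks for wording such as "second highest" and for the sentences to end up as cell values of a new DataFrame. One possible way to get there is sketched below. The ordinals lookup, the sentences frame and its sentence column are hypothetical names, and the comparison sentences keep the same pairing and wording as the output above.

```python
import pandas as pd

df = pd.DataFrame({
    "segment": ["segment1", "segment2", "segment3"],
    "percentage_change": ["25%", "30%", "40%"],
})

# Numeric copy of the percentage column, then the top 3 rows ranked from 1.
df["num"] = df["percentage_change"].str.rstrip("%").astype(float)
top = df.nlargest(3, "num").reset_index(drop=True)
top.index = top.index + 1

# Hypothetical lookup for the wording requested in the question.
ordinals = {1: "highest", 2: "second highest", 3: "third highest"}

rank_sentences = [
    f"{row['segment'].title()} has the {ordinals[rank]} "
    f"percentage change of {row['percentage_change']}"
    for rank, row in top.iterrows()
]

# Pair each row with the next one; the last row has no successor and is skipped.
pairs = zip(top["segment"], top["segment"].shift(-1), top["num"].diff(-1))
gap_sentences = [
    f"{nxt.title()} has {gap:g}% more change than {cur}"
    for cur, nxt, gap in pairs
    if pd.notna(nxt)
]

# One sentence per cell in a new DataFrame.
sentences = pd.DataFrame({"sentence": rank_sentences + gap_sentences})
print(sentences)
```

The g format specifier drops a trailing .0 from whole numbers while leaving values such as 2.5 intact, so no string replacement is needed here.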