I have a DataFrame with:
segment | percent_change
segment1 | 25%
segment2 | 30%
segment3 | 40%
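I need to create sentences for the top 3:
"Segment3 has the highest percentage change of 40%"
"Segment2 has the second highest percentage change of 30%"
"Segment1 has the third highest percentage change of 25%"
"Segment1 has 5% more change than segment2"
"Segment2 has 10% more change than segment3"
All of these sentences will be added as individual cell values in a new DataFrame. Thanks for your help!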
Answer 0 (score: 0)
Use:
# convert the percentage column to numeric
df['num'] = df['percentage_change'].str.rstrip('%').astype(float)
# get the top 3 rows by the numeric column
df1 = df.nlargest(3, 'num')
# difference to the next-ranked row, converted to a string such as '10%'
# (the regex strips only a trailing '.0'; regex=True is required in recent pandas)
df1['diff'] = df1['num'].diff(-1).fillna(0).astype(str).str.replace(r'\.0+$', '', regex=True) + '%'
# segment name of the next-ranked row
df1['diff_seg'] = df1['segment'].shift(-1)
# reset the index so that it starts at 1 and doubles as the rank
df1 = df1.reset_index(drop=True)
df1.index = df1.index + 1
print(df1)
    segment  percentage_change   num  diff  diff_seg
1  segment3                40%  40.0   10%  segment2
2  segment2                30%  30.0    5%  segment1
3  segment1                25%  25.0    0%       NaN
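The `diff(-1)` call in the code above is the step that produces the gaps: it subtracts the next row from the current one, so the last row has nothing to compare against and comes out as NaN. A minimal self-contained sketch of just that step follows. The DataFrame construction and the column name `percentage_change` are assumptions based on the question's sample data.

```python
import pandas as pd

# Sample data as assumed from the question.
df = pd.DataFrame({
    "segment": ["segment1", "segment2", "segment3"],
    "percentage_change": ["25%", "30%", "40%"],
})

# Strip the trailing '%' and convert to float so the values can be ranked.
num = df["percentage_change"].str.rstrip("%").astype(float)

# Largest three values, in descending order.
top = num.nlargest(3)

# diff(-1): each value minus the value in the following row.
gaps = top.diff(-1)
print(gaps.tolist())
```

This is why the answer calls `fillna(0)` before converting the differences to strings.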
Then use f-strings to format the new columns:
# x.name is the row's index label, which is the rank after the reset above
f1 = lambda x: f'{x["segment"].title()} has {x.name}. highest percentage change of {x["percentage_change"]}'
f2 = lambda x: f'{x["diff_seg"].title()} has {x["diff"]} more change than {x["segment"]}'
df1['out'] = df1.apply(f1, axis=1)
# skip the last row: it has no next segment (diff_seg is NaN)
df1['out1'] = df1.iloc[:-1].apply(f2, axis=1)
print(df1)
    segment  percentage_change   num  diff  diff_seg  \
1  segment3                40%  40.0   10%  segment2
2  segment2                30%  30.0    5%  segment1
3  segment1                25%  25.0    0%       NaN

                                                out  \
1  Segment3 has 1. highest percentage change of 40%
2  Segment2 has 2. highest percentage change of 30%
3  Segment1 has 3. highest percentage change of 25%

                                         out1
1  Segment2 has 10% more change than segment3
2   Segment1 has 5% more change than segment2
3                                         NaN
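The question asks for ordinal words ("highest", "second highest", "third highest") and for every sentence to be its own cell in a new DataFrame. One possible variation on the approach above is sketched below; it is not part of the original answer. The `ORDINALS` lookup, the `sentence` column name and the sample data are hypothetical choices. This sketch also names the larger segment first in the comparison sentences, which assumes that is the intended meaning; `f2` above follows the wording given in the question instead.

```python
import pandas as pd

# Sample data as assumed from the question.
df = pd.DataFrame({
    "segment": ["segment1", "segment2", "segment3"],
    "percentage_change": ["25%", "30%", "40%"],
})

# Hypothetical lookup for the rank wording; extend it for more than three rows.
ORDINALS = {1: "the highest", 2: "the second highest", 3: "the third highest"}

# Add a numeric column, keep the top 3 rows, and renumber them from 0.
top = (
    df.assign(num=df["percentage_change"].str.rstrip("%").astype(float))
      .nlargest(3, "num")
      .reset_index(drop=True)
)

# One ranking sentence per row; enumerate supplies the rank starting at 1.
rank_sentences = [
    f"{row.segment.title()} has {ORDINALS[rank]} percentage change of {row.percentage_change}"
    for rank, row in enumerate(top.itertuples(index=False), start=1)
]

# One comparison sentence per adjacent pair (higher-ranked row first).
# The ':g' format drops a trailing '.0', so 10.0 is rendered as '10'.
gap_sentences = [
    f"{hi.segment.title()} has {hi.num - lo.num:g}% more change than {lo.segment}"
    for hi, lo in zip(top.itertuples(index=False), top.iloc[1:].itertuples(index=False))
]

# New DataFrame with one sentence per cell.
sentences = pd.DataFrame({"sentence": rank_sentences + gap_sentences})
print(sentences.to_string(index=False))
```

Building the sentences in plain list comprehensions avoids the NaN handling that `apply` needs on the last row, because `zip` simply stops at the shorter input.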