I want to find the last value for a client in a DataFrame. How can I do that?
Example:
import pandas as pd

df = pd.DataFrame({
    'date': ['2018-06-13', '2018-06-14', '2018-06-15', '2018-06-16'],
    'gain': [[10, 12, 15], [14, 11, 15], [9, 10, 12], [6, 4, 2]],
    'how': [['customer1', 'customer2', 'customer3'],
            ['customer4', 'customer5', 'customer6'],
            ['customer7', 'customer8', 'customer9'],
            ['customer5', 'customer6', 'customer10']]})
df:
         date          gain                                 how
0  2018-06-13  [10, 12, 15]   [customer1, customer2, customer3]
1  2018-06-14  [14, 11, 15]   [customer4, customer5, customer6]
2  2018-06-15   [9, 10, 12]   [customer7, customer8, customer9]
3  2018-06-16     [6, 4, 2]  [customer5, customer6, customer10]
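Thanks a lot.
Answer 0 (score: 4)
First flatten the list columns with the unnesting function (defined at the end of this answer), then use drop_duplicates with keep='last' to keep only the last row for each client: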
newdf=unnesting(df,['gain','how']).drop_duplicates('how',keep='last')
newdf
Out[25]:
   gain         how        date
0    10   customer1  2018-06-13
0    12   customer2  2018-06-13
0    15   customer3  2018-06-13
1    14   customer4  2018-06-14
2     9   customer7  2018-06-15
2    10   customer8  2018-06-15
2    12   customer9  2018-06-15
3     6   customer5  2018-06-16
3     4   customer6  2018-06-16
3     2  customer10  2018-06-16
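To see what keep='last' does in isolation, here is a minimal sketch on a small, already flattened frame. The frame `flat` and its values are made up for illustration (taken from the customer5 and customer6 rows above), and it assumes the rows are already in chronological order:

```python
import pandas as pd

# Hypothetical flat frame: one row per (date, client) pair, oldest first.
flat = pd.DataFrame({
    'date': ['2018-06-14', '2018-06-14', '2018-06-16', '2018-06-16'],
    'gain': [11, 15, 6, 4],
    'how': ['customer5', 'customer6', 'customer5', 'customer6'],
})

# keep='last' retains the final occurrence of each client in row order,
# so the 2018-06-14 rows are dropped in favour of the 2018-06-16 ones.
last_rows = flat.drop_duplicates('how', keep='last')
print(last_rows)
```

Because "last" means last in row order, not latest by date, it is probably worth sorting by 'date' first if the frame is not guaranteed to be ordered.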
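Then use reindex to look up the clients you are interested in; clients that do not appear in the data are filled with fill_value: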
l=['customer5','customer6','customer20']
newdf.loc[newdf.how.isin(l)].set_index('how').reindex(l,fill_value='not_find')
Out[34]:
                gain        date
how
customer5          6  2018-06-16
customer6          4  2018-06-16
customer20  not_find    not_find
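Filling with the string 'not_find' turns the gain column into mixed types. If you need to keep working with the numbers, one possible variant is to omit fill_value so that missing clients become NaN, then test for them explicitly. This is a sketch, not part of the original answer; the names `last_rows`, `wanted` and `found` are made up here:

```python
import pandas as pd

# Hypothetical result of the drop_duplicates step, reduced to two clients.
last_rows = pd.DataFrame({
    'how': ['customer5', 'customer6'],
    'gain': [6, 4],
    'date': ['2018-06-16', '2018-06-16'],
})

wanted = ['customer5', 'customer6', 'customer20']

# Without fill_value, labels missing from the index get NaN in every column.
result = last_rows.set_index('how').reindex(wanted)

# Boolean mask telling which of the requested clients were actually found.
found = result['gain'].notna()
print(result)
print(found)
```

Note that reindex requires the index to be unique, which holds here because drop_duplicates leaves one row per client.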
Interesting reading for this type of question:
How do I unnest a column in a pandas DataFrame?
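import numpy as np

def unnesting(df, explode):
    # Repeat each original index label once per element of its list
    idx = df.index.repeat(df[explode[0]].str.len())
    # Flatten every list column and place the flattened columns side by side
    df1 = pd.concat([pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode], axis=1)
    df1.index = idx
    # Reattach the remaining columns (here 'date') by index label
    return df1.join(df.drop(columns=explode), how='left')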
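On pandas 1.3 or later, the built-in DataFrame.explode accepts a list of columns, so the helper above may not be needed. A minimal sketch of that alternative, assuming the lists in 'gain' and 'how' have the same length in every row; the `last_gain` dictionary at the end is only for illustration and is not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['2018-06-13', '2018-06-14', '2018-06-15', '2018-06-16'],
    'gain': [[10, 12, 15], [14, 11, 15], [9, 10, 12], [6, 4, 2]],
    'how': [['customer1', 'customer2', 'customer3'],
            ['customer4', 'customer5', 'customer6'],
            ['customer7', 'customer8', 'customer9'],
            ['customer5', 'customer6', 'customer10']]})

# Multi-column explode (pandas >= 1.3): one row per (gain, client) pair.
flat = df.explode(['gain', 'how'])

# Keep only the last row seen for each client.
newdf = flat.drop_duplicates('how', keep='last')

# Illustrative lookup table: client -> last gain.
last_gain = newdf.set_index('how')['gain'].to_dict()
print(last_gain)
```

The exploded columns come back with object dtype, so you may want to convert 'gain' with astype(int) before doing arithmetic on it.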