I have a DataFrame:
DF
camp, value
asd_abcd_gr_yxz_aaaa, 5
efgh_kr_ijk, 10
hjssaasd_kr_adsad, 15
asdas_kr_asd, 2
asd_fr_asda_bb_bbbbbbb, 12
adklasdj_gr_asdsad, 3
The real data is much longer.
After matching camp against the elements of a list such as [_gr_, _kr_, _fr_, etc..], I would like the result to be:
DF
camp, value
gr, 8
kr, 27
fr, 12
Preferably as short as possible and without looping over the DF. The actual list is longer than _gr_, _kr_, _fr_.
Thanks in advance!
Answer 0 (score: 3)
You can use str.contains together with label-based indexing (ix in older pandas versions, loc in current ones):
print(df)
camp value
0 abcd_gr_yxz 5
1 efgh_kr_ijk 10
2 hjssaasd_kr_adsad 15
3 asdas_kr_asd 2
4 asd_fr_asda 12
5 adklasdj_gr_asdsad 3
ABR = ['_gr_', '_kr_', '_fr_']
for x in ABR:
    # Overwrite camp with the code it contains (.ix was removed in pandas 1.0; .loc replaces it)
    df.loc[df['camp'].str.contains(x), 'camp'] = x
print(df)
camp value
0 _gr_ 5
1 _kr_ 10
2 _kr_ 15
3 _kr_ 2
4 _fr_ 12
5 _gr_ 3
print(df.groupby('camp')['value'].sum().reset_index())
camp value
0 _fr_ 12
1 _gr_ 8
2 _kr_ 27
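For readers on a current pandas release, the loop above can be sketched as a self-contained script. This is a minimal sketch, assuming pandas 1.0 or later, where loc is used in place of ix. The sample frame is the one printed above; the names codes, mask and result are introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the printed frame above
df = pd.DataFrame({
    'camp': ['abcd_gr_yxz', 'efgh_kr_ijk', 'hjssaasd_kr_adsad',
             'asdas_kr_asd', 'asd_fr_asda', 'adklasdj_gr_asdsad'],
    'value': [5, 10, 15, 2, 12, 3],
})

codes = ['_gr_', '_kr_', '_fr_']
for code in codes:
    # regex=False treats the code as a literal substring, not a pattern
    mask = df['camp'].str.contains(code, regex=False)
    # Replace the full camp string with the code it contains
    df.loc[mask, 'camp'] = code

# Sum value per code; as_index=False keeps camp as a regular column
result = df.groupby('camp', as_index=False)['value'].sum()
print(result)
```

The loop runs once per code rather than once per row, so it is not a loop over the DataFrame itself, but it still makes one pass over the column for every code in the list.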
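Alternatively, str.extract pulls the matching code out of each string in one step, with no Python loop:
ABR = ['_gr_', '_kr_', '_fr_']
# Build one alternation pattern with a single capture group
s = '(' + '|'.join(ABR) + ')'
print(s)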
(_gr_|_kr_|_fr_)
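# Keep only the matched code, e.g. 'efgh_kr_ijk' -> '_kr_'
df['camp'] = df['camp'].str.extract(s, expand=False)
# Sum value per code, keeping camp as a regular column
df = df.groupby('camp', as_index=False)['value'].sum()
# Drop the surrounding underscores: '_kr_' -> 'kr'
df['camp'] = df['camp'].str.strip('_')
print(df)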
camp value
0 fr 12
1 gr 8
2 kr 27
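The str.extract approach above can also be tried end to end on the data from the question. The following is a minimal sketch, not the answer's exact code: it assumes the codes are stored without underscores (the hypothetical list codes), puts the underscores outside the capture group so that no strip step is needed, and uses re.escape in case a code ever contains a regex metacharacter.

```python
import re
import pandas as pd

# Data from the question
df = pd.DataFrame({
    'camp': ['asd_abcd_gr_yxz_aaaa', 'efgh_kr_ijk', 'hjssaasd_kr_adsad',
             'asdas_kr_asd', 'asd_fr_asda_bb_bbbbbbb', 'adklasdj_gr_asdsad'],
    'value': [5, 10, 15, 2, 12, 3],
})

# Assumption for this sketch: codes are listed without the underscores
codes = ['gr', 'kr', 'fr']

# Underscores sit outside the group, so only the bare code is captured
pattern = '_(' + '|'.join(re.escape(c) for c in codes) + ')_'

result = (
    df.assign(camp=df['camp'].str.extract(pattern, expand=False))
      .groupby('camp', as_index=False)['value']
      .sum()
)
print(result)
```

Rows whose camp matches none of the codes get NaN from str.extract, and groupby leaves them out by default, so they do not appear in the result.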