I want to create several columns based on a condition (a key) taken from one existing column.
Here is a snippet of my DataFrame:
Index wave_path
0 wav48/p225/p225_001.wav
. wav48/p227/p227_005.wav.
5
. ......................
. ......................
44040 wav48/p376/p376_265.wav
Now I have a text file with a few columns, keyed by ID (i.e. 225, 227, 376, etc.). The text file contains the following:
ID AGE GENDER ACCENTS REGION
225 23 F English Southern England
226 22 M English Surrey
227 38 M English Cumbria
228 22 F English Southern England
229 23 F English Southern England
230 22 F English Stockton-on-tees
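I want to add these 5 columns to my DataFrame, filling each row with the entries whose key ID matches the ID embedded in the wave_path column.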
from pandas import DataFrame
# Assign the value row by row, matching on the full path string
df.loc[df.wave_path == 'wav48/p225/p225_001.wav', 'AGE'] = '23'
df.loc[df.wave_path == 'wav48/p227/p227_005.wav', 'AGE'] = '38'
print(df)
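However, this would take hundreds of lines of code and be very time-consuming. Is there a better way to do this?
The expected result would be: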
Index wave_path AGE GENDER ACCENT REGION
0 wav48/p225/p225_001.wav 23 F English Southern England
. wav48/p227/p227_005.wav. 38 M English Cumbria
5
. ......................
. ......................
44040 wav48/p376/p376_265.wav
Answer 0 (score: 0)
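First split wave_path and take the ID from its last component (the file name):
wav48/p225/p225_001.wav
-> 225
Then convert it to int so that it has the same dtype as the ID column of the text-file table (df2 below), and merge the two on ID: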
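import pandas as pd
# Take the file name, drop everything after '_', then drop the leading 'p': 'p225_001.wav' -> '225'
df['ID'] = df['wave_path'].apply(lambda x: x.split("/")[-1].split("_")[0].split("p")[-1])
# The merge key must have the same dtype on both sides (df2 is the table loaded from the text file)
df['ID'] = df['ID'].astype(int)
df2['ID'] = df2['ID'].astype(int)
# A left join keeps every row of df and adds the matching columns from df2
final_df = pd.merge(df, df2, on=['ID'], how='left')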
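For completeness, here is a minimal end-to-end sketch of the same idea. It is only an illustration under a few assumptions: the text file is taken to be whitespace-separated with REGION as the last field (so REGION may contain spaces), the helper `read_speaker_info` and the sample string `META_TEXT` are hypothetical names, and `str.extract` is swapped in as a vectorised alternative to the `apply`/`split` chain above.

```python
import io
import pandas as pd

# Hypothetical stand-in for the text file; the real file layout is an assumption.
META_TEXT = """ID AGE GENDER ACCENTS REGION
225 23 F English Southern England
226 22 M English Surrey
227 38 M English Cumbria
"""


def read_speaker_info(handle):
    """Parse whitespace-separated rows where the last field (REGION) may contain spaces."""
    header = handle.readline().split()
    rows = [
        # Split at most len(header) - 1 times so the last field keeps its spaces
        line.strip().split(maxsplit=len(header) - 1)
        for line in handle
        if line.strip()
    ]
    return pd.DataFrame(rows, columns=header)


# Small sample of the wave_path column from the question
df = pd.DataFrame({"wave_path": [
    "wav48/p225/p225_001.wav",
    "wav48/p227/p227_005.wav",
    "wav48/p376/p376_265.wav",
]})

df2 = read_speaker_info(io.StringIO(META_TEXT))
df2 = df2.astype({"ID": int, "AGE": int})

# Capture the digits between '/p' and '_' in the file name, e.g. '/p225_' -> 225
df["ID"] = df["wave_path"].str.extract(r"/p(\d+)_", expand=False).astype(int)

# Left join: every row of df is kept; IDs missing from df2 get NaN in the new columns
final_df = df.merge(df2, on="ID", how="left")
print(final_df)
```

With a real file you would pass an open file handle to the helper instead of `io.StringIO`. Rows whose ID does not appear in the text file (376 in this sample) keep their wave_path and get NaN in the added columns, which is the effect of `how='left'`.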