I need to process the following DataFrame, df:
Name                City
Hat, Richards       Paris
Adams               New york
Tim, Mathews        Sanfrancisco
chris, Moya De      Las Vegas
kate, Moris         Atlanta
Grisham HA          Middleton
James, Tom, greval  Rome
The DataFrame I expect is:
Name        Last_name   City
Hat         Richards    Paris
            Adams       New york
Tim         Mathews     Sanfrancisco
chris       Moya De     Las Vegas
kate        Moris       Atlanta
            Grisham HA  Middleton
James, Tom  greval      Rome
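The split should happen at the last ','. If a name contains no comma, the whole word or phrase should go into the 'Last_name' column and the 'Name' column should stay blank.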
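To make the rule concrete, here is a minimal sketch that rebuilds the sample data and applies the rule row by row in plain Python. The helper name split_last_comma and the variable expected are made up for illustration; the sketch assumes the separator is always a comma followed by one space.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Name": ["Hat, Richards", "Adams", "Tim, Mathews", "chris, Moya De",
             "kate, Moris", "Grisham HA", "James, Tom, greval"],
    "City": ["Paris", "New york", "Sanfrancisco", "Las Vegas",
             "Atlanta", "Middleton", "Rome"],
})

def split_last_comma(full_name):
    """Hypothetical helper: split at the last ', '; no comma gives a blank first part."""
    first, _sep, last = full_name.rpartition(", ")
    return first, last

# Build the expected layout: Name, Last_name, City
pairs = [split_last_comma(name) for name in df["Name"]]
expected = pd.DataFrame(pairs, columns=["Name", "Last_name"])
expected["City"] = df["City"]
print(expected)
```

str.rpartition returns an empty first part when the separator is absent, which is exactly the "blank Name" case. The vectorized answers below aim for the same result without a Python-level loop.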
Answer 0: (score: 4)
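Use str.split with n=1 (the default is n=-1, which splits at every separator; change it to whatever you need). ffill(axis=1) then copies a comma-less name into the second column, and the first column is blanked wherever the two columns match: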
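# Split each name at its first ', ', then fill row-wise so a comma-less name lands in column 1
newdf = df.Name.str.split(', ', expand=True, n=1).ffill(axis=1)
# Equal columns mean the name had no comma: blank the first-name column
newdf.loc[newdf[0] == newdf[1], 0] = ''
newdf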
Out[923]:
       0           1
0    Hat    Richards
1              Adams
2    Tim     Mathews
3  chris     Moya De
4   kate       Moris
5         Grisham HA
df[['Name','LastName']]=newdf
df
Out[925]:
    Name          City    LastName
0    Hat         Paris    Richards
1             New york       Adams
2    Tim  Sanfrancisco     Mathews
3  chris     Las Vegas     Moya De
4   kate       Atlanta       Moris
5            Middleton  Grisham HA
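The n argument decides how many splits are made, and split counts from the left while rsplit counts from the right. The sketch below compares the three options on a small, made-up subset of the question's names, then applies the same fill-and-blank steps to the rsplit result so that a name with two commas is cut at its last one. The variable names are illustrative only.

```python
import pandas as pd

names = pd.Series(["Hat, Richards", "Adams", "James, Tom, greval"])

first_comma = names.str.split(", ", n=1, expand=True)   # split at the first ', '
every_comma = names.str.split(", ", expand=True)        # default n=-1: split at every ', '
last_comma = names.str.rsplit(", ", n=1, expand=True)   # split at the last ', '

# Same fill-and-blank steps as in the answer, applied to the rsplit result
fixed = last_comma.ffill(axis=1)
fixed.loc[fixed[0] == fixed[1], 0] = ""
print(fixed)
```

With n=1 and split, 'James, Tom, greval' becomes 'James' and 'Tom, greval'; with rsplit it becomes 'James, Tom' and 'greval', which is what the question asks for. One caveat of the equality check: a name whose two parts are identical would also have its first part blanked.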
Answer 1: (score: 4)
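Use str.rsplit together with radd, which prepends ', ' to every name so that each one contains a separator, and finally remove that prefix with str.lstrip: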
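# Prepend ', ' so every name has a separator, then split at the last ', '
df[['first','last']] = df['Name'].radd(', ').str.rsplit(', ', n=1, expand=True)
# Drop the leading ', ' left on the first-name part
df['first'] = df['first'].str.lstrip(', ')
print(df)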
                 Name          City       first        last
0       Hat, Richards         Paris         Hat    Richards
1               Adams      New york                   Adams
2        Tim, Mathews  Sanfrancisco         Tim     Mathews
3      chris, Moya De     Las Vegas       chris     Moya De
4         kate, Moris       Atlanta        kate       Moris
5          Grisham HA     Middleton              Grisham HA
6  James, Tom, greval          Rome  James, Tom      greval
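It may help to look at the intermediate values of this approach. The sketch below runs the same three calls on a small, made-up subset of the names and keeps each step in its own variable; the variable names are illustrative only.

```python
import pandas as pd

names = pd.Series(["Hat, Richards", "Adams", "James, Tom, greval"])

prefixed = names.radd(", ")                          # ', ' + name for every row
parts = prefixed.str.rsplit(", ", n=1, expand=True)  # always two columns now
first = parts[0].str.lstrip(", ")                    # strip leading ',' and ' ' characters
last = parts[1]

print(prefixed.tolist())
print(parts)
print(first.tolist(), last.tolist())
```

Because every prefixed name contains at least one ', ', rsplit always returns two parts, so no filling of missing values is needed. Note that lstrip(', ') treats its argument as a set of characters, so it would also remove any commas or spaces that a name itself starts with.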
Answer 2: (score: 4)
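Use pandas str.split together with str[::-1], which reverses the parts of each row so that the last name comes first. Rows without a comma get NaN in 'Name'. This assumes at most one ', ' per name; a name with two commas would split into three parts, and the two-column assignment would fail.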
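import pandas as pd
# Split, reverse each list so the last name comes first, then expand into columns
df[['Last_name', 'Name']] = df.Name.str.split(', ').str[::-1].apply(pd.Series)
df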
    Name          City   Last_name
0    Hat         Paris    Richards
1    NaN      New york       Adams
2    Tim  Sanfrancisco     Mathews
3  chris     Las Vegas     Moya De
4   kate       Atlanta       Moris
5    NaN     Middleton  Grisham HA
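A possible variation on the reversing idea, sketched here as an assumption rather than as part of the original answer: use str.rsplit with n=1 so that each row yields at most two parts, pick the parts by position, and replace NaN with an empty string to match the blank 'Name' requested in the question. The data is a small, made-up subset of the question's rows.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Hat, Richards", "Adams", "James, Tom, greval"],
    "City": ["Paris", "New york", "Rome"],
})

# rsplit with n=1 yields at most two parts; reversing puts the last name first
reversed_parts = df["Name"].str.rsplit(", ", n=1).str[::-1]
df["Last_name"] = reversed_parts.str[0]
df["Name"] = reversed_parts.str[1].fillna("")   # no comma: blank instead of NaN
df = df[["Name", "Last_name", "City"]]          # column order from the question
print(df)
```

Indexing with .str[1] returns NaN when a row has only one part, which is why fillna("") is enough to produce the blank names.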