Sample DataFrame:
CountryName
India|Pakistan
Pakistan|Agansitan
Sweden
Nepal|Bhutan
Desired output DataFrame, with a new column:
CountryName         MainCountry
India|Pakistan      India
Pakistan|Agansitan  Pakistan
Sweden              Sweden
Nepal|Bhutan        Nepal
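I tried:
df["MainCountry"] =df['CountryName'].str.contains("[|].*","")
but this only gives True or False values. Could you help me understand how to get the country name before the "|" instead?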
Answer 0 (score: 3)
You can split on '|' and take the first element:
In [87]: df['MainCountry'] = df['CountryName'].str.split('|').str[0]
In [88]: df
Out[88]:
          CountryName MainCountry
0      India|Pakistan       India
1  Pakistan|Agansitan    Pakistan
2              Sweden      Sweden
3        Nepal|Bhutan       Nepal
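For anyone who wants to reproduce this from scratch, here is a minimal, self-contained sketch. The DataFrame is rebuilt from the sample data in the question; the variable name `has_pipe` is only illustrative. It also shows why the attempt in the question produced booleans: `str.contains` tests for a match and does not return the matched text.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "CountryName": ["India|Pakistan", "Pakistan|Agansitan", "Sweden", "Nepal|Bhutan"]
})

# str.contains answers "does this row contain the pattern?", so the result is boolean.
has_pipe = df["CountryName"].str.contains("|", regex=False)

# str.split returns a list per row; .str[0] picks the first element of each list.
df["MainCountry"] = df["CountryName"].str.split("|").str[0]

print(has_pipe.tolist())
print(df)
```

Rows without a "|" split into a one-element list, so `.str[0]` returns the original value unchanged.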
Answer 1 (score: 3)
Use str.extract (expand=False makes it return a Series rather than a one-column DataFrame):
df.assign(MainCountry=df.CountryName.str.extract(r'(.*?)(?:\||$)', expand=False))
          CountryName MainCountry
0      India|Pakistan       India
1  Pakistan|Agansitan    Pakistan
2              Sweden      Sweden
3        Nepal|Bhutan       Nepal
Or str.partition:
df.assign(MainCountry=df.CountryName.str.partition('|')[0])
          CountryName MainCountry
0      India|Pakistan       India
1  Pakistan|Agansitan    Pakistan
2              Sweden      Sweden
3        Nepal|Bhutan       Nepal
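To make the shapes involved a bit more concrete, here is a small sketch on a two-element Series (the names `s`, `parts`, and `extracted` are illustrative only). By default `str.partition` returns a DataFrame with three columns (the text before the separator, the separator itself, and the rest), which is why `[0]` selects the part we want. `str.extract` with a single capture group returns a Series when `expand=False` is passed.

```python
import pandas as pd

s = pd.Series(["India|Pakistan", "Sweden"])

# partition splits at the FIRST '|' only and always yields three columns: 0, 1, 2.
parts = s.str.partition("|")
main_from_partition = parts[0]

# extract returns the first capture group; expand=False gives a Series.
extracted = s.str.extract(r"(.*?)(?:\||$)", expand=False)

print(parts)
print(extracted.tolist())
```

When no separator is present, as for "Sweden", columns 1 and 2 are empty strings and column 0 holds the whole value.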
Answer 2 (score: 2)
Use str.split and str.get:
df.CountryName.str.split('|').str.get(0)
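The expression above returns a new Series, so it still has to be assigned to a column. A short sketch of that, assuming a small Series with one missing value to show how gaps are handled (the names `s` and `first` are illustrative). Passing `n=1` is optional; it just stops splitting after the first "|", which is all that is needed here.

```python
import pandas as pd

s = pd.Series(["India|Pakistan", "Sweden", None])

# n=1 limits the split to the first separator; .str.get(0) takes the first piece.
first = s.str.split("|", n=1).str.get(0)

# Missing inputs stay missing in the result instead of raising an error.
print(first)
```

In the question's DataFrame this would be written as `df['MainCountry'] = df.CountryName.str.split('|').str.get(0)`.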
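Answer 3 (score: 0)
Use np.where:
import numpy as np

# regex=False: treat '|' as a literal character, not the regex alternation operator
df['Main_Country'] = np.where(df['CountryName'].str.contains('|', regex=False),
                              df['CountryName'].str.split('|').str[0],
                              df['CountryName'])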
Output:
          CountryName Main_Country
0      India|Pakistan        India
1  Pakistan|Agansitan     Pakistan
2              Sweden       Sweden
3        Nepal|Bhutan        Nepal
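A note on `str.contains` with "|": the pattern is interpreted as a regular expression by default, and a bare "|" is an alternation between two empty patterns, which matches every string. The sketch below (variable names are illustrative) compares the default behaviour with a literal match and an escaped pattern.

```python
import pandas as pd

s = pd.Series(["India|Pakistan", "Sweden"])

# Default regex=True: '|' means "empty OR empty", so every row matches.
as_regex = s.str.contains("|")

# Two ways to test for a literal pipe character.
as_literal = s.str.contains("|", regex=False)
escaped = s.str.contains(r"\|")

print(as_regex.tolist())    # [True, True]
print(as_literal.tolist())  # [True, False]
print(escaped.tolist())     # [True, False]
```

With an always-true condition, `np.where` would still produce the right column here, because splitting a value that has no "|" gives back the whole string. That also means the `np.where` wrapper is not strictly needed for this task; the plain `str.split('|').str[0]` does the same job.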