Return the first word from a list that appears in each row of a dataframe column

Time: 2017-08-29 19:16:53

Tags: python pandas dataframe

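I have a list of strings. For each row of a dataframe column, how can I find the first string from that list that appears in the row and add it to a new column?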

Here is the list:

Place = ['Abule-Egba', 'Agege', 'Alapere', 'Alimosho', 
         'Ajah', 'Amuwo-Odofin', 'Apapa', 'Bariga', 'Badagry', 
         'Epe', 'Ejigbo', 'Gbagada', 'Iddo-Island', 'Idimu', 'Igando', 
         'Ijora', 'Ikeja', 'isherri','Lekki', 'Ojo'] 

And here is the dataframe column Address, which has 9784 rows:

0       Eleranigbe Eleranigbe Eleranigbe Ibeju Lekki L...
1             Opebi street opebi street Opebi Ikeja Lagos
2                          VI Lagos VI Extension VI Lagos
3               off afrika lane Lekki Phase 1 Lekki Lagos
4           NEAR IGANDO B/STOP Igando Ikotun Igando Lagos
5       Tijani Salako off Bode Shodiya street Bucknor ...
6       Fatade street, off Isheri/ Ijegun Rd, Kuduyeib...
7       Shodimu street by K& S B/stop, Abaranje Abaran...
8                 Banana island Banana Island Ikoyi Lagos
9               Oral Estate Oral Estate Ikota Lekki Lagos
10                         Ajah Ajah Sangotedo Ajah Lagos
11                Lekki Phase 1 Lekki Phase 1 Lekki Lagos
12       Jakande Pinnock Beach estate Jakande Lekki Lagos
13      opic estate isheri lagos opic Isheri North Ojo...
14                          ELEKO Eleko Ibeju Lekki Lagos
15                            chevron Chevron Lekki Lagos

I am trying to create a new column like this:

                         1                                           2
0       Eleranigbe Eleranigbe Eleranigbe Ibeju Lekki L...          Lekki
1             Opebi street opebi street Opebi Ikeja Lagos          Ikeja
2                          VI Lagos VI Extension VI Lagos            VI
3               off afrika lane Lekki Phase 1 Lekki Lagos          Lekki
4           NEAR IGANDO B/STOP Igando Ikotun Igando Lagos          Igando
5       Tijani Salako off Bode Shodiya street Bucknor ...          Ikoyi
6       Fatade street, off Isheri/ Ijegun Rd, Kuduyeib...          Isheri

Here is my code, but I get an error: ValueError: Length of values does not match length of index

s['where'] =''
de = []
for i in s['Address']:
    for j in Place:
        if j in i:
            de.append(j)
            break

I think my code is wrong, but I can't for the life of me figure out why.
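
One likely cause of the ValueError, assuming the list de is later assigned to s['where'] (that assignment is not shown above): the loop appends only when a place is found, so rows with no match contribute nothing and de ends up shorter than s. A minimal sketch that keeps the lengths equal with a for/else clause follows. The three-row frame and the shortened Place list are made-up sample data, not the real column.

```python
import pandas as pd

# Shortened list and hypothetical sample rows, for illustration only.
Place = ['Ikeja', 'Lekki', 'Igando']
s = pd.DataFrame({'Address': [
    'Opebi street opebi street Opebi Ikeja Lagos',
    'VI Lagos VI Extension VI Lagos',
    'off afrika lane Lekki Phase 1 Lekki Lagos',
]})

de = []
for address in s['Address']:
    for place in Place:
        if place in address:
            de.append(place)
            break
    else:
        # The inner loop finished without a break, so no place matched.
        # Append a placeholder so de stays exactly as long as s.
        de.append(None)

s['where'] = de  # one entry per row, so the lengths agree
```

Note that this returns the first entry in Place order that occurs anywhere in the address, which is not necessarily the entry that occurs earliest in the address string.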

1 answer:

Answer 0: (score: 0)

I think this should be possible with Series.str.extract:

df['Places'] = df.iloc[:, 0].str.extract('(' + '|'.join(Place) + ')', expand=False)
df.head()
                                                   1  Places
0  Eleranigbe Eleranigbe Eleranigbe Ibeju Lekki L...   Lekki
1        Opebi street opebi street Opebi Ikeja Lagos   Ikeja
2                     VI Lagos VI Extension VI Lagos     NaN
3          off afrika lane Lekki Phase 1 Lekki Lagos   Lekki
4      NEAR IGANDO B/STOP Igando Ikotun Igando Lagos  Igando
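
A possible refinement, sketched under assumptions rather than taken from the original answer: escaping each name with re.escape guards against regex metacharacters in the list, and flags=re.IGNORECASE lets an entry such as 'Igando' match 'IGANDO'. The column name Address, the shortened Place list, and the sample rows are placeholders chosen for illustration.

```python
import re
import pandas as pd

# Shortened list and hypothetical sample rows, for illustration only.
Place = ['Abule-Egba', 'Ikeja', 'Lekki', 'Igando']
df = pd.DataFrame({'Address': [
    'Opebi street opebi street Opebi Ikeja Lagos',
    'NEAR IGANDO B/STOP Igando Ikotun Igando Lagos',
    'VI Lagos VI Extension VI Lagos',
]})

# Escape each name so it is matched literally, then join the names
# into a single capture group: (Abule\-Egba|Ikeja|Lekki|Igando)
pattern = '(' + '|'.join(re.escape(p) for p in Place) + ')'

# extract returns the first match in each string, or NaN when nothing matches.
df['Places'] = df['Address'].str.extract(pattern, flags=re.IGNORECASE, expand=False)
```

With a case-insensitive pattern the extracted text keeps the casing used in the address (here 'IGANDO' for the second row), so it may need normalizing afterwards. Because the regex scans the string from left to right, this picks the place that occurs earliest in the address, regardless of its position in Place.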