Question

I have a DataFrame in which one column contains street intersections:

|          Locations           |
--------------------------------
|W Madison Ave & S Randall Blvd|
|N Clemson St & E Tower Ave    |
|E Thompson St & S Garfield Ln |

I want to remove the directional characters (N, S, E, W) and the street suffixes (Blvd, St, Ave, etc.) so that the output looks like this:

|     Locations     |
---------------------
|Madison & Randall  |
|Clemson & Tower    |
|Thompson & Garfield|

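I can't use str.replace(), because it would also remove those characters from the words I need to keep. I tried lstrip() and rstrip(), but they don't help with characters I want to remove from the middle of the string.

I also tried Series.apply():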

banned = ['N', 'S', 'E', 'W', 'Ave', 'Blvd', 'St', 'Ln']
df["Locations"].apply(lambda x: [item for item in x if item not in banned])

But this effectively behaves like a str.replace() and puts everything into lists inside the DataFrame.

Answer 1

You're close: split each value into words first, filter them, and then join them back together:

f = lambda x: ' '.join([item for item in x.split() if item not in banned])
df["Locations"] = df["Locations"].apply(f)

Or with a list comprehension:

df["Locations"] = [' '.join([item for item in x.split() 
                  if item not in banned]) 
                  for x in df["Locations"]]


print (df)
             Locations
0    Madison & Randall
1      Clemson & Tower
2  Thompson & Garfield
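
For reference, here is a self-contained sketch of the split/filter/join approach that rebuilds the sample frame from the question. The helper name `strip_banned` is hypothetical, and a set is used instead of a list only to make the membership check cheaper:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"Locations": [
    "W Madison Ave & S Randall Blvd",
    "N Clemson St & E Tower Ave",
    "E Thompson St & S Garfield Ln",
]})

banned = {"N", "S", "E", "W", "Ave", "Blvd", "St", "Ln"}

def strip_banned(text):
    # text.split() yields whole words, so only exact word matches are dropped
    return " ".join(word for word in text.split() if word not in banned)

df["Locations"] = df["Locations"].apply(strip_banned)
print(df)
```

Because the comparison is made word by word, a name such as "Stone" is left alone, but a street that is literally named "E" or "St" would be dropped as well.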

Answer 2

Perhaps use replace, as you mentioned:

df.replace(dict(zip(banned,['']*len(banned))),regex=True)
Out[54]: 
                      Locations           
0           Madison  &  Randall 
1            Clemson t &  Tower     
2        Thompson t &  Garfield
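
In the output above, "St" is left behind as a stray "t", apparently because the 'S' entry is replaced before 'St' gets a chance to match. One possible way around this, sketched here under the assumption that every banned entry should only match as a whole word, is to combine the list into a single pattern wrapped in `\b` word boundaries and then tidy up the leftover spaces. The names `pattern` and `cleaned` are illustrative:

```python
import re
import pandas as pd

df = pd.DataFrame({"Locations": [
    "W Madison Ave & S Randall Blvd",
    "N Clemson St & E Tower Ave",
    "E Thompson St & S Garfield Ln",
]})
banned = ['N', 'S', 'E', 'W', 'Ave', 'Blvd', 'St', 'Ln']

# \b anchors each alternative to a whole word, so 'S' cannot consume part of 'St'
pattern = r"\b(?:" + "|".join(re.escape(word) for word in banned) + r")\b"

cleaned = (
    df["Locations"]
    .str.replace(pattern, "", regex=True)   # drop banned whole words
    .str.replace(r"\s+", " ", regex=True)   # collapse leftover runs of whitespace
    .str.strip()                            # trim leading/trailing spaces
)
print(cleaned.tolist())
```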

Answer 3

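As an alternative to removing the unwanted words, you can select the words you do want. Since the sample rows all follow the same pattern, it looks like you want to pick the second and sixth words and use them to name the location. That would look like this: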

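df['new_location'] = ''

# Assign through df.loc; chained assignment (df.new_location.iloc[i] = ...) may write to a copy
for idx, location in df['Locations'].items():
    words = location.split(' ')
    # 2nd and 6th words, e.g. 'Madison' and 'Randall'
    df.loc[idx, 'new_location'] = words[1] + ' & ' + words[5]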
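
The same positional selection can probably be written without an explicit loop by using the `.str` accessor. This is only a sketch and assumes every row has exactly the shape "direction name suffix & direction name suffix"; a two-word street name would shift the positions and give wrong results. The variable name `words` is illustrative:

```python
import pandas as pd

df = pd.DataFrame({"Locations": [
    "W Madison Ave & S Randall Blvd",
    "N Clemson St & E Tower Ave",
    "E Thompson St & S Garfield Ln",
]})

# Split every row on whitespace; each cell becomes a list of words
words = df["Locations"].str.split()

# .str[i] picks the i-th element of each list: index 1 and 5 are the street names
df["new_location"] = words.str[1] + " & " + words.str[5]
print(df["new_location"].tolist())
```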

Answer 4

Given that s is the following Series:

0    |          Locations           |
1    --------------------------------
2    |W Madison Ave & S Randall Blvd|
3    |N Clemson St & E Tower Ave    |
4    |E Thompson St & S Garfield Ln |
Name: 0, dtype: object

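you can use the following regular expression (regex=True is passed explicitly, since pandas 2.0 and later treat the pattern as a literal string by default):

s.str.replace(r'(?:E|W|N|St?|Blvd|Ave|Ln)', '', regex=True)

to get: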

0    |          Locations           |
1    --------------------------------
2             | Madison  &  Randall |
3           | Clemson  &  Tower     |
4          | Thompson  &  Garfield  |
Name: 0, dtype: object
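
One caveat worth checking: the pattern above is not anchored to word boundaries, so it can also match inside a street name. The sketch below uses plain `re` on a made-up intersection (not part of the question's data) to compare the pattern as written with a variant wrapped in `\b`; the helper `clean` is hypothetical and also collapses the leftover whitespace:

```python
import re

unanchored = re.compile(r"(?:E|W|N|St?|Blvd|Ave|Ln)")
anchored = re.compile(r"\b(?:E|W|N|St?|Blvd|Ave|Ln)\b")

# Hypothetical intersection, chosen so the names start with banned letters
sample = "N Easton St & W Stewart Ave"

def clean(pattern, text):
    # Remove every match, then collapse the whitespace that is left over
    return " ".join(pattern.sub("", text).split())

print(clean(unanchored, sample))  # aston & ewart
print(clean(anchored, sample))    # Easton & Stewart
```

The anchored pattern can be passed to `s.str.replace(..., regex=True)` in the same way as the original one.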

Remove words/characters from strings in DataFrame cells?
