Question

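My DataFrame has a column named Borough that contains the following values:

"East Toronto", "West Toronto", "Central Toronto" and "West Toronto", along with other borough names.

Now I need a regular expression that gets me the data for every entry that ends with "Toronto". How can I do this?

I tried: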

tronto_data = df_toronto[df_toronto['Borough'] = .*Toronto$].reset_index(drop=True)
tronto_data.head(7)

Answer 1

If the data is consistently formatted, you can split each string on spaces, take the last word, and compare it with toronto. For example:

import pandas as pd

df = pd.DataFrame({'column': ['west toronto', 'central toronto', 'some place']})

mask_df = df['column'].str.split(' ', expand=True)

mask_df then contains:

         0        1
0     west  toronto
1  central  toronto
2     some    place

You can then use the last column to select the rows that end with toronto.

toronto_df = df[mask_df[1]=='toronto']
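Note that mask_df[1] is only the last word when every value has exactly two words. A minimal sketch, assuming values may contain any number of words, takes the last token with .str[-1] instead (the 'north east toronto' row is a made-up value added for illustration):

```python
import pandas as pd

# Sample data; 'north east toronto' is a hypothetical three-word value
df = pd.DataFrame({'column': ['west toronto', 'central toronto',
                              'some place', 'north east toronto']})

# Split on whitespace and take the last token of each row,
# regardless of how many words the value contains
last_word = df['column'].str.split().str[-1]

# Keep only the rows whose last word is 'toronto'
toronto_df = df[last_word == 'toronto'].reset_index(drop=True)
```

This avoids relying on a fixed column position in the expanded frame.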

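Edit:

I didn't know there was a string method .endswith, which is a better way to do this. However, this solution does give you the two words as separate columns, which may be useful.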

Answer 2

As @Code_10 mentioned in the comments, you can use the string method endswith. Try the following:

df = pd.DataFrame({'city': ['east toronto', 'west toronto', 'other', 'central toronto']})
df_toronto = df[df['city'].str.endswith('toronto')]
#df_toronto.head()
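Since the question asked for a regular expression, the same filter can also be sketched with str.contains and a pattern anchored at the end of the string. The sample values below are assumptions modelled on the question (capitalized names plus a missing value); case=False makes the match ignore case, and na=False treats missing values as non-matches:

```python
import pandas as pd

# Hypothetical data shaped like the Borough column in the question
df_toronto = pd.DataFrame({'Borough': ['East Toronto', 'West Toronto',
                                       'Central Toronto', 'Scarborough', None]})

# 'toronto$' matches only when the value ends with "toronto";
# case=False ignores capitalization, na=False maps missing values to False
mask = df_toronto['Borough'].str.contains(r'toronto$', case=False,
                                          regex=True, na=False)

toronto_data = df_toronto[mask].reset_index(drop=True)
```

Note that str.endswith is case-sensitive, so endswith('toronto') would not match 'East Toronto'; either lower-case the column first or use a case-insensitive pattern as above.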

How to get all values of different strings that end with a specific word
