Question

I have a list named "Stories" that looks like this:

Story
The Man
The Man Child
The Boy of Egypt
The Legend of Zelda

Is there a way to extract the last word of each string?

Something like:

Stories['Prefix'] = final['Story'].str.extract(r'([^ ]*)')

finds the prefix, but I'm not sure how to adjust it to get the last word instead.

I'd like to end up with something like this:

Story                  Suffix
The Word Of Man         Man
The Man of Legend       Legend
The Boy of Egypt        Egypt
The Legend of Zelda     Zelda

Any help is greatly appreciated!

Answer 1

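You can use .str twice: .str.split() turns each string into a list of words, and .str[-1] then selects the last element of each list: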

>>> df["Suffix"] = df["Story"].str.split().str[-1]
>>> df
                 Story Suffix
0              The Man    Man
1        The Man Child  Child
2     The Boy of Egypt  Egypt
3  The Legend of Zelda  Zelda
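
For anyone who wants to try this, here is a self-contained sketch. The DataFrame is rebuilt from the sample titles in the question, and the name df is an assumption. The rsplit variant with n=1 is an alternative not mentioned above; it splits only once from the right, so it does not build the full word list for long strings.

```python
import pandas as pd

# Sample data rebuilt from the question; the name df is an assumption.
df = pd.DataFrame({"Story": ["The Man", "The Man Child",
                             "The Boy of Egypt", "The Legend of Zelda"]})

# Split on whitespace, then take the last piece of each list.
df["Suffix"] = df["Story"].str.split().str[-1]

# Alternative: split once from the right, then take the last piece.
df["Suffix_rsplit"] = df["Story"].str.rsplit(n=1).str[-1]

print(df)
```

Both columns should hold the same values for this sample.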

Answer 2

I think split is clearer than a regex, but you can use apply to run either one on the Series.

final['Prefix'] = final['Story'].apply(lambda x: x.split()[-1])

Answer 3

To get the last word, you can build a list in which each title is one entry, then use this list comprehension to get all the suffixes:

suffixes = [item.split()[-1] for item in mylist]

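This splits each string into words and uses [-1] to take the last entry.

You can then write the result back however you like.

The list comprehension above is equivalent to: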

suffixes = []
for item in mylist:
    suffixes.append(item.split()[-1])  # item.split() gives a list of the words in the string, and [-1] takes the last word

Here is an example:

mylist = ['The Man', 'The Man Child', 'The Boy of Egypt', 'The Legend of Zelda']
suffixes = [item.split()[-1] for item in mylist]
print(suffixes)  # ['Man', 'Child', 'Egypt', 'Zelda']
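
One edge case worth noting: for an empty or whitespace-only string, item.split() returns an empty list, so [-1] raises IndexError. Here is a minimal sketch of a guard. The helper name last_word and its default argument are assumptions made for illustration, not part of the answer above.

```python
def last_word(text, default=""):
    """Return the last whitespace-separated word of text, or default if there is none."""
    words = text.split()
    return words[-1] if words else default

# Includes an empty string and a title with trailing spaces.
mylist = ['The Man', 'The Man Child', '', 'The Legend of Zelda  ']
suffixes = [last_word(item) for item in mylist]
print(suffixes)  # ['Man', 'Child', '', 'Zelda']
```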

Answer 4

I'm not sure whether there is a built-in function that does this directly. You can iterate over the strings, like this:

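df['Suffix'] = ''  # create the column before filling it row by row
col = df.columns.get_loc('Suffix')  # positional index of the new column
for i in range(len(df)):  # range replaces Python 2's xrange
    df.iat[i, col] = df['Story'].iat[i].split(' ')[-1]  # last word of row i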

Answer 5

You can extract the last word with a regex pattern:

In [10]:

df['suffix'] = df.Story.str.extract(r'((\b\w+)[\.?!\s]*$)')[0]
df
Out[10]:
                  Story  suffix
0               The Man     Man
1         The Man Child   Child
2      The Boy of Egypt   Egypt
3  The Legend of Zeldar  Zeldar

The pattern is a modified version of an answer I found here: regex match first and last word or any word
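
A somewhat simpler variant, offered as a sketch rather than a drop-in replacement: with a single capture group and expand=False, str.extract returns a Series directly, so the trailing [0] is not needed. The pattern (\w+)\W*$ is an assumption of this sketch. It captures the last run of word characters and ignores any trailing punctuation or whitespace. The sample titles with added punctuation are also made up for illustration.

```python
import pandas as pd

# Sample titles with trailing punctuation and whitespace (illustrative data).
df = pd.DataFrame({"Story": ["The Man", "The Man Child!",
                             "The Boy of Egypt ", "The Legend of Zelda."]})

# One capture group: the last run of word characters before the end of the string.
df["suffix"] = df["Story"].str.extract(r"(\w+)\W*$", expand=False)

print(df)
```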

Find the last word of each string in a list (Pandas, Python 3)
