Question

I need to get the second word of each sentence stored in a pandas column. I can easily get the first word with the following line:

df['First'] = df['Sentence'].astype(str).apply(lambda x: x.split()[0])

So why on earth does getting the second word in the same way fail:

df['Second'] = df['Sentence'].astype(str).apply(lambda x: x.split()[1])

It gives me:

IndexError: list index out of range

Answer 1

Use str.split together with str[1]; rows that have no second word get NaN:

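import pandas as pd

df = pd.DataFrame({'Sentence': ['a', 'a d', 's df sd']})
# .str[1] takes the second item of each split list and yields NaN when it is missing
df['Second'] = df['Sentence'].astype(str).str.split().str[1]
print(df)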
  Sentence Second
0        a    NaN
1      a d      d
2  s df sd     df

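Explanation of the error:

At least one sentence contains fewer than two words (for example, it has no whitespace), so x.split() returns a list with fewer than two items. Selecting the second value with x.split()[1] then raises the error, because the second element of the list does not exist.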
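To check which rows trigger the error, one option is to count the words per row before indexing. The following is a minimal sketch using the same sample data as above; the names word_counts, short_rows and second_word are made up for illustration and are not part of the original question or answer. It also shows a guarded version of the original apply, for cases where keeping the lambda-style approach is preferred over .str[1].

```python
import pandas as pd

df = pd.DataFrame({'Sentence': ['a', 'a d', 's df sd']})

# Count the whitespace-separated words in each row.
word_counts = df['Sentence'].astype(str).str.split().str.len()

# Rows with fewer than two words are the ones where x.split()[1] fails.
short_rows = df[word_counts < 2]
print(short_rows)

# Guarded variant of the original apply (hypothetical helper):
# return None instead of raising when there is no second word.
def second_word(text):
    words = text.split()
    return words[1] if len(words) > 1 else None

df['Second'] = df['Sentence'].astype(str).apply(second_word)
print(df)
```

The .str.split().str[1] form from the answer is usually the shorter choice; the guarded function is mainly useful when the per-row logic needs to grow beyond picking one word.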
