Question

I have a DataFrame with a column of text:

df['col1']

0         Anton burt 
1         fred foe hip
2         mark helm schuffer Leib

I need a new column, "col2", containing the first letter of every word in "col1". What I want is:

col1                      col2 

Anton burt                A b
fred foe hip              f f h
mark helm schuffer Leib   m h s L

How can I get this?

Answer 1

Use Series.apply: split each string on whitespace, take the first character of each word, and join the results with a space:

df['col2'] = df['col1'].apply(lambda x: ' '.join(y[0] for y in x.split()))
# alternative: a list comprehension over the column
# df['col2'] = [' '.join(y[0] for y in x.split()) for x in df['col1']]
print(df)
                      col1     col2
0               Anton burt      A b
1             fred foe hip    f f h
2  mark helm schuffer Leib  m h s L
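
As a self-contained variant of the approach above, here is a minimal sketch that builds the sample frame from the question and also tolerates missing values. The helper name `initials` and the extra `None` row are assumptions added for illustration; they are not part of the original question or answer.

```python
import pandas as pd

# Sample data from the question, plus one missing value (an added assumption).
df = pd.DataFrame(
    {"col1": ["Anton burt", "fred foe hip", "mark helm schuffer Leib", None]}
)

def initials(text):
    """Return the first character of each whitespace-separated word."""
    # Missing values are not strings, so pass them through unchanged.
    if not isinstance(text, str):
        return text
    return " ".join(word[0] for word in text.split())

df["col2"] = df["col1"].apply(initials)
print(df)
```

Without the `isinstance` check, the lambda from the answer would raise an AttributeError on a missing value, because `None` and `NaN` have no `split` method.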

Answer 2

Alternatively, you can use series.str.findall() with the regular expression (\b[a-zA-Z]) to find the first letter of each word, then s.str.join() to combine the matches:

df['col2'] = df.col1.str.findall(r'(\b[a-zA-Z])').str.join(' ')
# or: df = df.assign(col2=df.col1.str.findall(r'(\b[a-zA-Z])').str.join(' '))

                      col1     col2
0               Anton burt      A b
1             fred foe hip    f f h
2  mark helm schuffer Leib  m h s L
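
The regex approach and the split approach agree on the sample data, but they can differ on other input, because `\b` also matches after punctuation such as a hyphen and `[a-zA-Z]` skips words that begin with a digit. The sketch below compares the two; the strings "jean-luc picard" and "3d model" are made-up inputs chosen only to show the difference.

```python
import pandas as pd

# First row is from the question; the other two are hypothetical edge cases.
s = pd.Series(["Anton burt", "jean-luc picard", "3d model"])

# Regex: a letter that directly follows a word boundary.
by_regex = s.str.findall(r"\b[a-zA-Z]").str.join(" ")

# Split: the first character of each whitespace-separated token.
by_split = s.apply(lambda x: " ".join(w[0] for w in x.split()))

print(by_regex.tolist())  # ['A b', 'j l p', 'm']
print(by_split.tolist())  # ['A b', 'j p', '3 m']
```

Which behavior is preferable depends on the data: the regex treats "luc" in "jean-luc" as its own word and ignores tokens that start with a digit, while the split version keeps exactly one character per space-separated token.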

How to get the first letter of each word in a string
