Limit the number of words in a DataFrame column

Time: 2019-12-13 11:30:33

Tags: python-3.x pandas dataframe split

My DataFrame looks like this:

      Abc                       XYZ 
0  Hello   How are you doing today
1   Good                 This is a
2    Bye                   See you
3  Books  Read chapter 1 to 5 only

With max_size = 3, I want to truncate the column (XYZ) to at most 3 words (max_size). Rows that contain fewer than max_size words should be left unchanged.

Desired output:

     Abc                       XYZ
0  Hello               How are you
1   Good                 This is a
2    Bye                   See you
3  Books            Read chapter 1

1 answer:

Answer 0: (score: 3)

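Split with a limit of max_size splits, so each list holds at most max_size words plus one trailing remainder, slice off that remainder, then join the list back together: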

max_size = 3

df['XYZ'] = df['XYZ'].str.split(n=max_size).str[:max_size].str.join(' ')
print(df)
     Abc             XYZ
0  Hello     How are you
1   Good       This is a
2    Bye         See you
3  Books  Read chapter 1
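
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question; the intermediate names `parts` and `truncated` are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Abc": ["Hello", "Good", "Bye", "Books"],
    "XYZ": [
        "How are you doing today",
        "This is a",
        "See you",
        "Read chapter 1 to 5 only",
    ],
})
max_size = 3

# n=max_size allows at most max_size splits, so each list has at most
# max_size words followed by one unsplit remainder (if any text is left).
parts = df["XYZ"].str.split(n=max_size)

# Keep the first max_size items (dropping the remainder) and rejoin them.
truncated = parts.str[:max_size].str.join(" ")

df["XYZ"] = truncated
print(df)
```

Limiting the split with `n` is optional for correctness, since slicing alone would also keep only the first words, but it avoids splitting the rest of a long string that is about to be discarded.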

Another solution, using a lambda function:

df['XYZ'] = df['XYZ'].apply(lambda x: ' '.join(x.split(maxsplit=max_size)[:max_size]))
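
One caveat with the lambda version: it calls `x.split` on every value, so a missing value (NaN, which is a float) raises an AttributeError. If the column may contain missing values, a small guard helps. The sketch below is one possible way to do that; the helper name `truncate_words` and the sample Series are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

def truncate_words(text, max_size):
    """Return the first max_size whitespace-separated words of text.

    Hypothetical helper: non-string values (e.g. NaN) are returned unchanged.
    """
    if not isinstance(text, str):
        return text
    return " ".join(text.split(maxsplit=max_size)[:max_size])

# Illustrative data with a missing value in the middle.
s = pd.Series(["How are you doing today", np.nan, "See you"])
result = s.apply(lambda x: truncate_words(x, 3))
print(result)
```

The `.str` accessor version in the first solution already propagates NaN on its own, so this guard is only needed for the `apply` approach.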