Question

I am trying to limit the number of characters displayed in the DataFrame output.

Here is a sample DataFrame:

     Abc                       XYZ
0  Hello   How are you doing today
1   Good   This is a job well done
2    Bye          See you tomorrow
3  Books  Read chapter 1 to 5 only

Desired output:

     Abc                       XYZ
0  Hello                   How are 
1   Good                   This is
2    Bye                   See you
3  Books              Read chapter

This is what I have tried:

pd.set_option('display.max_info_rows', 2)
pd.set_option('display.max_info_columns', 2)
pd.set_option('display.max_colwidth', 2)

max_info_rows and max_info_columns have no effect, while max_colwidth actually expands the displayed characters even further.

Is there a way to limit the number of characters shown in a DataFrame?

Thanks!

Answer 1

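Getting pandas to display only the first two words of each string is tricky. Fundamentally, strings in Python have no built-in concept of separate "words". What you can do is split each string into a list of strings (one string per word) and then use the 'display.max_seq_items' option to limit the number of list items that pandas prints: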

import pandas as pd
from io import StringIO

d = '''     Abc                       XYZ
0  Hello   "How are you doing today"
1   Good   "This is a job well done"
2    Bye          "See you tomorrow"
3  Books  "Read chapter 1 to 5 only"'''

# io.StringIO replaces pd.compat.StringIO, which newer pandas versions no longer provide
df = pd.read_csv(StringIO(d), sep=r'\s+')

# convert the XYZ values from str to list of str
df['XYZ'] = df['XYZ'].str.split()

# only display the first 2 values in each list of word strings
with pd.option_context('display.max_seq_items', 2):
    print(df)

Output:

     Abc                   XYZ
0  Hello       [How, are, ...]
1   Good       [This, is, ...]
2    Bye       [See, you, ...]
3  Books  [Read, chapter, ...]
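
If the XYZ column should stay a plain string column, a display-only variant may be worth considering. The sketch below assumes the data is built directly as a DataFrame and uses a hypothetical helper, first_two_words, that is not part of pandas. It passes that helper through the formatters argument of DataFrame.to_string, so only the rendered text is shortened:

```python
import pandas as pd

df = pd.DataFrame({
    'Abc': ['Hello', 'Good', 'Bye', 'Books'],
    'XYZ': ['How are you doing today', 'This is a job well done',
            'See you tomorrow', 'Read chapter 1 to 5 only'],
})

def first_two_words(text):
    # Hypothetical helper: keep at most the first two whitespace-separated words.
    return ' '.join(text.split()[:2])

# formatters changes only the rendered text; the values stored in df are untouched.
rendered = df.to_string(formatters={'XYZ': first_two_words})
print(rendered)
```

Because nothing is reassigned, the full strings remain available for later processing.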

Answer 2

Try this:

df.XYZ.apply(lambda x : x.rsplit(maxsplit=len(x.split())-2)[0])

0         How are
1         This is
2         See you
3    Read chapter

Then simply reassign the result:

df.XYZ = df.XYZ.apply(lambda x : x.rsplit(maxsplit=len(x.split())-2)[0])
print(df)

     Abc           XYZ
0  Hello       How are
1   Good       This is
2    Bye       See you
3  Books  Read chapter
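
As a possible alternative to the apply/rsplit approach, the same result can be sketched with the vectorised .str accessor. The variable names first_two and first_seven_chars are illustrative only, and the limit of 7 characters is an arbitrary assumption chosen for the example. The second expression shows truncation by character count rather than by word count, which is what the question literally asks about:

```python
import pandas as pd

df = pd.DataFrame({
    'Abc': ['Hello', 'Good', 'Bye', 'Books'],
    'XYZ': ['How are you doing today', 'This is a job well done',
            'See you tomorrow', 'Read chapter 1 to 5 only'],
})

# Word-based: split into word lists, keep the first two items, join with spaces.
first_two = df['XYZ'].str.split().str[:2].str.join(' ')

# Character-based: keep only the first 7 characters of each string.
first_seven_chars = df['XYZ'].str.slice(0, 7)

print(first_two)
print(first_seven_chars)
```

Note that a fixed character limit can cut a word in half, whereas the word-based version always ends on a word boundary.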

How to limit the number of strings (characters/words) in a column of a DataFrame
