Question

I'm using pandas lineSer.value_counts() to build a frequency table, but it doesn't display all of my items. I have more than 100 data points and I need to see all of them.


This is the code I'm using. I need to get rid of the '...' in the result and see all the data points. What can I do differently?

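import pandas as pd

def freqTable():
    # 'with' closes both files, so the output is flushed to disk
    with open('data.txt', 'r') as fileIn, open('dataOut.txt', 'w') as fileOut:
        # keep non-blank lines that do not start with 'com'
        lines = [line.strip() for line in fileIn if line.strip() and not line.startswith('com')]
        lineSer = pd.Series(lines)
        # str() goes through the display settings, which is where the '...' comes from
        freq = str(lineSer.value_counts())
        # freq is a string, so this loop writes it one character at a time
        for line in freq:
            fileOut.write(line)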

Answer 1

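If you want to write the result to a file, don't convert it to a string and then write that string out. pandas has built-in functions for writing to files. Just do lineSer.value_counts().to_csv('dataOut.txt'). If you want to adjust the format of the output, read the documentation for to_csv to see how to customize it. (You could also read your data more efficiently with something like pandas.read_csv, but that's another topic.)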
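As a minimal sketch of that suggestion: the labels and the temporary output path below are made up for illustration, and header=False is an assumption made here so the file contains only label,count rows.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the cleaned lines of data.txt.
lines = ["alpha", "beta", "alpha", "gamma", "alpha", "beta"]

counts = pd.Series(lines).value_counts()

# to_csv writes every row; the display.max_rows limit only affects printing.
out_path = os.path.join(tempfile.mkdtemp(), "dataOut.txt")
counts.to_csv(out_path, header=False)

with open(out_path) as f:
    written_rows = f.read().splitlines()

print(written_rows)
```

Because nothing is converted through str(), the file never contains the '...' placeholder, however many distinct values there are.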

Answer 2

If you only need to display the data temporarily, raise display.max_rows.

Try option_context:

# temporarily print up to 999 rows
with pd.option_context('display.max_rows', 999):
    print(freq)

More information is in the docs.
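To see the effect in isolation, here is a small sketch with made-up data (150 hypothetical labels, which is more than the default row limit). It compares the text pandas produces outside and inside the option_context block.

```python
import pandas as pd

# Hypothetical data: 150 distinct labels, each occurring once.
labels = ["item{:03d}".format(i) for i in range(150)]
freq = pd.Series(labels).value_counts()

# With the default settings the middle rows are elided.
truncated = repr(freq)

# The higher limit applies only inside the with-block.
with pd.option_context('display.max_rows', 999):
    full = repr(freq)

# Outside the block the previous setting is back in force.
after = repr(freq)

print(len(truncated.splitlines()), len(full.splitlines()))
```

The same strings are what print(freq) and str(freq) would emit, so wrapping the str() call from the question in the with-block should also give the untruncated text.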

I tried modifying your solution to handle the string data with the str.strip and str.startswith functions, and to write the output to a file with to_csv:

import pandas as pd
import io

temp=u"""Madding.
 Madding.
  Madding.
 Madding.
 Crowning.
   Crowning.
 com Crowning.
com My. 
  com And.
   Thy.
Thou.
The."""
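# after testing, replace io.StringIO(temp) with 'data.txt'
# note: the first line is consumed as the header (hence "Name: Madding." below);
# pass header=None to read_csv to keep it as data
# squeeze=True was removed in pandas 2.0; .squeeze("columns") is the equivalent
s = pd.read_csv(io.StringIO(temp), sep="|").squeeze("columns")
print(s)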
0           Madding.
1           Madding.
2           Madding.
3          Crowning.
4          Crowning.
5      com Crowning.
6           com My. 
7           com And.
8               Thy.
9              Thou.
10              The.
Name: Madding., dtype: object

# strip surrounding whitespace
s = s.str.strip()

# flag the values that start with 'com'
print(s.str.startswith('com'))
0     False
1     False
2     False
3     False
4     False
5      True
6      True
7      True
8     False
9     False
10    False
Name: Madding., dtype: bool

# keep only the rows that do not start with 'com'
s = s[~s.str.startswith('com')]
print(s)
0      Madding.
1      Madding.
2      Madding.
3     Crowning.
4     Crowning.
8          Thy.
9         Thou.
10         The.
Name: Madding., dtype: object

# count frequencies
freq = s.value_counts()

# temporarily print up to 999 rows
with pd.option_context('display.max_rows', 999):
    print(freq)
Madding.     3
Crowning.    2
Thou.        1
Thy.         1
The.         1
Name: Madding., dtype: int64

# write the Series to a file with to_csv
freq.to_csv('dataOut.txt', sep=';')
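Note that in the run above the first "Madding." line became the Series name ("Name: Madding.") and so was not counted. The sketch below wraps the same steps in one function and passes header=None so the first line stays in the data. The function name freq_table, its parameters, and the short sample text are hypothetical, chosen only for illustration.

```python
import io

import pandas as pd


def freq_table(source, out_path=None):
    """Count stripped lines that do not start with 'com'."""
    # header=None keeps the first line as data instead of a column name.
    s = pd.read_csv(source, sep="|", header=None).squeeze("columns")
    s = s.str.strip()
    freq = s[~s.str.startswith('com')].value_counts()
    if out_path is not None:
        # Writes every row, regardless of display settings.
        freq.to_csv(out_path, sep=';', header=False)
    return freq


# Hypothetical miniature input in the same shape as the sample above.
sample = u"""Madding.
 Madding.
com My.
Thou."""

freq = freq_table(io.StringIO(sample))
print(freq)
```

Passing a file path such as 'data.txt' as source, and an output path as out_path, would reproduce the original task end to end.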

Answer 3

Try this:

pd.options.display.max_rows = 999
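Unlike option_context, this assignment changes the setting for the rest of the session. A short sketch of setting it and then restoring the default, assuming a fresh session where the option has not been changed yet:

```python
import pandas as pd

# Remember the value in effect before the change.
default_rows = pd.get_option('display.max_rows')

# Applies to all printing from here on.
pd.options.display.max_rows = 999
raised = pd.get_option('display.max_rows')

# Put the option back to the pandas default.
pd.reset_option('display.max_rows')
restored = pd.get_option('display.max_rows')

print(default_rows, raised, restored)
```

If the higher limit is only needed for one print call, the option_context form from Answer 2 avoids having to reset anything.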

Frequency table in Python using pandas

3 answers: