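Counting tokens in a text file

Date: 2015-11-15 23:11:13

Tags: python text file-io

I am working on a program in which I need to count every token (letters, numbers, symbols, etc.) in a text file, but when I try to use the len function on the whole file, it fails with TypeError: object of type '_io.TextIOWrapper' has no len()

My question is basically how to count every token in a text file.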

def getCountry(Filename):
    o = open((Filename),'r')

    return o
def menu(Option,Entry):

    if Option == 'A':


        J = len(Entry)

        return J

    if Option == 'B':
        num = 0
        for line in Entry:
            found = sum(line.count(xx) for xx in ('and','del','from','not','while','as','elif', 'global','or','with','assert','else', 'if','pass','yield','break','except','import','print','class','exec','in','rise','continue','finally','is', 'return', 'def', 'for', 'lambda', 'try'))
            num = line.split()
            num2 = len(num)


        Per = ("%.2f" % (found/num2*100))
        Convert = (Per,"Percent")
        return Convert
    if Option == 'C':
        num_chars = 0
        for line in Entry:
            found = sum(line.count(xx)for xx in ('+','-','*','/','.','!','@','#','$','%','^','&','(',')','=','?'))
            num_chars= len(line)
        Per = found/num_chars
        return Per
    if Option == 'E':
        for line in Entry:
            Space = line.count(' ')
        return Space



def main():
    Filename = input('Input filename: ')
    Entry = getCountry(Filename)

    print('Now from the options below choose a letter )# list of choices')
    print('A)Counts the number of tokens (word, symbols, numbers)')
    print('B)Counts the number of selected Python key word (e.g. if, while, …)')
    print('and returns the % of tokens that are key)')
    print('C)Counts the number of selected programming symbols (e.g. +, : , …) and returns the % of tokens that are symbols')
    print('D)Receives the metrics for a program and returns a score that compares the two programs metrics')
    print('E) Counts all of the white spaces in program')
    Option = input('Enter Letter for option: ')
    while Option not in ('A', 'B', 'C','D','E'):#input validation statement
        Option = str(input('Enter Capital Letter: '))
    Answer2 = menu(Option, Entry)
    print(Answer2)

main()

2 Answers:

Answer 0: (score: 0)

You need to open the file with open() and then call its read() method.

For example:

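# Your file is called "ex.txt"
with open("ex.txt", "r") as openit:  # "r" opens the file for reading; "with" closes it afterwards
    readit = openit.read()  # read() is a method of the file object and returns a str
print(len(readit))  # number of characters in the file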

I guess this will give you the result you want, though I don't know whether the len() function handles every kind of character (such as "À" and "ö").
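On the doubt about accented characters: in Python 3 a file opened in text mode is read as a str, and len() on a str counts Unicode code points, so "À" and "ö" each count as one character when stored in precomposed form. The following is a minimal sketch under that assumption; the helper name count_characters, the UTF-8 encoding, and the temporary demo file are illustrative and not part of the original answer.

```python
import os
import tempfile

def count_characters(filename, encoding="utf-8"):
    """Return the number of characters (Unicode code points) in a text file."""
    with open(filename, "r", encoding=encoding) as f:
        return len(f.read())

# Demo on a throwaway file containing "À ö x" (written with escapes so the
# precomposed code points are unambiguous).
with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt",
                                 delete=False) as tmp:
    tmp.write("\u00c0 \u00f6 x")
    demo_path = tmp.name

demo_count = count_characters(demo_path)
os.remove(demo_path)
print(demo_count)  # 5: two accented letters, two spaces, one "x"
```

Note that this counts characters, including spaces and newlines, rather than words or Python tokens.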

Answer 1: (score: 0)

To count all the Python tokens in a file, one concise way is:

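import io
import tokenize

def count_tokens(filename):
    i = 0
    # In Python 3, tokenize.tokenize needs a readline callable that returns bytes,
    # hence the binary mode.
    with io.open(filename, 'rb') as f:
        for i, t in enumerate(tokenize.tokenize(f.readline), 1):
            pass
    return i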

You seem to want to do many more different things in your program, but this is what you asked about in the subject of your question.

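The tokenize.tokenize generator yields 5-tuples: the token type, followed by the token string, two pairs giving the (row, column) coordinates of the token's start and end, and finally the logical line in which the token occurs. (It is a named tuple, so you can also access the 5 items by symbolic names.)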
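To make the shape of those tuples concrete, here is a small sketch that tokenizes an in-memory line with the standard-library tokenize module and reads the fields by name (type, string, start, end, line). The sample source string is an assumption chosen for illustration.

```python
import io
import tokenize

# tokenize.tokenize wants a readline callable that returns bytes,
# so an in-memory BytesIO stands in for a file opened in 'rb' mode.
source = b"x = 1 + 2\n"
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

for tok in tokens:
    # tok_name maps the numeric token type to a readable name such as NAME or OP.
    print(tokenize.tok_name[tok.type], repr(tok.string), tok.start, tok.end)

# The first token only reports the detected encoding; the first "real" token follows.
first = tokens[1]
print(first.string, first.start, first.end, repr(first.line))
```

The stream also contains bookkeeping tokens (the encoding, NEWLINE, ENDMARKER), so a raw count of everything the generator yields is somewhat higher than the number of visible tokens.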

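Some of your other tasks may benefit from this too, since they only require examining each token: you can filter the token generator with a predicate via filter (or, in Python 2.7, itertools.ifilter).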
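As a rough sketch of that filtering idea, assuming Python 3: keywords arrive as NAME tokens, so a predicate can combine the token type with keyword.iskeyword, while operators and punctuation arrive as OP tokens. The helper names is_keyword, is_operator, and count_matching, and the sample source, are hypothetical and only illustrate the approach.

```python
import io
import keyword
import tokenize

def is_keyword(tok):
    """True for NAME tokens whose text is a Python keyword."""
    return tok.type == tokenize.NAME and keyword.iskeyword(tok.string)

def is_operator(tok):
    """True for operator and punctuation tokens such as '+', '=' or ':'."""
    return tok.type == tokenize.OP

def count_matching(source_bytes, predicate):
    """Count the tokens in source_bytes for which predicate(token) is true."""
    readline = io.BytesIO(source_bytes).readline
    return sum(1 for _ in filter(predicate, tokenize.tokenize(readline)))

sample = b"if x and not y:\n    z = x + 1\n"
keyword_count = count_matching(sample, is_keyword)    # if, and, not
operator_count = count_matching(sample, is_operator)  # ':', '=', '+'
print(keyword_count, operator_count)
```

Unlike str.count on each line, this does not miscount a keyword that appears inside a longer identifier or inside a string literal.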

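However, Stack Overflow's best practice is "one issue per question", so I encourage you to work from these hints and, if something specific (other than counting all the tokens) trips you up, open a separate question showing the code you have, what you expected from it, and what happened instead (on a small example file).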