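I am developing a program in which I need to count every token (letters, digits, symbols, and so on) in a text file, but when I try to use the len function on the whole file, I get TypeError: object of type '_io.TextIOWrapper' has no len()
My question is basically: how do I count every token in a text file?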
def getCountry(Filename):
    o = open(Filename, 'r')
    return o
def menu(Option, Entry):
    if Option == 'A':
        J = len(Entry)  # raises the TypeError: Entry is a file object, not a string
        return J
    if Option == 'B':
        num = 0
        for line in Entry:
            found = sum(line.count(xx) for xx in (
                'and', 'del', 'from', 'not', 'while', 'as', 'elif', 'global',
                'or', 'with', 'assert', 'else', 'if', 'pass', 'yield', 'break',
                'except', 'import', 'print', 'class', 'exec', 'in', 'raise',
                'continue', 'finally', 'is', 'return', 'def', 'for', 'lambda',
                'try'))
            num = line.split()
            num2 = len(num)
            Per = ("%.2f" % (found/num2*100))
            Convert = (Per, "Percent")
        return Convert
    if Option == 'C':
        num_chars = 0
        for line in Entry:
            found = sum(line.count(xx) for xx in ('+','-','*','/','.','!','@','#','$','%','^','&','(',')','=','?'))
            num_chars = len(line)
            Per = found/num_chars
        return Per
    if Option == 'E':
        for line in Entry:
            Space = line.count(' ')
        return Space
def main():
    Filename = input('Input filename: ')
    Entry = getCountry(Filename)
    print('Now from the options below choose a letter')  # list of choices
    print('A)Counts the number of tokens (word, symbols, numbers)')
    print('B)Counts the number of selected Python key word (e.g. if, while, …)')
    print('and returns the % of tokens that are key words')
    print('C)Counts the number of selected programming symbols (e.g. +, : , …) and returns the % of tokens that are symbols')
    print('D)Receives the metrics for a program and returns a score that compares the two programs metrics')
    print('E) Counts all of the white spaces in program')
    Option = input('Enter Letter for option: ')
    while Option not in ('A', 'B', 'C', 'D', 'E'):  # input validation statement
        Option = str(input('Enter Capital Letter: '))
    Answer2 = menu(Option, Entry)
    print(Answer2)

main()
Answer 0 (score: 0)
You need to use the open() function and the read() method of the file object it returns.
For example:
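# Your file is called "ex.txt"
openit = open("ex.txt", "r")  # open() with "r" opens the file for reading.
readit = openit.read()  # read() is a method of the file object; it returns the whole file as one string.
openit.close()
print(len(readit))  # number of characters in the file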
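I think this will give you the result you want, although I am not sure whether the len() function handles every kind of character (such as "À" and "ö").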
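On the doubt about accented characters: in Python 3, len() on a str counts Unicode code points, so a precomposed "À" or "ö" counts as one character each, provided the file is decoded with the right encoding. Here is a minimal sketch, assuming the file is UTF-8; the helper name count_characters and the temporary demo file are made up for illustration.

```python
import os
import tempfile

def count_characters(filename, encoding="utf-8"):
    """Return the number of characters (code points) in a text file."""
    # Hypothetical helper; the UTF-8 default is an assumption about the file.
    with open(filename, "r", encoding=encoding) as f:
        return len(f.read())

# Demonstration on a temporary file: "À", space, "ö", space, "x", newline.
with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt",
                                 delete=False) as tmp:
    tmp.write("\u00c0 \u00f6 x\n")
    demo_path = tmp.name

demo_count = count_characters(demo_path)
os.remove(demo_path)
print(demo_count)  # 6 characters, although the file holds 8 bytes
```

If the file were opened in binary mode instead, len() would count bytes, and the two accented letters would each count as two.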
Answer 1 (score: 0)
To count all the Python tokens in a file, one concise approach is:
import io
import tokenize

def count_tokens(filename):
    i = 0
    # tokenize.tokenize expects a readline that returns bytes, hence 'rb'.
    with io.open(filename, 'rb') as f:
        for i, t in enumerate(tokenize.tokenize(f.readline), 1):
            pass
    return i
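You seem to want to do several other things in your program, but this is what you asked in the title of your question.
The tokenize.tokenize generator yields 5-tuples: the token type, then the token string, then two (row, column) pairs giving the coordinates where the token starts and ends, and finally the logical line on which the token occurs (it is a named tuple, so you can also access the five items by symbolic name).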
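To make the shape of those tuples concrete, here is a small sketch that tokenizes a made-up one-line source held in memory and prints the fields by name. The sample source is an assumption chosen only for illustration.

```python
import io
import tokenize

source = b"x = 1 + 2\n"  # made-up sample input

# tokenize.tokenize expects a readline callable that returns bytes.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

for tok in tokens:
    # Each item is a TokenInfo named tuple: type, string, start, end, line.
    print(tokenize.tok_name[tok.type], repr(tok.string), tok.start, tok.end)
```

Note that the stream also contains bookkeeping tokens such as ENCODING, NEWLINE and ENDMARKER, so a raw count like the one above includes them.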
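Some of your other tasks may benefit from this too, since they only require checking each token: you can filter the token generator with a predicate via filter (or, in Python 2.7, itertools.ifilter).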
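As a rough sketch of that filtering idea, the predicates below pick out keyword tokens and operator tokens. The predicate names and the sample source are made up, and using the standard keyword module instead of a hand-written keyword list is a substitution of mine, not something from the question.

```python
import io
import keyword
import tokenize

def is_keyword(tok):
    """Predicate: True for NAME tokens that are Python keywords."""
    return tok.type == tokenize.NAME and keyword.iskeyword(tok.string)

def is_operator(tok):
    """Predicate: True for operator and punctuation tokens."""
    return tok.type == tokenize.OP

source = b"if x:\n    y = x + 1\n"  # made-up sample input
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

# filter() keeps only the tokens for which the predicate returns True.
keyword_count = sum(1 for _ in filter(is_keyword, tokens))
operator_count = sum(1 for _ in filter(is_operator, tokens))
print(keyword_count, operator_count)  # 1 keyword (if), 3 operators (: = +)
```

Dividing either count by the total number of tokens would give the percentages asked about in options B and C, once you decide whether bookkeeping tokens such as NEWLINE and INDENT belong in the total.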
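However, StackOverflow best practice is "one question per question", so I encourage you to work from these hints and, if something specific (other than counting all tokens) trips you up, open a separate question showing the code you have, what you expect of it, and what happens instead (on a small example file).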