How to exclude certain strings from the count in a text-document word counter program

Time: 2018-04-11 01:55:09

Tags: python

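My program counts the words successfully, but it also counts each "--" in the document as a word, and I would like it not to count that quoted string. I end up with 277 words when the result should be 272.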

infile = open("Gettysburg.txt", "r")
data = infile.readlines()
nwords = 0
lines = 0
nchars = 0
for line in data:
    words = line.split()
    lines += 1
    nwords += len(words)
    nchars += len(line)
print("Jake's word calculator.")
print('The number of words is', nwords)

2 answers:

Answer 0 (score: 1)

You can use a list comprehension:

nwords += len([i for i in words if i != "--"])

This builds a new list from the words list. Only words that are not equal to "--" go into the new list. len() then returns the length of that new list.

Here is another way, which subtracts the number of "--" entries from the length of the list:

nwords += len(words) - words.count("--")
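
As a quick check, the sketch below applies both expressions to a short in-memory sample instead of Gettysburg.txt and shows that they give the same count. The sample text and the two function names are made up for illustration and are not part of the original program.

```python
# Hypothetical sample text; the real program reads Gettysburg.txt instead.
sample = (
    "we can not dedicate -- we can not consecrate\n"
    "-- we can not hallow -- this ground"
)


def count_with_comprehension(text):
    """Count words per line, filtering out "--" with a list comprehension."""
    nwords = 0
    for line in text.splitlines():
        words = line.split()
        nwords += len([i for i in words if i != "--"])
    return nwords


def count_with_subtraction(text):
    """Count words per line, then subtract the number of "--" tokens."""
    nwords = 0
    for line in text.splitlines():
        words = line.split()
        nwords += len(words) - words.count("--")
    return nwords


print(count_with_comprehension(sample))  # 14
print(count_with_subtraction(sample))    # 14
```

Both versions only skip tokens that are exactly "--" after splitting on whitespace. A dash attached directly to a word, such as "ground--", is still counted as one word.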

Answer 1 (score: 0)

infile = open("Gettysburg.txt", "r")
data = infile.readlines()
infile.close()
nwords = 0
lines = 0
nchars = 0
for line in data:
    lines += 1
    words = []  # words on this line, excluding "--"
    wordswithlines = line.split()
    for i in wordswithlines:
        if i == "--":
            print("-- found")  # if you are using Python 2.7, use print "-- found"
        else:
            words.append(i)
    nwords += len(words)
    nchars += len(line)
print("Jake's word calculator.")
print("The number of words is", nwords)
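
The same loop can also be folded into a small helper. This is only a sketch: the name count_words and its ignored parameter are hypothetical, and it assumes that every token to skip stands alone between whitespace.

```python
def count_words(lines, ignored=("--",)):
    """Count whitespace-separated words in an iterable of lines,
    skipping any word that appears in `ignored`."""
    skip = set(ignored)
    return sum(1 for line in lines for word in line.split() if word not in skip)


# In-memory demonstration with made-up lines.
demo = ["four score -- and seven\n", "-- years ago\n"]
print("The number of words is", count_words(demo))

# Usage with the file from the question (an open file yields its lines):
# with open("Gettysburg.txt", "r") as infile:
#     print("The number of words is", count_words(infile))
```

Passing the strings to skip as a parameter makes it easy to ignore other tokens later without touching the loop.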