统计非文章词

时间:2018-06-09 18:10:56

标签: python python-3.x

尝试 def countnonarticlewords 来计算mobydick文本文件中的非文章字数。我不想指的是' a''' '。有关如何修改 def countnonarticlewords 以输出示例#1的任何提示。目前正在输出例#2

示例#1

******* mobydick.txt ******* 
Total Lines: 15604
Total Chars: 512293
Total Words: 115314
**Total Non-Article Words: 105479**

示例#2

 Enter the test file name: mobydick.txt
*****mobydick.txt*****
Total Lines: 15604
Total Chars: 512293
Total Words: 115314
**Total Non-Article Words: 15604**  

到目前为止我所拥有的:

def getTextFile():
        filename=input("Enter the test file name: ")
        textFile=open(filename, 'r')
        return filename,textFile

def outputcountresults(filename, linecount, charcount, wordcount, nonarticlewordcount):
    print("*****{}*****".format(filename))
    print("Total Lines: {}".format(linecount))
    print("Total Chars: {}".format(charcount))
    print("Total Words: {}".format(wordcount))
    print("Total Non-Article Words: {}".format(nonarticlewordcount))

def countcharacters(line):
    charcount=0
    for c in line:
        if not c.isspace():
            charcount= charcount +1
    return charcount

def countnonarticlewords(line):
    nonarticlewords= line.split()
    nonarticlewords=0
    for nonarticle in line:
        #nonarticlewords= line.split()
        if not 'a' or 'an' or 'the':
            nonarticlewords= nonarticlewords +1
    return len(nonarticle)

def countwords(line):
    words= line.split()
    return len(words)

def countdocstats(docFile):
    linecount=0
    totalcharacters=0
    totalwords=0
    totalnonarticlewords=0
    for line in docFile:
        linecount= linecount + 1
        totalwords= totalwords + countwords(line)
        totalcharacters=totalcharacters+countcharacters(line)
        totalnonarticlewords= totalnonarticlewords + countnonarticlewords(line)
    return linecount, totalcharacters, totalwords, totalnonarticlewords

def main():
    filename, textFile=getTextFile()
    linecount, totalcharacters, totalwords, totalnonarticlewords= countdocstats(textFile)

    outputcountresults(filename,linecount,totalcharacters,totalwords,totalnonarticlewords)
main()

0 个答案:

没有答案