Python homework help: problems counting integers, splitting, and returning the most/least common words in a text file

Time: 2014-11-29 06:50:58

Tags: python string file integer

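I'm having a lot of trouble with my homework, and unfortunately my grasp of the concept isn't as strong as everyone else's. I have written most of the code and the idea is clear, but my syntax is obviously wrong. According to the assignment, this is what I have to do: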


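Students will write a program that:

  1. Accepts user input of the name of a file to process. If the named file does not exist, appropriate error handling will occur and the name of the file will be requested again. This will repeat until either a valid file name is entered or the string "ALL DONE" is entered as the file name. The program will assume the named file is an ordinary text file and not a pickle file. Use the readline() function, not readlines(), to read the file.

  2. The file will be processed ONE line at a time:

    a. Certain characters will be removed from the input. Using the string module, the following statement defines the characters to remove: (string.punctuation + string.whitespace).replace(' ', '')

    b. Once the specified characters are removed from the input, the remainder of the input line will be split on "word" boundaries, where words are separated by the space character (' ').

    c. Each "word" of the processed input will be stored in a dictionary as a key, where the value is the number of times the word occurs in the input. However, if the "word" is an integer, it will not be stored in the dictionary; instead it will be summed so that the sum of all integers in the processed file can be displayed.

  3. After a file has been processed, the following information will be displayed:

    a. The sum of all the integers in the file

    b. The 5 most common words in the file

    c. The 5 least common words in the file.

    Note that there may well be more than 5 words that are "least common" in the file. In that case, print any 5 of those least common words. For example, if there are 7 words with a frequency of 1, listing any 5 of them is sufficient, but list only 5.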


So, I wrote my code to the best of my ability. My code is:

    #creates text file
    def create_text():
       with open("hw5speech.txt", "wt") as out_file:
        out_file.write(
            """
        Lincoln's Gettysburg Address
        Given November 19, 1863
        Near Gettysburg, Pennsylvania, USA
    
    
        Four score and seven years ago, our fathers brought forth upon this continent a new nation:     conceived in liberty, and dedicated to the proposition that all men are created equal.
    
        Now we are engaged in a great civil war ... testing whether that
        nation, or any nation so conceived and so dedicated ... can long
        endure. We are met on a great battlefield of that war.
    
        We have come to dedicate a portion of that field as a final resting place for those who here     gave their lives that that nation might live. It is altogether fitting and proper that we should do this.
    
        But, in a larger sense, we cannot dedicate ... we cannot consecrate ... we cannot hallow this     ground. The brave men, living and dead, who struggled here have consecrated it, far above our poor     power to add or detract. The world will little note, nor long remember, what we say here, but it can     never forget what they did here.
    
        It is for us the living, rather, to be dedicated here to the unfinished work which they who   fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us ... that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion ... that we here highly resolve that these dead shall not have died in vain ... that this nation, under God, shall have a new birth of freedom ... and that government of the people ... by the people ... for the people ... shall not perish from the  earth.
          """
         )
        out_file.close()
    
    #user input to read a text file
    def user_input():
        done = False
        while not done:
            file_prompt = input("What file would you like to open? (name is hw5speech.txt) \
            \n(Or, enter ALL DONE to exit) ")
            if file_prompt == "ALL DONE":
                done = True
            else:
                try:
                    text_file = open(file_prompt, "rt")
                    return text_file
                    done = True
                #required exception handling
                except IOError:
                    print("\nThat is not a valid file name. ")
                    print()
    
    #read and modify file
    def read_file():
        import string
        text_file = user_input()
        for line in text_file.readline():
            myList = line.split(string.punctuation + string.whitespace)#.replace('', " ")
            #myList.split('')                                            
    
        #store words and integer count
        int_count = 0
    
        #finds if word is an integer and then adds to count of integers
        def top_integers():
            int_count = 0
            for word in myList:
               test = word.isdigit()
               if test is True:
                   int_count += 1
    
                print("The total of all integers is: ", int_count)
    
        #finds the 5 most common words
        from collections import Counter
        def t5():
            t5count = Counter(myList.split())
            top5 = t5count.most_common(5)
    
            print("The 5 most common words are: ")
            for i in top5:
                print(i[0]) #should only print the word and not the count
    
        #finds the 5 least common words
        def l5():
            l5count = Counter(myList.split())
            least5 = l5count.least_common(5)
    
            print("The 5 least common words are: ")
            for i in least5:
                print(i[0])
    
        #calls the above functions
        top_integers()
        t5()
        l5()
    
    #main function of program
    def final_product():
        create_text()
        read_file()
    
    final_product()
    input("Press Enter to exit.")
    

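So, when I run the code, I enter the file name (hw5speech.txt). That works fine. Then it returns: The total of all integers is: 0

Then I get an AttributeError saying that a 'list' object has no attribute 'split' on line 73. Is there a scoping problem with myList?

At one point while I was programming, everything actually ran without any errors. However, what was returned was: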

    The total of all integers is: 0
    The 5 most common words are:
    The 5 least common words are:
    
    Press Enter to exit.
    

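So, assuming I fix the error, I'm sure I will still get the blank output. What in the world am I doing wrong? I've looked at plenty of topics on Stack Overflow and tried different approaches, but I either get errors or no values are returned. What should I be looking at so I can fix my code?

Thank you all so much!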

2 Answers:

Answer 0: (score: 0)

    for line in text_file.readline():
        myList = ...

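You are reading a single line. The for loop iterates over the characters in that line, and every pass through the loop overwrites myList. Splitting a single character returns a list containing that single character.

What myList ends up as is a one-character list holding the last character of the first line of the text.

Then: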
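The effect can be reproduced without a file. This is a minimal sketch in which the `line` value is an illustrative stand-in for whatever readline() returns; it is not taken from the question's data.

```python
import string

# Illustrative stand-in for the single line that readline() returns.
line = "Four score and 7 years\n"

my_list = None
for ch in line:  # iterating over a str yields one character at a time
    # split() treats its argument as ONE separator string, not as a set of
    # characters, so a single character can never contain it.
    my_list = ch.split(string.punctuation + string.whitespace)

print(my_list)  # ['\n'] : only the last character survives
```

Because myList is reassigned on every pass, nothing from the earlier characters is kept.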

    for word in myList:
       test = word.isdigit()

This runs, but the only "word" in myList is "s", so it finds no digits and says so.

Then:

t5count = Counter(myList.split())

And you can't split a list. (If the list were correct, you could pass it directly to Counter.)
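As a rough illustration of passing a list straight to Counter, here is a sketch with a made-up `words` list. Note also that Counter provides most_common() but no least_common() method, so the least common entries have to come from the tail of most_common(); the slice below is one way to do that.

```python
from collections import Counter

# Illustrative word list; in the question this would be the corrected myList.
words = ["the", "people", "the", "nation", "the", "people", "war"]

counts = Counter(words)  # a list can be passed straight to Counter

top2 = counts.most_common(2)
# Counter has no least_common(); slice the tail of most_common() instead.
least2 = counts.most_common()[:-3:-1]

print([w for w, _ in top2])    # ['the', 'people']
print([w for w, _ in least2])  # two of the words that occur once
```

For the assignment's five words, the same slice would be [:-6:-1].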

You need to loop over every line in the file with for line in text_file:, start myList as an empty list, and build it up with myList += line.split(...) or myList.extend(line.split(...)).
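Put together, the loop might look something like the sketch below. The names `count_words`, `REMOVE`, and `sample` are hypothetical, and the sketch assumes the assignment's removal expression was meant to be .replace(' ', '') so that spaces survive as word separators. It takes any iterable of lines, so an open file object or lines collected with readline() could be passed in.

```python
import string
from collections import Counter

# Characters to strip: punctuation plus whitespace, except the space that
# separates words (assumes the assignment meant .replace(' ', '')).
REMOVE = (string.punctuation + string.whitespace).replace(" ", "")

def count_words(lines):
    """Return (word counts, sum of integer words) for an iterable of lines."""
    table = str.maketrans("", "", REMOVE)
    words = []          # start empty, then add to it once per line
    total = 0
    for line in lines:
        for word in line.translate(table).split(" "):
            if not word:
                continue             # skip empties left by repeated spaces
            if word.isdigit():
                total += int(word)   # sum the value, not the occurrence
            else:
                words.append(word)
    return Counter(words), total

sample = ["Given November 19, 1863\n", "that nation, or any nation\n"]
counts, total = count_words(sample)
print(total)                  # 1882
print(counts.most_common(1))  # [('nation', 2)]
```

Note that the question's code adds 1 for each integer it finds, which counts integers; the assignment asks for their sum, hence int(word) here.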

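Answer 1: (score: 0)

Since this is homework, rather than an answer I'll give you some hints. When you run into problems with a program, you need to either:

  1. Use a debugger to step through the program, making sure that at every stage you have what you expect (both type and value). Or
  2. Add print statements after operations to make sure you have what you expect (for example, after reading the text file, print out myList to make sure it is a list and contains all the lines you expect).
  3. With either method, you can check the type of myList before you call split on it.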
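As a sketch of hints 2 and 3, assuming a hypothetical helper named `debug_split` that repeats the split from the question and prints what it produced:

```python
import string

def debug_split(line):
    """Split a line the way the question does, printing type and value."""
    result = line.split(string.punctuation + string.whitespace)
    # Check both the type and the contents right after the operation.
    print(type(result).__name__, repr(result))
    return result

parts = debug_split("Four score and seven years ago\n")
# A list has no split() method, which is what the AttributeError reports.
print(isinstance(parts, list), hasattr(parts, "split"))  # True False
```

The printed value shows the line coming back whole in a one-element list, which points at the separator argument as the thing to look at next.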