List index out of range when decompressing text in Python

Time: 2017-03-31 22:27:31

Tags: python list syntax

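The code I currently have is shown below. It first asks the user to enter a sentence. The program then finds the position of each word in the sentence and splits the sentence into a list of individual words. It then removes any duplicate words so that the list contains only unique words. Next, the program saves (using JSON) the positions of the words in the sentence (e.g. 1,2,3,4,1,1,2,3,5) and the unique words to a separate file, which the user can name. The next part of the program tries to read the unique words back from that file and recreate the original sentence from the word positions and the unique words. I know this stage works, because I have tested it on its own. However, when I run the program now, I keep getting this error message: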

  File "/Users/Sid/Desktop/Task3New.py", line 70, in OutputDecompressed
    decompression.append(orgwords[i])
IndexError: list index out of range

I don't know why this isn't working. Would anyone care to help? All help is appreciated, thanks.

import json
import os.path

def InputSentence():
    global sentence
    global words
    sentence = input("Enter a sentence: ")
    words = sentence.split(' ')

def Validation():
    if sentence == (""):
        print ("No sentence was inputted. \nPlease input a sentence...")
        Error()

def Uniquewords():
    print ("Words in the sentence: " + str(words))
    for i in range(len(words)):
        if words[i] not in unilist:
            unilist.append(words[i])
    print ("Unique words: " + str(unilist))

def PosText():
    global find
    global pos
    find = dict((sentence, words.index(sentence)+1) for sentence in list(words))
    pos = (list(map(lambda sentence: find [sentence], words)))
    return (pos)

def OutputText():
    print ("The positions of the word(s) in the sentence are: " + str(pos))

def SaveFile():
    filename = input("We are now going to save the contents of this program into a new file. \nWhat would you like to call the new file? ")
    newfile = open((filename)+'.txt', 'w')
    json.dump([unilist, pos], newfile)
    newfile.close


def InputFile():
    global compfilename
    compfilename = input("Please enter an existing compressed file to be  decompressed: ")

def Validation2():
    if compfilename == (""):
        print ("Nothing was entered for the filename. Please re-enter a  valid filename.")
        Error()
    if os.path.exists(filename + ".txt") == False:
        print ("No such file exists. Please enter a valid existing file.")
        Error()

def OutputDecompressed():
    newfile = open((compfilename)+'.txt', 'r')
    saveddata = json.load(newfile)
    orgpos = saveddata[1]
    orgwords = saveddata[0]
    print ("Unique words in the original sentence: " + str(orgwords) +  "\nPosition of words in the sentence: " + str(orgpos))
    decompression = []
    prev = orgpos[0]
    x=0
    #decomposing the index locations
    for cur in range(1,len(orgpos)):
        if (prev == orgpos[cur]): x+= 1
        else:
            orgpos[cur]-=x
            x=0  
        prev = orgpos[cur]
    #Getting the output
    for i in orgpos:
        decompression.append(orgwords[i-1])
    finalsentence = (' '.join(decompression))
    print ("Original sentence from file: " + finalsentence)


def Error():
    MainCompression()


def MainCompression():
    global unilist
    unilist = []
    InputSentence()
    Uniquewords()
    PosText()
    OutputText()
    SaveFile()
    InputFile()
    Validation()
    OutputDecompressed()

MainCompression()

1 answer:

Answer 0 (score: 0)

The problem is that you use indices into words as indices into unilist / orgwords.

Let's take a look at the problem:

def PosText():
    global find
    global pos
    find = dict((sentence, words.index(sentence)+1) for sentence in list(words))
    pos = (list(map(lambda sentence: find [sentence], words)))
    return (pos)

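Here find maps each word to the position of its first occurrence in the list words. (BTW, why is the variable that iterates over words called sentence?) Then, for each word, that position is stored in a new list. The whole process can be expressed in one line: pos = [words.index(word)+1 for word in words]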
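As a quick check of that claim, the sketch below runs both formulations on a sample sentence and compares the results. The sentence is an assumption chosen for illustration, and the loop variable is named word rather than sentence.

```python
# Sample input (an assumption, not taken from the question).
words = "the cat and the dog".split(' ')

# Two-step version from the question: build a word -> position map,
# then look every word up in that map.
find = dict((word, words.index(word) + 1) for word in words)
pos_two_step = [find[word] for word in words]

# One-line version suggested in the answer.
pos_one_line = [words.index(word) + 1 for word in words]

print(pos_two_step)
print(pos_one_line)
print(pos_two_step == pos_one_line)
```

Both lists hold 1-based positions in words, so the last word of this five-word sentence gets position 5.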

Now, when you look at OutputDecompressed, you see:

for i in orgpos:
    decompression.append(orgwords[i-1])

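Here orgpos is pos and orgwords is the list of unique words (unilist). Each stored index is now used to get the original word back, but this is flawed: orgpos contains indices into words, yet they are used to access orgwords. As soon as a word is repeated, orgwords is shorter than words, so a position that is valid for words can lie past the end of orgwords.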
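The mismatch can be reproduced in a few lines. This is a minimal sketch, assuming a sample sentence with one repeated word; it leaves out the file handling and the adjustment loop from the question and only shows the indexing step.

```python
# Sample input (an assumption): "the" occurs twice.
words = "the cat and the dog".split(' ')

# Unique words, built the same way as in Uniquewords().
unilist = []
for word in words:
    if word not in unilist:
        unilist.append(word)

# Positions computed against `words`, as in the original PosText().
pos = [words.index(word) + 1 for word in words]

print(len(words), len(unilist))  # words is longer than unilist
print(pos)

# Using positions from `words` to index `unilist` fails for "dog":
# it sits at position 5 in words, but unilist only has 4 entries.
error = None
try:
    rebuilt = [unilist[i - 1] for i in pos]
except IndexError as exc:
    error = exc
    print("IndexError:", exc)
```

Sentences without repeated words do not trigger the error, because words and unilist then have the same length. That may be why testing the decompression step on its own appeared to work.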

The solution to this problem is to rewrite PosText and part of OutputDecompressed:

def PosText():
    global pos
    pos = [unilist.index(word)+1 for word in words]
    return pos

def OutputDecompressed():
    newfile = open((compfilename)+'.txt', 'r')
    saveddata = json.load(newfile)
    orgpos = saveddata[1]
    orgwords = saveddata[0]
    print ("Unique words in the original sentence: " + str(orgwords) +  "\nPosition of words in the sentence: " + str(orgpos))
    decompression = []
    # I could not figure out what this middle part was doing, so I left it out
    for i in orgpos:
        decompression.append(orgwords[i-1])
    finalsentence = (' '.join(decompression))
    print ("Original sentence from file: " + finalsentence)
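To see the corrected indexing survive the save and load steps, here is a small round-trip sketch. It assumes a sample sentence and uses an in-memory io.StringIO buffer in place of the .txt file, so nothing is written to disk.

```python
import io
import json

# Sample input (an assumption).
words = "the cat and the dog".split(' ')

unilist = []
for word in words:
    if word not in unilist:
        unilist.append(word)

# Corrected PosText(): positions now refer to unilist, not words.
pos = [unilist.index(word) + 1 for word in words]

# Stand-in for the file on disk (an assumption made to keep the sketch self-contained).
buffer = io.StringIO()
json.dump([unilist, pos], buffer)

# Read the data back, as OutputDecompressed() does.
buffer.seek(0)
orgwords, orgpos = json.load(buffer)

finalsentence = ' '.join(orgwords[i - 1] for i in orgpos)
print(finalsentence)
```

With a real file, a `with open(...) as f:` block closes the file automatically. Note that `newfile.close` in SaveFile is missing its parentheses, so the method is never actually called there.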

Some comments on your code:

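  1. Validation() should be called after InputSentence().
  2. After InputFile(), you have to call Validation2() instead of Validation().
  3. In Validation2(), it should be compfilename instead of filename.
  4. You should use parameters instead of global variables. This makes it clearer what each function is supposed to do. For example, Uniquewords could accept a list of words and return the list of unique words. It also makes the program easier to debug, because each function can be tested on its own, which is currently not possible.
  5. To make your code easier for other Python programmers to read, you could follow the Python coding style specified in PEP 8.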
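Points 4 and 5 could look roughly like the sketch below. The function names (unique_words, word_positions, decompress) and the sample sentence are hypothetical and chosen for illustration; user input and file handling are left out so that each function can be tested in isolation.

```python
def unique_words(words):
    """Return the words in order of first appearance, without duplicates."""
    unique = []
    for word in words:
        if word not in unique:
            unique.append(word)
    return unique


def word_positions(words, unique):
    """Return the 1-based position in `unique` of every word in `words`."""
    return [unique.index(word) + 1 for word in words]


def decompress(unique, positions):
    """Rebuild the sentence from the unique words and their positions."""
    return ' '.join(unique[i - 1] for i in positions)


# Sample input (an assumption).
sentence = "ask not what your country can do for you ask what you can do for your country"
words = sentence.split(' ')

unique = unique_words(words)
positions = word_positions(words, unique)

print(unique)
print(positions)
print(decompress(unique, positions) == sentence)
```

Because no function depends on global state, each one can be called with a hand-written list and its return value checked directly.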