Question

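I work in the office of a professor at my university, and he has assigned me to read through an entire class's papers to try to catch people who plagiarized. So I decided to write a program in Python that looks at every six-word phrase in all of the papers and compares them to see whether any two papers share more than 200 matching phrases. The six-word phrases would be, for example...

"I ate a potato and it was good." would become:

I ate a potato and it

ate a potato and it was

a potato and it was good.

My code currently is: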

import re
import glob
import os

def ReadFile(Filename):
    try:
        F = open(Filename)
        F2=F.read()
    except IOError:
        print("Can't open file:",Filename)
        return []
    F3=re.sub("[^a-z ]","",F2.lower())
    return F3
def listEm(BigString):
    list1=[]
    list1.extend(BigString.split(' '))
    return list1

Name = input ('Name of folder? ')
Name2=[]
Name3=os.chdir("Documents")
for file in glob.glob("*txt"):
    Name2.append(file)

for file in Name2:
    index1=0
    index2=6
    new_list=[]
    Words = ReadFile(file)
    Words2= listEm(Words)
    while index2 <= len(Words2):
        new_list.append(Words2[index1:index2])
        index1 += 1
        index2 += 1

    del Name2[0]  ##Deletes first file from list of files so program wont compare the same file to itself.

    for file2 in Name2:
        index=0
        index1=6
        new_list2=[]
        Words1= ReadFile(file2)
        Words3= listEm(Words)
        while index1 <= len(Words3):
            new_list2.append(Words3[index:index1])  ##memory error
            index+=1
            index2+=1
    results=[]
    for element in new_list:
        if element in new_list2:
            results.append(element)
    if len(results) >= 200:
        print("You may want to examine the following files:",file1,"and",file2)

I am getting a memory error on this line:

new_list2.append(Words3[index:index1])

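For some reason I can't figure out what I am doing wrong. In my short one-semester programming career I have never received a memory error before. Thanks for your help.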

Answer 1

You probably meant to increment index1 rather than index2 inside the inner while loop and made a typo. Change index2+=1 to index1+=1.

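Right now you are in an infinite loop: index1 <= len(Words3) is always true because index1 never changes, so the loop keeps appending to new_list2 until you run out of memory.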
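As a minimal sketch of what the corrected loop looks like in isolation (the function name sliding_windows, the variable names start and end, and the sample sentence are made up for illustration and are not from your code):

```python
def sliding_windows(words, size=6):
    """Return every run of `size` consecutive words as a list of lists."""
    windows = []
    start = 0
    end = size
    while end <= len(words):
        windows.append(words[start:end])
        start += 1
        # The loop condition tests `end`, so `end` is the variable that
        # must advance. Incrementing a different variable here is what
        # produces the infinite loop and, eventually, the MemoryError.
        end += 1
    return windows


words = "i ate a potato and it was good".split()
for window in sliding_windows(words):
    print(" ".join(window))
```

With the eight-word sample sentence, the loop stops after three windows because end grows past len(words).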

The moral of this bug is to use better variable names instead of just appending digits to existing ones. That makes a typo like this one much less likely.
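To illustrate the naming point, here is one possible sketch of the comparison step with descriptive names. The function names six_word_phrases and count_shared_phrases are hypothetical, and it swaps in a different technique from your code: it stores the phrases as tuples in a set, so repeated phrases within one paper are counted once and the resulting number can differ from what your list-based check would give.

```python
def six_word_phrases(words, phrase_length=6):
    """Return the set of all consecutive phrases of `phrase_length` words."""
    last_start = len(words) - phrase_length + 1
    # Tuples are hashable, so they can be stored in a set (lists cannot).
    return {tuple(words[start:start + phrase_length])
            for start in range(last_start)}


def count_shared_phrases(first_words, second_words):
    """Count the distinct six-word phrases that appear in both word lists."""
    shared = six_word_phrases(first_words) & six_word_phrases(second_words)
    return len(shared)


first_paper = "i ate a potato and it was good".split()
second_paper = "yesterday i ate a potato and it was cold".split()
print(count_shared_phrases(first_paper, second_paper))
```

With names like first_words and second_words there is no index1/index2 pair to mix up, and using range removes the manual counters entirely.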
