edX Problem Set 5 - Caesar cipher

Time: 2013-05-06 18:58:49

Tags: python

I am working on edX Problem Set 5 and have run into a problem in my code:

# 6.00x Problem Set 5
#
# Part 1 - HAIL CAESAR!

import string
import random

WORDLIST_FILENAME = "words.txt"

# -----------------------------------
# Helper code
# (you don't need to understand this helper code)
def loadWords():
    """
    Returns a list of valid words. Words are strings of lowercase letters.

    Depending on the size of the word list, this function may
    take a while to finish.
    """
    print "Loading word list from file..."
    inFile = open(WORDLIST_FILENAME, 'r')
    wordList = inFile.read().split()
    print "  ", len(wordList), "words loaded."
    return wordList

def isWord(wordList, word):
    """
    Determines if word is a valid word.

    wordList: list of words in the dictionary.
    word: a possible word.
    returns True if word is in wordList.

    Example:
    >>> isWord(wordList, 'bat') returns
    True
    >>> isWord(wordList, 'asdf') returns
    False
    """
    word = word.lower()
    word = word.strip(" !@#$%^&*()-_+={}[]|\\:;'<>?,./\"")
    return word in wordList

def randomWord(wordList):
    """
    Returns a random word.

    wordList: list of words  
    returns: a word from wordList at random
    """
    return random.choice(wordList)

def randomString(wordList, n):
    """
    Returns a string containing n random words from wordList

    wordList: list of words
    returns: a string of random words separated by spaces.
    """
    return " ".join([randomWord(wordList) for _ in range(n)])

def randomScrambled(wordList, n):
    """
    Generates a test string by generating an n-word random string
    and encrypting it with a sequence of random shifts.

    wordList: list of words
    n: number of random words to generate and scramble
    returns: a scrambled string of n random words

    NOTE:
    This function will ONLY work once you have completed your
    implementation of applyShifts!
    """
    s = randomString(wordList, n) + " "
    shifts = [(i, random.randint(0, 25)) for i in range(len(s)) if s[i-1] == ' ']
    return applyShifts(s, shifts)[:-1]

def getStoryString():
    """
    Returns a story in encrypted text.
    """
    return open("story.txt", "r").read()


# (end of helper code)
# -----------------------------------


#
# Problem 1: Encryption
#
def buildCoder(shift):
    """
    Returns a dict that can apply a Caesar cipher to a letter.
    The cipher is defined by the shift value. Ignores non-letter characters
    like punctuation, numbers and spaces.

    shift: 0 <= int < 26
    returns: dict
    """
    dict={}
    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    for l in range(len(upper)):
        dict[upper[l]] = upper[(l+shift)%len(upper)]
    for l in range(len(lower)):
        dict[lower[l]] = lower[(l+shift)%len(lower)]
    return dict


def applyCoder(text, coder):
    """
    Applies the coder to the text. Returns the encoded text.

    text: string
    coder: dict with mappings of characters to shifted characters
    returns: text after mapping coder chars to original text
    """
    new_text=''
    for l in text:
        if not(l in string.punctuation or l == ' ' or l in str(range(10))):
           new_text += coder[l]
        else:
           new_text += l            
    return new_text   

def applyShift(text, shift):
    """
    Given a text, returns a new text Caesar shifted by the given shift
    offset. Lower case letters should remain lower case, upper case
    letters should remain upper case, and all other punctuation should
    stay as it is.

    text: string to apply the shift to
    shift: amount to shift the text (0 <= int < 26)
    returns: text after being shifted by specified amount.
    """
    ### TODO.
    ### HINT: This is a wrapper function.
    coder=buildCoder(shift)
    return applyCoder(text,coder)

#
# Problem 2: Decryption
#
def findBestShift(wordList, text):
    """
    Finds a shift key that can decrypt the encoded text.

    text: string
    returns: 0 <= int < 26
    """
    ### TODO
    wordsFound=0
    bestShift=0

    for i in range(26):
        currentMatch=0
        encrypted=applyShift(text,i)
        lista=encrypted.split(' ')
        for w in lista: 
            if isWord(wordList,w):
                currentMatch+=1
        if currentMatch>wordsFound:
                currentMatch=wordsFound
                bestShift=i
    return bestShift

def decryptStory():
    """
    Using the methods you created in this problem set,
    decrypt the story given by the function getStoryString().
    Use the functions getStoryString and loadWords to get the
    raw data you need.

    returns: string - story in plain text
    """
    text = getStoryString()
    bestMatch = findBestShift(loadWords(), text)
    return applyShift(text, bestMatch)

#
# Build data structures used for entire session and run encryption
#

if __name__ == '__main__':
    wordList = loadWords()
    decryptStory()

s = 'Pmttw, ewztl!'
print findBestShift(wordList, s)

print decryptStory()

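The problem is that the individual pieces of the program work on their own, but decryptStory does not. What is wrong with this code?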

1 answer:

Answer 0 (score: 1)

Your first problem is that applyCoder does not work as written.

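buildCoder builds a dict that only has entries for letters. But applyCoder tries to look up anything that is not in string.punctuation, is not equal to ' ', and is not in str(range(10)). I think you wanted string.digits there (because str(range(10)) is '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'), but it will still blow up if you give it a newline, which a file named story.txt almost certainly has.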
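To see both failure modes concretely, here is a minimal sketch that reproduces the question's filter. It is written so that it also runs on Python 3, which means str(list(range(10))) stands in for Python 2's str(range(10)). The name applyCoderAsWritten and the sample strings are made up for illustration.

```python
import string

def buildCoder(shift):
    """Map each ASCII letter to its Caesar-shifted counterpart, as in the question."""
    coder = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, ch in enumerate(alphabet):
            coder[ch] = alphabet[(i + shift) % len(alphabet)]
    return coder

def applyCoderAsWritten(text, coder):
    """The question's filter, reproduced to show where it breaks (hypothetical name)."""
    # In Python 2, str(range(10)) is the text of a list, not a string of digits.
    # str(list(range(10))) gives the same text on Python 2 and Python 3.
    list_text = str(list(range(10)))
    new_text = ''
    for l in text:
        if not (l in string.punctuation or l == ' ' or l in list_text):
            new_text += coder[l]  # KeyError for any non-letter that slips through
        else:
            new_text += l
    return new_text

print(repr(str(list(range(10)))))
print(applyCoderAsWritten("Hello, world! 42", buildCoder(8)))
try:
    applyCoderAsWritten("line one\nline two", buildCoder(8))
except KeyError as err:
    print("KeyError on %r" % err.args[0])
```

Digits happen to pass only because they appear inside the list's text. The newline is in none of the three tests, so it reaches the dict lookup and fails.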

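The simple fix is to check l in string.ascii_uppercase or l in string.ascii_lowercase. But there is an even better fix: instead of trying to express the same filter in reverse in some complicated way, or repeating yourself, just try the lookup:

for l in text:
    new_text += coder.get(l, l)

This returns coder[l] if l is in the map, and the default value l if it is not.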
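Put together, the dict.get approach could look like the sketch below. buildCoder follows the question's logic; the sample strings are only for illustration.

```python
import string

def buildCoder(shift):
    """Map each ASCII letter to its Caesar-shifted counterpart, as in the question."""
    coder = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, ch in enumerate(alphabet):
            coder[ch] = alphabet[(i + shift) % len(alphabet)]
    return coder

def applyCoder(text, coder):
    """Encode letters through coder; pass every other character through unchanged."""
    new_text = ''
    for l in text:
        # dict.get returns the second argument when the key is missing,
        # so spaces, digits, punctuation and newlines need no special case.
        new_text += coder.get(l, l)
    return new_text

print(applyCoder("Hello, world!\n42", buildCoder(8)))
print(applyCoder("Pmttw, ewztl!", buildCoder(18)))
```

Because the lookup itself decides what counts as a letter, the filter cannot drift out of sync with what buildCoder puts in the dict.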


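With that fixed, the function runs and successfully outputs something. But it does not output the right thing. Why?

Well, look at this:

if currentMatch>wordsFound:
    currentMatch=wordsFound
    bestShift=i

So, each time you find a better match than the initial wordsFound of 0, you... throw away the currentMatch value and leave wordsFound untouched. Surely you want wordsFound = currentMatch, not the other way around, right?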
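With the assignment turned around, findBestShift could look like the sketch below. The two-word list is a stand-in for words.txt, chosen only so the example runs without any files; applyShift is condensed but should behave like the question's version once the lookup uses dict.get.

```python
import string

def applyShift(text, shift):
    """Caesar-shift the letters in text; leave everything else unchanged."""
    coder = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, ch in enumerate(alphabet):
            coder[ch] = alphabet[(i + shift) % len(alphabet)]
    return "".join(coder.get(ch, ch) for ch in text)

def isWord(wordList, word):
    """Same check as the helper code in the question."""
    word = word.lower()
    word = word.strip(" !@#$%^&*()-_+={}[]|\\:;'<>?,./\"")
    return word in wordList

def findBestShift(wordList, text):
    """Return the shift (0 <= int < 26) that yields the most valid words."""
    wordsFound = 0
    bestShift = 0
    for i in range(26):
        currentMatch = 0
        for w in applyShift(text, i).split(' '):
            if isWord(wordList, w):
                currentMatch += 1
        if currentMatch > wordsFound:
            wordsFound = currentMatch  # remember the best count seen so far
            bestShift = i
    return bestShift

wordList = ["hello", "world"]  # stand-in for loadWords()
print(findBestShift(wordList, 'Pmttw, ewztl!'))
```

Since the comparison is strict, the first shift that reaches the highest count wins when several shifts tie.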


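With both of those problems fixed:

$ ln -s /usr/share/dict/words words.txt
$ echo -e "This is a test.\n\nIs it good enough? Let's see.\n" | rot13 > story.txt
$ python caesar.py
Loading word list from file...
   235886 words loaded.
Loading word list from file...
   235886 words loaded.
18
Loading word list from file...
   235886 words loaded.
This is a test. Here's some text. Is it enough? Let's see.

So, it is obviously doing some unnecessary repeated work somewhere, but otherwise it works.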
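One visible piece of repeated work is the word list: the question's script calls loadWords() once under __main__ and again inside every decryptStory() call, which matches the three "Loading word list" messages above. A sketch of one way around that is to load once and pass the data in. The two-argument decryptStory below is my own variant, not part of the problem set, and the in-memory word list and story are stand-ins so the example runs without files.

```python
import string

def applyShift(text, shift):
    """Caesar-shift the letters in text; leave everything else unchanged."""
    coder = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, ch in enumerate(alphabet):
            coder[ch] = alphabet[(i + shift) % len(alphabet)]
    return "".join(coder.get(ch, ch) for ch in text)

def isWord(wordList, word):
    word = word.lower().strip(" !@#$%^&*()-_+={}[]|\\:;'<>?,./\"")
    return word in wordList

def findBestShift(wordList, text):
    wordsFound = 0
    bestShift = 0
    for i in range(26):
        words = applyShift(text, i).split(' ')
        currentMatch = sum(1 for w in words if isWord(wordList, w))
        if currentMatch > wordsFound:
            wordsFound = currentMatch
            bestShift = i
    return bestShift

def decryptStory(wordList, text):
    """Hypothetical variant: receives its data instead of reloading it."""
    return applyShift(text, findBestShift(wordList, text))

# Stand-ins for loadWords() and getStoryString(); load once, reuse everywhere.
wordList = ["this", "is", "a", "test"]
story = applyShift("This is a test.", 13)

print(decryptStory(wordList, story))
```

If the grader expects decryptStory() with no arguments, the same idea still applies inside it: call loadWords() once and keep the result in a local variable.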


Learning how to debug problems like this is probably more important than getting this one solved, so here are some suggestions.

I found the problem by adding some extra print statements. The important one is this:

if currentMatch>wordsFound:
    print i, currentMatch, wordsFound
    currentMatch=wordsFound
    bestShift=i

You will see that wordsFound never changes from 0, and that it picks a shift with 1 match as the best even after finding one with 18 matches. Clearly, something is wrong.
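The same symptom can be reproduced on a tiny input. The sketch below keeps the bug on purpose and records the three values that the debugging print would show. The word list and text are invented so that one shift yields two words and a later shift yields one; the name findBestShiftBuggy and the trace parameter are hypothetical.

```python
import string

def applyShift(text, shift):
    """Caesar-shift the letters in text; leave everything else unchanged."""
    coder = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, ch in enumerate(alphabet):
            coder[ch] = alphabet[(i + shift) % len(alphabet)]
    return "".join(coder.get(ch, ch) for ch in text)

def isWord(wordList, word):
    word = word.lower().strip(" !@#$%^&*()-_+={}[]|\\:;'<>?,./\"")
    return word in wordList

def findBestShiftBuggy(wordList, text, trace):
    """The question's loop with its bug intact, plus a trace of each 'improvement'."""
    wordsFound = 0
    bestShift = 0
    for i in range(26):
        currentMatch = 0
        for w in applyShift(text, i).split(' '):
            if isWord(wordList, w):
                currentMatch += 1
        if currentMatch > wordsFound:
            trace.append((i, currentMatch, wordsFound))  # what the print shows
            currentMatch = wordsFound  # BUG kept on purpose: wordsFound stays 0
            bestShift = i
    return bestShift

wordList = ["cold", "cheer", "frog"]   # made-up word list
text = applyShift("cold cheer", 8)     # made-up ciphertext
trace = []
best = findBestShiftBuggy(wordList, text, trace)
for row in trace:
    print("shift=%d currentMatch=%d wordsFound=%d" % row)
print("bestShift=%d" % best)
```

Every recorded row has wordsFound equal to 0, and the function returns the last shift with any match at all instead of the one with the most matches.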

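But I did not know in advance where to put that one. I added a dozen or so print lines all over the place. That is the simplest way to debug simple code.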

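For more complex code, where there is too much to print, you may want to write to a log file (ideally using logging) that you can parse after the fact. Or, better, use simpler input data and run it in a debugger and/or an interactive visualizer.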
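As a rough sketch of the logging route, the loop can emit one debug record per shift instead of printing. The logger name "caesar" and the message format are arbitrary choices of mine; passing filename="..." to logging.basicConfig would send the records to a file instead of stderr.

```python
import logging
import string

log = logging.getLogger("caesar")  # arbitrary logger name

def applyShift(text, shift):
    """Caesar-shift the letters in text; leave everything else unchanged."""
    coder = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, ch in enumerate(alphabet):
            coder[ch] = alphabet[(i + shift) % len(alphabet)]
    return "".join(coder.get(ch, ch) for ch in text)

def isWord(wordList, word):
    word = word.lower().strip(" !@#$%^&*()-_+={}[]|\\:;'<>?,./\"")
    return word in wordList

def findBestShift(wordList, text):
    wordsFound = 0
    bestShift = 0
    for i in range(26):
        words = applyShift(text, i).split(' ')
        currentMatch = sum(1 for w in words if isWord(wordList, w))
        # One record per shift; cheap to leave in place and easy to grep later.
        log.debug("shift=%d matches=%d best_so_far=%d", i, currentMatch, wordsFound)
        if currentMatch > wordsFound:
            wordsFound = currentMatch
            bestShift = i
    return bestShift

if __name__ == "__main__":
    # Add filename="caesar_debug.log" here to write to a file instead of stderr.
    logging.basicConfig(level=logging.DEBUG)
    print(findBestShift(["hello", "world"], 'Pmttw, ewztl!'))
```

Unlike print calls, these records can be silenced by raising the level rather than by deleting lines.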

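Or, better still, strip things down until you find the part that does not work. For example, if you know that shift 18 should be better than shift 12, try applying shifts 12 and 18 separately and see what each one returns.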
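A sketch of that kind of isolated check, assuming applyShift is the function being probed and using the sample string from the question. countWords is a hypothetical helper added for the comparison, and the two-word list is a stand-in for words.txt.

```python
import string

def applyShift(text, shift):
    """Caesar-shift the letters in text; leave everything else unchanged."""
    coder = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, ch in enumerate(alphabet):
            coder[ch] = alphabet[(i + shift) % len(alphabet)]
    return "".join(coder.get(ch, ch) for ch in text)

def isWord(wordList, word):
    word = word.lower().strip(" !@#$%^&*()-_+={}[]|\\:;'<>?,./\"")
    return word in wordList

def countWords(wordList, text):
    """Hypothetical helper: how many space-separated tokens are valid words."""
    return sum(1 for w in text.split(' ') if isWord(wordList, w))

wordList = ["hello", "world"]  # stand-in for loadWords()
s = 'Pmttw, ewztl!'
for shift in (12, 18):
    shifted = applyShift(s, shift)
    print("%2d %r %d" % (shift, shifted, countWords(wordList, shifted)))
```

If the two shifts produce the text and counts you expect here, the shifting and word checks are fine and the fault has to be in the code that compares the counts.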

Even if these steps do not solve the problem, they will give you a better question to post on SO.