Question

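I have a string encrypted with a monoalphabetic substitution cipher, and I want to try breaking it with English frequency analysis (not for the sake of solving it, but to improve my programming skills).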

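So far, I have the letter frequencies of the string in a sorted list of tuples, like this: [('V', freqV), ('D', freqD)...] (note that in this case V appears more often than any other letter, so freqV is the largest number in any tuple of the list), and I have English's letter frequencies represented the same way.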
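For reference, a sorted list of this shape can be built with collections.Counter. This is only a minimal sketch: the helper name sorted_letter_freq is made up for illustration, and it assumes that only the letters A-Z should be counted.

```python
from collections import Counter
import string


def sorted_letter_freq(text):
    """Return [(letter, count), ...] sorted from most to least frequent."""
    # Count uppercase A-Z only, so spaces and punctuation are ignored.
    counts = Counter(ch for ch in text.upper() if ch in string.ascii_uppercase)
    # most_common() with no argument returns every entry, highest count first.
    return counts.most_common()


print(sorted_letter_freq("VVV DD V x"))  # [('V', 4), ('D', 2), ('X', 1)]
```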

From this state, how do I replace the letters correctly?

I have already tried the simple, straightforward solution:

new_text = str(cipher_str)

for i in range(26):  # 26 is the length of both lists
    new_text = new_text.replace(sorted_cipher_freq[i][0], sorted_eng_freq[i][0])

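But it doesn't work (one reason is that the replacement character is sometimes the same as a character that still has to be decrypted. For example, ap = an, so the letter a is the same in both the encrypted and the decrypted text, but a should be p.)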
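A tiny reproduction of that failure, as a sketch with a made-up two-letter key (A should become B, and B should become C): because each replace() call runs over the output of the previous one, letters that were already substituted get substituted again.

```python
# Hypothetical key for illustration: cipher A -> plain B, cipher B -> plain C.
pairs = [("A", "B"), ("B", "C")]

cipher_str = "AB"
new_text = cipher_str
for cipher_letter, plain_letter in pairs:
    # Each call rewrites the *already modified* string.
    new_text = new_text.replace(cipher_letter, plain_letter)

print(new_text)  # CC, although the intended result is BC
```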

How can I solve this?

Answer 1

You would take your most frequent letters and match them against the N most frequent English letters ... something like this

import string
from collections import Counter

en_freq = "ETAOINSRHDLUCMFYWGPBVKXQJZ"  # from http://www.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html
encoded_text = open("encrypted.txt").read().upper()  # normalize to uppercase
# count letters only, then take them from most to least frequent (at most 26)
letter_counts = Counter(ch for ch in encoded_text if ch in string.ascii_uppercase)
my_frequency = "".join(letter for letter, count in letter_counts.most_common(len(en_freq)))

# map our most frequent letters to the expected English frequencies
# (str.maketrans needs two strings of equal length, so en_freq is trimmed to match)
translation_table = str.maketrans(my_frequency, en_freq[:len(my_frequency)])

print(encoded_text.translate(translation_table))  # apply the translation_table
# note that you need a fairly large amount of text for this to work very well ... and you will likely still need to manually translate some parts

Note that there may be some small bugs, since I haven't actually run this and don't have any target text to decode.
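Since the rank-for-rank guess will usually get some letters wrong, it can help to keep the key as a dict so that individual guesses can be corrected by hand. The following is a rough sketch, not a tested solver: the helper names initial_key, swap_plain and apply_key are hypothetical, and the sample text is invented.

```python
from collections import Counter
import string

# English letters from most to least frequent (assumed ordering).
EN_FREQ = "ETAOINSRHDLUCMFYWGPBVKXQJZ"


def initial_key(ciphertext, en_freq=EN_FREQ):
    """Guess a cipher -> plain mapping by pairing letters rank for rank."""
    counts = Counter(ch for ch in ciphertext.upper() if ch in string.ascii_uppercase)
    ranked = [letter for letter, _ in counts.most_common()]
    return dict(zip(ranked, en_freq))  # zip stops at the shorter sequence


def swap_plain(key, plain_a, plain_b):
    """Return a copy of key with two plaintext guesses exchanged."""
    swap = {plain_a: plain_b, plain_b: plain_a}
    return {cipher: swap.get(plain, plain) for cipher, plain in key.items()}


def apply_key(ciphertext, key):
    """Translate every character in one pass; unmapped characters are kept."""
    return ciphertext.upper().translate(str.maketrans(key))


sample = "XXXX YYY ZZ W"
key = initial_key(sample)
print(apply_key(sample, key))    # EEEE TTT AA O
key = swap_plain(key, "T", "A")  # correct one guess by hand
print(apply_key(sample, key))    # EEEE AAA TT O
```

Swapping two plaintext letters keeps the key a one-to-one mapping, which a substitution cipher requires.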

Answer 2

You could try doing it character by character:

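sorted_cipher_freq = [('V', 25), ('D', 10)]  # sample data, most frequent first
simple_cipher_freq = [letter for letter, freq in sorted_cipher_freq]

en_freq = "ETAOINSRHDLUCMFYWGPBVKXQJZ"

# cipher_str is the encrypted string from the question
new_text = ''
for char in cipher_str:
    # every character is read from the original string, so one substitution cannot overwrite another
    if char in simple_cipher_freq:
        new_char = en_freq[simple_cipher_freq.index(char)]
    else:
        new_char = char  # leave spaces, punctuation and unranked letters unchanged
    new_text += new_char

print(new_text)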
