Question

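I'm trying to create a simple "gibberish generator" program in Python that prints a string of random gibberish made up of characters, spaces, and a punctuation mark at the end (a complete sentence, in other words). It mostly works already, but I've run into a weird problem that I can't figure out.

Somehow the last "word" in my gibberish string always ends up longer than it should be, even though my code explicitly prevents any word from being longer than 11 characters. I've gone through the code god knows how many times and still haven't found what could be causing this. Interestingly, it only becomes really noticeable with long strings, while short sentences (up to 50 characters) look fine.

Here are two sample outputs I get when running it in Windows PowerShell:

The first one, with 50 characters:

How many gibberish characters would you like to print out? 50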

Uxlouasieyt uoygigjas eayouiumza gyfejmu th egkyaulheeb.

The second one, with 300 characters:

How many gibberish characters would you like to print out? 300

Yhiaztexj ekkexe iiuiyx itozlyui zao cegyeuyiml aofzyyreet cofi owzycwobla rreyblioca rla tpocnelavj ytpa   x eefra gnyoe yfxyhnivme miert ywy ykhi ee gup eui ttuoi oeoyaf uenyecb apluo yli xmy uiyaoneewe jyxymxal   y dzaiglu uo eqkiyeiz ke oxayuiayzf yyi iqoezu ekuioyotly viyslaybiiwvymitoeagrejvavihigpyoxawefunodgu!

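Notice how the last word in the sentence keeps getting longer, while all the other words stay within 11 characters. It's as if the part of the code that adds spaces to gibberish_list is ignored after a certain point. But why?

Here's the full code: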

import random

def gibberishgen():
    alphabet_vowels = ['a','e','i','o','u','y',]
    alphabet_consonants = ['b','c','d','f','g','h','j','k','l','m','n','p','q','r','s','t','v','w','x','z']
    gibberish_list = []

    while True:
        gibberishamount = raw_input("How many gibberish characters would you like to print out? ")
        if gibberishamount.isdigit():
            break
        else:
            print "Please give me a number!"

    # fill the gibberish_list with characters
    lasttwochars = ['','']
    for char in range(1, int(gibberishamount)+1):
        nextcharvowel = random.choice(alphabet_vowels)
        nextcharconsonant = random.choice(alphabet_consonants)
        if lasttwochars[0] in alphabet_consonants and lasttwochars[1] in alphabet_consonants:   # because I don't want more than 2 consonants in a row
            nextchar = nextcharvowel
        else:
            roll = random.randint(1,10)
            if roll > 5:
                nextchar = nextcharvowel
            else:   
                nextchar = nextcharconsonant
        gibberish_list.append(nextchar)
        lasttwochars.append(nextchar)
        lasttwochars.pop(0)

    # insert spaces at randomized intervals to separate the "words" from each other
    last_whitespace = 0
    for index in range(0, len(gibberish_list)+1):
        randspace = random.randint(1,10)
        if index >= last_whitespace + 3 and randspace <= 2:     # make sure words don't get too short on average
            gibberish_list.insert(index, ' ')
            last_whitespace = index
        elif index > last_whitespace + 10:                      # ...or too long
            gibberish_list.insert(index, ' ')
            last_whitespace = index

    punctlist = ['.', '!', '?']

    gibberishstring = ''.join(gibberish_list)
    finalstring = gibberishstring.capitalize() + random.choice(punctlist)
    print "\n", finalstring, "\n"

gibberishgen()

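I'd be grateful if someone could explain what's going on here. I've only been learning Python for two months, so yes, it's quite likely I've missed something obvious.

Also, feel free to point out any bad syntax or practices.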

Answer 1

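Every time you insert a space into gibberish_list, the list gets longer. But range(0, len(gibberish_list)+1) is evaluated only once, before the loop starts, so the loop stops at the index that corresponded to the last character of the original list. It therefore never reaches the end of the grown list, and whatever lies beyond that index gets no spaces at all. The more spaces you insert (i.e. the longer the string), the more noticeable this becomes.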
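To make the explanation above concrete, here is a minimal sketch in Python 3 syntax. The function names `insert_spaces_buggy` and `insert_spaces_fixed`, and the `rng`, `min_len`, and `max_len` parameters, are hypothetical and not part of the question's code. The first function reproduces the question's loop. The second builds a new list instead of inserting into the one being indexed, which is one possible fix among several.

```python
import random

def insert_spaces_buggy(chars, rng=random):
    """Same logic as the question's loop: insert into the list being indexed."""
    chars = list(chars)
    last_whitespace = 0
    # range() is computed once, from the ORIGINAL length. Every insert pushes
    # the remaining letters further past the end of this range.
    for index in range(0, len(chars) + 1):
        roll = rng.randint(1, 10)
        if index >= last_whitespace + 3 and roll <= 2:
            chars.insert(index, ' ')
            last_whitespace = index
        elif index > last_whitespace + 10:
            chars.insert(index, ' ')
            last_whitespace = index
    return ''.join(chars)

def insert_spaces_fixed(chars, rng=random, min_len=3, max_len=10):
    """Build a new list, tracking how many letters the current word has."""
    out = []
    word_len = 0
    for ch in chars:
        roll = rng.randint(1, 10)
        # End the word at random once it has min_len letters,
        # and always once it has max_len letters.
        if (word_len >= min_len and roll <= 2) or word_len >= max_len:
            out.append(' ')
            word_len = 0
        out.append(ch)
        word_len += 1
    return ''.join(out)

if __name__ == "__main__":
    letters = 'ab' * 150  # 300 placeholder letters
    print(insert_spaces_buggy(letters))  # the last "word" is overlong
    print(insert_spaces_fixed(letters))  # every word has at most max_len letters
```

Because the fixed version only ever adds a space before the next letter, it also cannot leave a trailing space at the end of the sentence.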

Answer 2

Here's a somewhat expanded version.

It works with Python 2.x and 3.x, and uses realistic letter and word-length frequencies.

from itertools import islice
from random import choice, randint
import sys

if sys.hexversion < 0x3000000:
    inp = raw_input
    rng = xrange
else:
    inp = input
    rng = range


LETTERS = (    # relative character frequencies
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabb"
    "bbbbbbbbbbcccccccccccccccccccccdddddddddddddddddddddddddddddddde"
    "eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee"
    "eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffgggggggggggggggg"
    "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiii"
    "iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiijjkkkkkklllllllllllllllllllll"
    "llllllllllmmmmmmmmmmmmmmmmmmmnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn"
    "nnnnnnnnnnnnnnnnoooooooooooooooooooooooooooooooooooooooooooooooo"
    "ooooooooopppppppppppppppqrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr"
    "rrrrrrsssssssssssssssssssssssssssssssssssssssssssssssstttttttttt"
    "ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttuuuuuu"
    "uuuuuuuuuuuuuuuvvvvvvvvwwwwwwwwwwwwwwwwwwxxxyyyyyyyyyyyyyyyzz"
)

CONSONANTS  = ''.join(ch for ch in LETTERS if ch not in "aeiouy")
VOWELS      = ''.join(ch for ch in LETTERS if ch     in "aeiouy")
PUNCTUATION = "....??!"

is_cons     = set(CONSONANTS).__contains__    # is_cons(x) == x in set(CONSONANTS)

WORDLEN = [     # relative word-length frequencies
    2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,
    2,  2,  2,  2,  2,  3,  3,  3,  3,  3,  3,  3,
    3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,
    3,  3,  3,  3,  4,  4,  4,  4,  4,  4,  4,  4,
    4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,
    5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5,
    5,  5,  6,  6,  6,  6,  6,  6,  6,  6,  6,  6,
    7,  7,  7,  7,  7,  7,  7,  7,  8,  8,  8,  8,
    8,  9,  9,  9, 10, 10, 10, 11, 11, 12
]

wordlen = lambda: choice(WORDLEN)

def get_int(prompt):
    while True:
        try:
            return int(inp(prompt))
        except ValueError:
            pass

def gibberish():
    """
    Generate an infinite sequence of random letters,
      allowing no more than two consecutive consonants
    """
    a = choice(LETTERS); yield a
    b = choice(LETTERS); yield b
    while True:
        c = choice(VOWELS if is_cons(a) and is_cons(b) else LETTERS)
        yield c
        a, b = b, c

def take_n(iterable, n):
    return list(islice(iterable, n))

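def add_spaces(iterable, make_word_length):
    iterable = iter(iterable)
    while True:
        for i in rng(make_word_length()):
            try:
                yield next(iterable)
            except StopIteration:
                return    # input exhausted; since Python 3.7 (PEP 479) a StopIteration escaping a generator becomes RuntimeError
        yield ' '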

def gibberish_sentence():
    length   = get_int("How many characters of gibberish would you like? ")
    chars    = take_n(gibberish(), length)              # make that many chars
    chars    = add_spaces(chars, wordlen)               # add spaces to make "words"
    sentence = ''.join(chars).rsplit(' ', 1)[0]         # crop at last space (don't leave a part-word at the end)
    return sentence.capitalize() + choice(PUNCTUATION)  # capitalize and add punctuation

def main():
    print(gibberish_sentence())

if __name__=="__main__":
    main()

Sample output:

How many characters of gibberish would you like? 180
Ahisent anoe tfon evaer an irpenn otjievt ecfiotuee ebaa wtah sav hii lti
ukt erd elrihe dewa st aosdeec zenle acju ld eeaotl entetom wisvos
aeatresl oixb atidb eekermo nteu darso hligseoanei vhaeoedse qyr sogudc.
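The `WORDLEN` list in the code above encodes relative frequencies by repeating each value. On Python 3.6+ the same distribution could instead be written as values plus explicit weights and sampled with `random.choices`. The sketch below is an alternative, not what the answer does, and it gives up the Python 2 compatibility. The names `WORD_LENGTHS`, `WEIGHTS`, and `random_word_length` are hypothetical; the weights are simply the number of times each value appears in `WORDLEN`.

```python
import random

# Each weight is the count of that value in the WORDLEN list above.
WORD_LENGTHS = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
WEIGHTS      = [17, 23, 20, 14, 10, 8, 5, 3, 3, 2, 1]

def random_word_length(rng=random):
    """Pick one word length with probability proportional to its weight."""
    return rng.choices(WORD_LENGTHS, weights=WEIGHTS, k=1)[0]

if __name__ == "__main__":
    print([random_word_length() for _ in range(10)])
```

This function could be passed as `make_word_length` in place of `wordlen`.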

Python: "Gibberish sentence generator" misbehaving in weird ways
