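Generating random text from a counter of letter pairs

Date: 2012-01-07 20:12:12

Tags: python

I am using the following code, which successfully builds a counter of the letters that follow each pair of preceding letters.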

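# Python 2 code (it relies on ifilter and tuple-parameter unpacking).
from collections import Counter, defaultdict
from itertools import ifilter

def pairwise(iterable):
    # Yield (previous two characters, next character) tuples.
    it = iter(iterable)
    last = next(it) + next(it)
    for curr in it:
        yield last, curr
        last = last[1] + curr


valid = set('abcdefghijklmnopqrstuvwxyz ')

def valid_pair((last, curr)):
    # Keep only transitions made entirely of lowercase letters and spaces.
    return last[0] in valid and last[1] in valid and curr in valid


def make_markov(text):
    # Map each two-character context to a Counter of the characters that follow it.
    markov = defaultdict(Counter)
    lowercased = (c.lower() for c in text)
    for p, q in ifilter(valid_pair, pairwise(lowercased)):
        markov[p][q] += 1
    return markov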
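
To make the shape of the model concrete, here is a minimal Python 3 sketch of the same idea. It is a port made under assumptions, not the original code: `ifilter` is replaced by the built-in `filter`, the tuple parameter is unpacked inside the function, and a `try`/`except` is added because in Python 3.7+ a `StopIteration` escaping inside a generator becomes a `RuntimeError`. The sample word `"banana"` is only an illustration.

```python
from collections import Counter, defaultdict

VALID = set('abcdefghijklmnopqrstuvwxyz ')

def pairwise(iterable):
    """Yield (previous two characters, next character) tuples."""
    it = iter(iterable)
    try:
        last = next(it) + next(it)
    except StopIteration:
        return  # fewer than two characters: nothing to yield
    for curr in it:
        yield last, curr
        last = last[1] + curr

def valid_pair(item):
    """True when both context characters and the next character are valid."""
    last, curr = item
    return last[0] in VALID and last[1] in VALID and curr in VALID

def make_markov(text):
    """Map each two-character context to a Counter of following characters."""
    markov = defaultdict(Counter)
    lowercased = (c.lower() for c in text)
    for p, q in filter(valid_pair, pairwise(lowercased)):
        markov[p][q] += 1
    return markov

model = make_markov("banana")
# Keys are two-character strings; values count the character that follows.
print(dict(model))
```

The point to notice is that every key of the model is a two-character string, while every counted value is a single character.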

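Now I want to generate random text in which each letter is drawn from the counter for the preceding pair of letters. Below is the code I used when each letter depended only on the single previous letter.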

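# Python 2 code (uses xrange); written for a model keyed by single letters.
from random import choice, randrange

def genrandom(model, n):
    curr = choice(list(model))          # random starting state
    for i in xrange(n):
        yield curr
        if curr not in model:           # dead end: restart from a random state
            curr = choice(list(model))
        d = model[curr]
        # Weighted pick: walk the counts until the running total passes target.
        target = randrange(sum(d.values()))
        cumulative = 0
        for curr, cnt in d.items():
            cumulative += cnt
            if cumulative > target:
                break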

I can't adapt it to the second configuration (a model keyed by pairs): the output is not what I expect. Thanks!

1 Answer:

Answer 0 (score: 1)

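I think you forgot that curr is a two-character combination. With a pair-keyed model, the last loop assigns a single character to curr, so the next lookup fails. Change the last loop and build curr after it: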

    for newcurr, cnt in d.items():
        cumulative += cnt
        if cumulative > target:
            break

    curr = curr[1] + newcurr

The yield should change as well, so that it produces only one character at a time.
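
Putting both suggestions together, the generator might look like the following Python 3 sketch. It is one possible reading of the answer rather than a definitive version: it yields only the newly drawn character, so the two characters of the random starting pair are not emitted, and `toy_model` is a hypothetical hand-written model used only to show the call.

```python
from random import choice, randrange

def genrandom(model, n):
    """Yield n characters, each drawn from the counter of the preceding pair."""
    curr = choice(list(model))          # random two-character starting state
    for _ in range(n):
        if curr not in model:           # dead end: pair has no recorded successor
            curr = choice(list(model))
        d = model[curr]
        # Weighted pick: walk the counts until the running total passes target.
        target = randrange(sum(d.values()))
        cumulative = 0
        for newcurr, cnt in d.items():
            cumulative += cnt
            if cumulative > target:
                break
        yield newcurr                   # emit one character at a time
        curr = curr[1] + newcurr        # slide the two-character window

# Hypothetical toy model: 'ab' is always followed by 'a', 'ba' by 'b'.
toy_model = {'ab': {'a': 1}, 'ba': {'b': 1}}
print(''.join(genrandom(toy_model, 10)))
```

Because the state is rebuilt as `curr[1] + newcurr`, every lookup uses a two-character key, which matches the keys produced by make_markov.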