Python - tokenize and replace words

Time: 2013-05-16 09:30:01

Tags: python dictionary tokenize string-parsing

I'm trying to build sentence-like strings with words filled in at random. Specifically, I have something like:

"The weather today is [weather_state]."

I'd like to find every token in [square brackets] and swap each one for a random counterpart from a dictionary or list, leaving me with:

"The weather today is warm."
"The weather today is bad."

"The weather today is mildly suiting for my old bones."

Keep in mind that the [bracketed] tokens won't always be in the same position, and a string may contain several of them, for example:

"[person] is feeling really [how] today, so he's not going [where]."

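I really don't know where to start, or whether the tokenize or token modules are even the best tool for this. Any hints pointing me in the right direction would be greatly appreciated!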

Edit: to clarify, I don't have to use square brackets; any non-standard character will do.

3 answers:

Answer 0: (score: 4)

You're looking for re.sub with a callback function:

import random
import re

# Candidate replacements for each placeholder name.
words = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
    'where': ['away', 'out']
}

def random_str(m):
    # m.group(1) is the name captured between the square brackets.
    return random.choice(words[m.group(1)])

text = "[person] is feeling really [how] today, so he's not going [where]."
print(re.sub(r'\[(.+?)\]', random_str, text))

# Example output (varies from run to run):
# me is feeling really stupid today, so he's not going away.

Note that, unlike the format method, this allows more sophisticated processing of the placeholders, for example

[person:upper] got $[amount if amount else 0] etc

Basically, you can build your own "template engine" on top of this.
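As a rough illustration of that idea, here is a minimal sketch that supports a `[name:modifier]` syntax such as `[person:upper]`. The `WORDS` and `MODIFIERS` tables and the `expand` and `render` helpers are hypothetical names made up for this example, and conditional expressions like `[amount if amount else 0]` are not handled.

```python
import random
import re

# Hypothetical word lists and modifier table, for illustration only.
WORDS = {"person": ["you", "me"], "how": ["fine", "tired"]}
MODIFIERS = {"upper": str.upper, "title": str.title}

def expand(match, rng=random):
    # Split "person:upper" into the word-list name and an optional modifier.
    name, _, modifier = match.group(1).partition(":")
    word = rng.choice(WORDS[name])
    if modifier:
        word = MODIFIERS[modifier](word)
    return word

def render(template, rng=random):
    # Replace every [token] with the result of expand().
    return re.sub(r"\[([^\[\]]+)\]", lambda m: expand(m, rng), template)

print(render("[person:upper] is feeling [how]."))
```

Passing a seeded `random.Random` instance as `rng` would make the output reproducible, which can help when testing.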

Answer 1: (score: 2)

You can use the format method.

>>> a = 'The weather today is {weather_state}.'
>>> a.format(weather_state = 'awesome')
'The weather today is awesome.'
>>>

Also:

>>> b = '{person} is feeling really {how} today, so he\'s not going {where}.'
>>> b.format(person = 'Alegen', how = 'wacky', where = 'to work')
"Alegen is feeling really wacky today, so he's not going to work."
>>>

Of course, this approach only works IF you can switch from square brackets to curly braces.
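The question asks for random words rather than fixed ones, so one possible way to combine format with random picks is sketched below. The `choices` dictionary and the `fill_randomly` helper are assumptions introduced for this example; `string.Formatter().parse` is used only to discover which field names the template contains.

```python
import random
from string import Formatter

# Hypothetical word lists, for illustration only.
choices = {
    "person": ["Alegen", "Buster"],
    "how": ["wacky", "sleepy"],
    "where": ["to work", "camping"],
}

def fill_randomly(template, choices, rng=random):
    # parse() yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for trailing literal text.
    fields = {name for _, name, _, _ in Formatter().parse(template) if name}
    # Pick one random word per distinct field name.
    picks = {name: rng.choice(choices[name]) for name in fields}
    return template.format(**picks)

template = "{person} is feeling really {how} today, so he's not going {where}."
sentence = fill_randomly(template, choices)
print(sentence)
```

With this sketch a placeholder that appears twice in the template receives the same word both times, which may or may not be what you want.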

Answer 2: (score: 0)

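If you use curly braces instead of square brackets, your string can be used as a string formatting template. You can use itertools.product to fill it with every combination of substitutions: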

import itertools as IT

text = "{person} is feeling really {how} today, so he's not going {where}."
persons = ['Buster', 'Arthur']
hows = ['hungry', 'sleepy']
wheres = ['camping', 'biking']

for person, how, where in IT.product(persons, hows, wheres):
    print(text.format(person=person, how=how, where=where))

yields

Buster is feeling really hungry today, so he's not going camping.
Buster is feeling really hungry today, so he's not going biking.
Buster is feeling really sleepy today, so he's not going camping.
Buster is feeling really sleepy today, so he's not going biking.
Arthur is feeling really hungry today, so he's not going camping.
Arthur is feeling really hungry today, so he's not going biking.
Arthur is feeling really sleepy today, so he's not going camping.
Arthur is feeling really sleepy today, so he's not going biking.

To generate random sentences, you can use random.choice:

import random

# Reuses text, persons, hows and wheres defined above.
for i in range(5):
    person = random.choice(persons)
    how = random.choice(hows)
    where = random.choice(wheres)
    print(text.format(person=person, how=how, where=where))

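If you must use square brackets and your text contains no curly braces, you can replace the square brackets with curly braces and then proceed as above: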

text = "[person] is feeling really [how] today, so he's not going [where]."
text = text.replace('[','{').replace(']','}')
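Putting the conversion and the random fill together, a self-contained sketch might look like the following. The word lists are the illustrative ones from this answer, and the approach assumes the text contains no literal curly braces, since format would otherwise treat them as fields.

```python
import random

# Illustrative word lists, as used earlier in this answer.
persons = ['Buster', 'Arthur']
hows = ['hungry', 'sleepy']
wheres = ['camping', 'biking']

text = "[person] is feeling really [how] today, so he's not going [where]."

# Convert [token] markers to {token} so str.format can fill them in.
template = text.replace('[', '{').replace(']', '}')

sentence = template.format(
    person=random.choice(persons),
    how=random.choice(hows),
    where=random.choice(wheres),
)
print(sentence)
```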