I have the following message:
msg = "Cowlishaw Street &amp; Athllon Drive, Greenway now free of obstruction."
I want to change words such as "Drive" to "Dr" or "Street" to "St":
expected_msg = "Cowlishaw St and Athllon Dr Greenway now free of obstruction"
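I also have a "conversion" mapping. It is a dictionary in which "Drive" acts as the key and the value is "Dr".
How can I check whether any word in the list of tokens is a key in that dictionary and, if so, replace it using "conversion"?
This is what I have so far: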
def convert_message(msg, conversion):
    msg = msg.translate({ord(i): None for i in ".,"})
    tokens = msg.strip().split(" ")
    for x in msg:
        if x in keys (conversion):
    return " ".join(tokens)
Answer 0 (score: 0)
Isn't it simply:
# inside convert_message, after tokens has been built
# conversion is the dictionary passed in, e.g. {'Drive': 'Dr'}
for index, token in enumerate(tokens):
    if token in conversion:
        tokens[index] = conversion[token]
return ' '.join(tokens)
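However, this does not work for sentences like "Obstruction on Cowlishaw Street.", because the token is now Street. (with the trailing period), which is not a key in the dictionary. Perhaps you should use a regular expression with re.sub instead: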
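import re

def convert_message(msg, conversion):
    def translate(match):
        # match.group(0) is the whole matched word
        word = match.group(0)
        if word in conversion:
            return conversion[word]
        return word

    # call translate() for every run of word characters in msg
    return re.sub(r'\w+', translate, msg)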
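Here re.sub finds every run of one or more (+) consecutive alphanumeric characters (\w) and, for each such match, calls the given function with the match object as its argument; the matched word can be retrieved with match.group(0). The function should return the replacement for the given match: here, the dictionary value if the word is found in the dictionary, otherwise the original word unchanged.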
Thus:
>>> msg = "Cowlishaw Street &amp; Athllon Drive, Greenway now free of obstruction."
>>> convert_message(msg, {'Drive': 'Dr', 'Street': 'St'})
'Cowlishaw St &amp; Athllon Dr, Greenway now free of obstruction.'
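To make the difference between the two approaches concrete, here is a minimal sketch that runs the split-based loop and the re.sub version side by side on the problem sentence. The helper names convert_by_split and convert_by_regex are made up for this comparison and are not part of the question or the answer.

```python
import re

conversion = {'Drive': 'Dr', 'Street': 'St'}
msg = "Obstruction on Cowlishaw Street."

def convert_by_split(msg, conversion):
    # Hypothetical helper: splitting on spaces keeps punctuation glued
    # to the word, so the last token is 'Street.' and is never found.
    tokens = msg.split(" ")
    return " ".join(conversion.get(token, token) for token in tokens)

def convert_by_regex(msg, conversion):
    # Hypothetical helper: \w+ matches only the word characters, so the
    # callback sees 'Street' without the trailing period.
    return re.sub(r'\w+',
                  lambda m: conversion.get(m.group(0), m.group(0)),
                  msg)

split_result = convert_by_split(msg, conversion)
regex_result = convert_by_regex(msg, conversion)
print(split_result)  # Obstruction on Cowlishaw Street.
print(regex_result)  # Obstruction on Cowlishaw St.
```

Because \w+ always matches a whole run of word characters, a longer word such as "Driveway" is looked up as a whole and left alone rather than being partially replaced.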
As for the &amp;, on Python 3.4+ you should use html.unescape to decode HTML entities:
>>> import html
>>> html.unescape('Cowlishaw Street &amp; Athllon Drive, Greenway now free of obstruction.')
'Cowlishaw Street & Athllon Drive, Greenway now free of obstruction.'
This will handle all known HTML entities. For older Python versions, you can see the alternatives on this question.
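As a small illustration of what html.unescape covers, the sketch below decodes a named entity and a numeric one, and shows that text without entities passes through unchanged. The sample strings are invented for the example.

```python
import html

# Named entity: &amp; becomes a literal ampersand
named = html.unescape('Street &amp; Drive')

# Numeric (&#39;) and named (&lt;, &gt;) entities in one string
numeric = html.unescape('Tom&#39;s &lt;b&gt;')

# Text without entities is returned unchanged
plain = html.unescape('no entities here')

print(named)    # Street & Drive
print(numeric)  # Tom's <b>
print(plain)    # no entities here
```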
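The regular expression \w+ does not match the & character; if you want to replace that as well, we can use the regular expression \w+|. instead, which means: "either a run of consecutive alphanumeric characters, or any single character that is not part of such a run":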
import re
import html

def convert_message(msg, conversion):
    # decode HTML entities such as &amp; first
    msg = html.unescape(msg)

    def translate(match):
        # word is either a whole word or a single non-word character
        word = match.group(0)
        if word in conversion:
            return conversion[word]
        return word

    return re.sub(r'\w+|.', translate, msg)
Then you can do:
>>> msg = 'Cowlishaw Street &amp; Athllon Drive, Greenway now free of obstruction.'
>>> convert_message(msg, {'Drive': 'Dr', '&': 'and',
...                       'Street': 'St', '.': '', ',': ''})
'Cowlishaw St and Athllon Dr Greenway now free of obstruction'
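To see exactly which pieces the \w+|. pattern hands to the callback, re.findall can list the matches without replacing anything. This is only an inspection aid on a made-up sample string, not part of the solution itself.

```python
import re

# Each element is either a whole word or a single non-word character
pieces = re.findall(r'\w+|.', 'St & Dr,')
print(pieces)  # ['St', ' ', '&', ' ', 'Dr', ',']
```

Every space, ampersand and comma arrives as its own match, which is why single characters such as '&', '.' and ',' can be used directly as keys in the conversion dictionary.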