Regular expression for this function

Time: 2016-04-07 14:29:54

Tags: python regex

I want to simplify this function using a regular expression. A sample input could be

text =' At&T, " < I am > , At&T  so  &#60; &lt; &  & '

My code:

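def replaceentity(text):
    import re
    import uuid
    from cgi import escape
    invalid_chars_map = {'&':'&#38;', '<':'&#60;', '>': '&#62;', '"': "&#34;"}
    replace_values = {'&lt;':'&#60;', '&gt;':'&#62;'}
    replaced_dict = {}
    # Normalise the named entities &lt; and &gt; to their numeric form.
    for key, value in replace_values.items():
        text = text.replace(key, value)
    print "after replace >>>>>>  " + text
    # When a numeric entity shows up as a whitespace-separated word, replace
    # all of its occurrences with a UUID placeholder so escape() skips it.
    for word in text.split():
        if word in invalid_chars_map.values():
            print word
            uid = str(uuid.uuid4())
            text = text.replace(word, uid)
            replaced_dict[uid] = word
    # Escape the remaining raw &, < and > characters.
    text = escape(text)
    # Put the protected entities back in place of their placeholders.
    for i in replaced_dict.keys():
        text = text.replace(i, replaced_dict[i])
    print text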

1 answer:

Answer 0 (score: 0)

Is this what you want?

>>> from cgi import escape
>>> escaped = escape("""'At&T, " < I am > , At&T  so  &#60; &lt """)
>>> escaped
'\'At&amp;T, " &lt; I am &gt; , At&amp;T  so  &amp;#60; &amp;lt '
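Note that a plain escape() call also re-escapes entities that are already present (&#60; becomes &amp;#60; in the session above), which is what the UUID placeholders in the question work around. A single re.sub pass may be closer to what was asked for. The following is only a sketch under two assumptions that the question does not confirm: the target form is the numeric entities from invalid_chars_map, and the double quote should be escaped as well (cgi.escape leaves it alone by default). The names replace_entity, ENTITY_OR_CHAR and substitute are made up for illustration.

```python
import re

# Mapping copied from the question's invalid_chars_map.
INVALID_CHARS_MAP = {'&': '&#38;', '<': '&#60;', '>': '&#62;', '"': '&#34;'}
# Named entities to normalise, copied from the question's replace_values.
REPLACE_VALUES = {'&lt;': '&#60;', '&gt;': '&#62;'}

# Alternation order matters: whole entities are tried before bare characters,
# so the '&' that starts an existing entity is never escaped a second time.
ENTITY_OR_CHAR = re.compile(r'&(?:lt|gt);|&#(?:38|60|62|34);|[&<>"]')


def replace_entity(text):
    """Escape raw special characters, leaving existing entities intact.

    Assumption: the desired output uses the numeric entities listed in
    INVALID_CHARS_MAP, not the named ones produced by cgi.escape.
    """
    def substitute(match):
        token = match.group(0)
        if token in REPLACE_VALUES:
            return REPLACE_VALUES[token]      # &lt; / &gt; -> numeric form
        if token in INVALID_CHARS_MAP:
            return INVALID_CHARS_MAP[token]   # raw character -> entity
        return token                          # already a numeric entity

    return ENTITY_OR_CHAR.sub(substitute, text)
```

With the sample input from the question, replace_entity(' At&T, " < I am > , At&T  so  &#60; &lt; &  & ') returns ' At&#38;T, &#34; &#60; I am &#62; , At&#38;T  so  &#60; &#60; &#38;  &#38; '. If the named forms produced by escape() are preferred, changing the values in INVALID_CHARS_MAP should be enough.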