Sanitizing input in a subclassed string does not work as expected

Time: 2016-12-23 21:46:09

Tags: string python-2.7 replace

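I have a use case where I want to sanitize the string passed as input to a str subclass. In other words, I want to strip control characters from the string.

I tried this: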

[hamartin@Guvny bin]$ ipython
Python 2.7.12 (default, Sep 29 2016, 13:30:34) 
Type "copyright", "credits" or "license" for more information.

IPython 3.2.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: class LogString(str):
   ...:     def __init__(self, msg, *args, **kwargs):
   ...:         nstr = msg.replace('\xc2', '')
   ...:         nstr = nstr.replace('\xa0', ' ')
   ...:         super(LogString, self).__init__(nstr, *args, **kwargs)
   ...:         

In [2]: repr(LogString('Testing this out'))
Out[2]: "'Testing\\xc2\\xa0this\\xc2\\xa0out'"

I know the replacement itself works for this specific case:

[hamartin@Guvny bin]$ ipython
Python 2.7.12 (default, Sep 29 2016, 13:30:34) 
Type "copyright", "credits" or "license" for more information.

IPython 3.2.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: i = 'Testing this out'

In [2]: repr(i)
Out[2]: "'Testing\\xc2\\xa0this\\xc2\\xa0out'"

In [3]: i = i.replace('\xc2', '')

In [4]: repr(i.replace('\xa0', ' '))
Out[4]: "'Testing this out'"

In [5]:

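I am not storing the original string anywhere other than in a temporary variable, and I replace the characters before passing the string up the tree. Why does the created object contain the original string instead of the "sanitized" one?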

1 Answer:

Answer 0: (score: 1)

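Strings in Python are immutable. Because you are subclassing str, the value is fixed when the object is created in __new__, so by the time __init__ runs it can no longer be changed. Instead, override the __new__ static method: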

class LogString(str):
    def __new__(cls, msg):
        nstr = msg.replace('\xc2', '')
        nstr = nstr.replace('\xa0', ' ')
        return str.__new__(cls, nstr)
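
To see the difference side by side, here is a minimal sketch that compares the two approaches. The class names InitLogString and NewLogString and the variable names are made up for this illustration. The input is written with explicit escapes so the two characters being replaced are visible.

```python
class InitLogString(str):
    # Hypothetical name: mirrors the attempt from the question.
    def __init__(self, msg):
        # str.__new__ has already built the object from msg at this point,
        # so this cleaned copy is computed and then simply discarded.
        cleaned = msg.replace('\xc2', '').replace('\xa0', ' ')


class NewLogString(str):
    # Hypothetical name: same idea as the LogString class above.
    def __new__(cls, msg):
        # Clean the text first, then let str build the immutable value.
        cleaned = msg.replace('\xc2', '').replace('\xa0', ' ')
        return str.__new__(cls, cleaned)


raw = 'Testing\xc2\xa0this\xc2\xa0out'
from_init = InitLogString(raw)
from_new = NewLogString(raw)

print(repr(from_init))  # still contains the \xc2\xa0 sequences
print(repr(from_new))   # 'Testing this out'
```

Under Python 2.7 the literal is a byte string holding the UTF-8 encoding of a non-breaking space; under Python 3 the same literal is two separate code points. The replace calls behave the same way in both cases, so the sketch should run on either version.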

Hope this helps!