Question

在Java中，必须使用\r\n进行剥离，例如split( "\r\n") is not splitting my string in java

但是在Python中\r\n是必需的吗？以下内容是否正确？

str.strip() == str.strip('\r\n ')

来自docs：

返回带有前导和尾随字符的字符串的副本删除。 chars参数是一个字符串，用于指定要删除的字符。如果省略或无，则为chars参数默认为删除空格。 chars参数不是前缀或后缀;相反，其值的所有组合都被剥离

在此CPython test中，str.strip()似乎正在剥离：

 \t\n\r\f\v

任何人都可以指向CPython中进行字符串剥离的代码吗？

Answer 1

您在寻找这些行吗？

https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Objects/unicodeobject.c#L12222-L12247

#define LEFTSTRIP 0
#define RIGHTSTRIP 1
#define BOTHSTRIP 2

/* Arrays indexed by above */
static const char *stripfuncnames[] = {"lstrip", "rstrip", "strip"};

#define STRIPNAME(i) (stripfuncnames[i])

/* externally visible for str.strip(unicode) */
PyObject *
_PyUnicode_XStrip(PyObject *self, int striptype, PyObject *sepobj)
{
    void *data;
    int kind;
    Py_ssize_t i, j, len;
    BLOOM_MASK sepmask;
    Py_ssize_t seplen;

    if (PyUnicode_READY(self) == -1 || PyUnicode_READY(sepobj) == -1)
        return NULL;

    kind = PyUnicode_KIND(self);
    data = PyUnicode_DATA(self);
    len = PyUnicode_GET_LENGTH(self);
    seplen = PyUnicode_GET_LENGTH(sepobj);
    sepmask = make_bloom_mask(PyUnicode_KIND(sepobj),
                              PyUnicode_DATA(sepobj),
                              seplen);

    i = 0;
    if (striptype != RIGHTSTRIP) {
        while (i < len) {
            Py_UCS4 ch = PyUnicode_READ(kind, data, i);
            if (!BLOOM(sepmask, ch))
                break;
            if (PyUnicode_FindChar(sepobj, ch, 0, seplen, 1) < 0)
                break;
            i++;
        }
    }

    j = len;
    if (striptype != LEFTSTRIP) {
        j--;
        while (j >= i) {
            Py_UCS4 ch = PyUnicode_READ(kind, data, j);
            if (!BLOOM(sepmask, ch))
                break;
            if (PyUnicode_FindChar(sepobj, ch, 0, seplen, 1) < 0)
                break;
            j--;
        }

        j++;
    }

    return PyUnicode_Substring(self, i, j);
}

Answer 2

本质上：

str.strip() == str.strip(string.whitespace) == str.strip(' \t\n\r\f\v') != str.strip('\r\n')

除非您明确尝试仅删除换行符，否则str.strip()和str.strip('\r\n')是不同的。

>>> '\nfoo\n'.strip()
'foo'
>>> '\nfoo\n'.strip('\r\n')
'foo'
>>> '\r\n\r\n\r\nfoo\r\n\r\n\r\n'.strip()
'foo'
>>> '\r\n\r\n\r\nfoo\r\n\r\n\r\n'.strip('\r\n')
'foo'
>>> '\n\tfoo\t\n'.strip()
'foo'
>>> '\n\tfoo\t\n'.strip('\r\n')
'\tfoo\t'

这一切似乎都很好，但是请注意，如果换行符与字符串的开头或结尾之间存在空格（或其他任何字符），则.strip('\r\n')不会删除换行符。

>>> '\t\nfoo\n\t'.strip()
'foo'
>>> '\t\nfoo\n\t'.strip('\r\n')
'\t\nfoo\n\t'

在Python中是否需要用'\ r \ n'剥离字符串？

2 个答案: