Question

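I don't understand. Why do people downvote this without any explanation? What did I do wrong?

How can I extract Apple Recipe, 3, pages and 29.4KB from the following string?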

'\r\n\t\t\t\t\t\r\n\t\t\t\t\tApple Recipe\r\r\n\t\t\t\t\t\r\n\t\t\t\t\t\r\n
\t\t\t\t\t3\r\n\t\t\t\t\t\t\r\n\t\t\t\t\t\t\tpages\r\n
\t\t\t\t\t\t\t\r\n\t\t\t\t\t\t\r\n\t\t\t\t\t\r\n
\t\t\t\t\t\r\n\t\t\t\t\t\t29.4KB\r\n
\t\t\t\t\t\r\n\t\t\t\t\t\r\n\t\t\t\t'

I tried re.compile('\w+'), but I only get the following:

Apple

Recipe

29

.

4

KB

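However, I want each value kept together as it appears, not split apart. For example, Apple Recipe should stay together as one token rather than become two separate tokens.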

Answer 1

data = """\r\n\t\t\t\t\t\r\n\t\t\t\t\tApple Recipe\r\r\n\t\t\t\t\t\r\n\t\t\t\t\t\r\n
\t\t\t\t\t3\r\n\t\t\t\t\t\t\r\n\t\t\t\t\t\t\tpages\r\n
\t\t\t\t\t\t\t\r\n\t\t\t\t\t\t\r\n\t\t\t\t\t\r\n
\t\t\t\t\t\r\n\t\t\t\t\t\t29.4KB\r\n
\t\t\t\t\t\r\n\t\t\t\t\t\r\n\t\t\t\t"""

import re

g = re.findall(r'[^\r\n\t]+', data)
print(g)

Prints:

['Apple Recipe', '3', 'pages', '29.4KB']

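[^\r\n\t]+ matches each maximal run of characters that are not \r, \n or \t, so every field comes back whole, including any spaces inside it.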
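As a minimal sketch, the same idea can be wrapped in a small helper, assuming the fields themselves never contain \r, \n or \t. The name extract_fields, the shortened sample string and the extra strip() call (which drops stray spaces around a field) are illustrative assumptions, not part of the original answer.

```python
import re

# Each maximal run of characters other than CR, LF or TAB is one field.
FIELD = re.compile(r'[^\r\n\t]+')

def extract_fields(text):
    """Return the non-empty fields of text, with surrounding spaces trimmed."""
    fields = (m.strip() for m in FIELD.findall(text))
    # Drop fields that consisted only of spaces.
    return [f for f in fields if f]

# Shortened stand-in for the string in the question (hypothetical sample).
sample = "\r\n\t\tApple Recipe\r\r\n\t\t3\r\n\t\t\tpages\r\n\t\t29.4KB \r\n\t"
print(extract_fields(sample))
```

Compiling the pattern once is only a convenience here; calling re.findall directly, as in the answer, behaves the same way.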

Answer 2

txt = """\r\n\t\t\t\t\t\r\n\t\t\t\t\tApple Recipe\r\r\n\t\t\t\t\t\r\n\t\t\t\t\t\r\n
\t\t\t\t\t3\r\n\t\t\t\t\t\t\r\n\t\t\t\t\t\t\tpages\r\n
\t\t\t\t\t\t\t\r\n\t\t\t\t\t\t\r\n\t\t\t\t\t\r\n
\t\t\t\t\t\r\n\t\t\t\t\t\t29.4KB\r\n
\t\t\t\t\t\r\n\t\t\t\t\t\r\n\t\t\t\t"""

import re

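# A run of word characters, optionally followed by '.' and more word characters,
# so single-character tokens such as '3' also match.
output = re.findall(r'\w+(?:\.\w+)?', txt)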
print(output)

This prints the tokens below; note that Apple and Recipe still come out as separate items:

['Apple', 'Recipe', '3', 'pages', '29.4KB']
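To see where the splits come from, the sketch below runs three patterns over a shortened copy of the input: the question's \w+, a word-based pattern that tolerates one embedded dot, and a negated character class. The sample string and variable names are assumptions made for illustration.

```python
import re

# Shortened stand-in for the string in the question (hypothetical sample).
sample = "\r\n\tApple Recipe\r\r\n\t3\r\n\tpages\r\n\t29.4KB\r\n"

# \w matches letters, digits and underscore only, so spaces and '.' end a token.
words = re.findall(r'\w+', sample)

# Allowing an optional ".more" suffix keeps '29.4KB' whole, but a space still splits.
dotted = re.findall(r'\w+(?:\.\w+)?', sample)

# Excluding only the separator characters keeps 'Apple Recipe' as a single field.
fields = re.findall(r'[^\r\n\t]+', sample)

print(words)   # ['Apple', 'Recipe', '3', 'pages', '29', '4KB']
print(dotted)  # ['Apple', 'Recipe', '3', 'pages', '29.4KB']
print(fields)  # ['Apple Recipe', '3', 'pages', '29.4KB']
```

Any pattern built on \w will break at the space in Apple Recipe, so keeping that value in one piece means describing the separators rather than the words.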

Python regex: extracting strings from escape sequences
