I have a string:
mystr = "&marker1\nThe String that I want /\n&marker1\nAnother string that I want /\n"
What I want is the list of substrings between the markers start="&marker1"
and end="/\n".
So the expected result is:
whatIwant = ["The String that I want", "Another string that I want"]
I read the answers here:
and tried the following, without success:
>>> import re
>>> mystr = "&marker1\nThe String that I want /\n&marker1\nAnother string that I want /\n"
>>> whatIwant = re.search("&marker1(.*)/\n", mystr)
>>> whatIwant.group(1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
How can I fix this? Also, my actual string is very long:
>>> len(myactualstring)
7792818
Answer 0 (score: 1)
Consider this option using re.findall:
import re

mystr = "&marker1\nThe String that I want /\n&marker1\nAnother string that I want /\n"
matches = re.findall(r'&marker1\n(.*?)\s*/\n', mystr)
print(matches)
This prints:
['The String that I want', 'Another string that I want']
Here is an explanation of the regex pattern:
&marker1   match a marker
\n         newline
(.*?)      match AND capture all content until reaching the first
\s*        optional whitespace, followed by
/\n        / and newline
Note that re.findall returns only what appears in the (.*?) capture group, which is the content you want to extract.
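As a minimal sketch of why the attempt in the question returned None, and of what re.findall returns when the pattern has one capture group, the following reuses mystr from the question. The names failed and matches are only illustrative.

```python
import re

mystr = "&marker1\nThe String that I want /\n&marker1\nAnother string that I want /\n"

# The question's pattern: "." does not match "\n" by default, so nothing can
# bridge the newline that follows "&marker1", and re.search returns None.
failed = re.search("&marker1(.*)/\n", mystr)
print(failed)

# Matching the newline explicitly fixes that. With exactly one capture group,
# re.findall returns a list of that group's text, not the whole match.
matches = re.findall(r"&marker1\n(.*?)\s*/\n", mystr)
print(matches)
```

Checking the result of re.search against None before calling .group() avoids the AttributeError shown in the question.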
Answer 1 (score: 1)
How can I fix this? I would do:
import re
mystr = "&marker1\nThe String that I want /\n&marker1\nAnother string that I want /\n"
found = re.findall(r"\&marker1\n(.*?)/\n", mystr)
print(found)
Output:
['The String that I want ', 'Another string that I want ']
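The output above keeps the space that precedes each "/", while the expected result in the question does not. One possible way to drop it, sketched here with an illustrative name cleaned, is to strip trailing whitespace after matching:

```python
import re

mystr = "&marker1\nThe String that I want /\n&marker1\nAnother string that I want /\n"

# Same pattern as above; each captured piece still ends with a space.
found = re.findall(r"&marker1\n(.*?)/\n", mystr)

# rstrip() removes trailing whitespace from each captured substring.
cleaned = [s.rstrip() for s in found]
print(cleaned)
```

Adding \s* before the "/" in the pattern has the same effect for this input.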
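Note that:
& has no special meaning in re patterns, so escaping it (\&) is harmless but not required.
. matches any character except a newline.
findall is a better fit than search when you want a list of all matching substrings.
*? is non-greedy. Here .* would work too, because . does not match a newline, but in other cases the match could extend further than you expect.
Read the re module documentation for a discussion of raw string usage and the list of characters with special meaning.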
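The points about greediness and escaping can be checked directly. This is a small sketch, not part of the original answer; re.DOTALL is used only to illustrate a case where "." does match newlines, and the variable names are illustrative.

```python
import re

mystr = "&marker1\nThe String that I want /\n&marker1\nAnother string that I want /\n"

# Without re.DOTALL, "." stops at newlines, so greedy and lazy agree here.
greedy = re.findall(r"&marker1\n(.*)/\n", mystr)
lazy = re.findall(r"&marker1\n(.*?)/\n", mystr)
print(greedy == lazy)

# With re.DOTALL, "." also matches "\n": the greedy form now runs to the
# last "/\n" and returns one long match, while the lazy form is unchanged.
greedy_dotall = re.findall(r"&marker1\n(.*)/\n", mystr, flags=re.DOTALL)
lazy_dotall = re.findall(r"&marker1\n(.*?)/\n", mystr, flags=re.DOTALL)
print(len(greedy_dotall), len(lazy_dotall))

# "\&" and "&" both match a literal ampersand.
escaped = re.findall(r"\&marker1\n(.*?)/\n", mystr)
print(escaped == lazy)
```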