Update 2: https://regex101.com/r/bE5aWW/2
Update: so far, https://regex101.com/r/bE5aWW/1/ is the best I could come up with, but I need help getting rid of the leading by.
Case 1
\n \n by name name\n \n
Case 2
\n \n name name\n \n
Case 3
by name name
Case 4
name name
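I want to select the name part from the strings above, i.e. name name. One pattern I came up with is (?:by)? ([\w ]+), but it does not work when there is no space before by.
Thanks
Code from regex101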
# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility
import re
regex = r"(?:by)? ([\w ]+)"
test_str = ("\\n \\n by Ally Foster\\n \\n \n\n"
"\\n \\n Ally Foster\\n \\n \n\n"
"by name name\n\n"
"name name")
matches = re.finditer(regex, test_str, re.MULTILINE)
for matchNum, match in enumerate(matches):
    matchNum = matchNum + 1
    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))
    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1
        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))
# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
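To see where the attempted pattern goes wrong, here is a small check of (?:by)? ([\w ]+) against the four cases. It is only a sketch: it assumes the \n in the cases stands for a real newline character (the regex101 export above uses a literal backslash followed by n instead), and the helper name first_group is made up for this illustration.

```python
import re

# The pattern from the question, unchanged.
ATTEMPT = re.compile(r"(?:by)? ([\w ]+)")

# The four cases, assuming "\n" means a real newline character.
CASES = [
    "\n \n by name name\n \n",
    "\n \n name name\n \n",
    "by name name",
    "name name",
]


def first_group(text):
    """Return group 1 of the first match, or None when nothing matches."""
    match = ATTEMPT.search(text)
    return match.group(1) if match else None


for text in CASES:
    print(repr(text), "->", repr(first_group(text)))
```

Under that assumption, case 1 captures "by name name" (the mandatory literal space is taken by the space in front of by, so by lands inside the group), and case 4 captures only the second "name" (there is no space before the first word). Cases 2 and 3 capture "name name".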
Answer 0 (score: 1)
(?:by )?(\b(?!by\b)[\w, ]+\S)
This is my final version. It also does not select strings that contain only by.
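A quick way to check this pattern against the four cases from the question, plus a string that is only by. This is a sketch, not the answerer's code: it assumes the \n in the cases are real newline characters, and extract_name is a hypothetical helper name.

```python
import re

# Pattern from this answer, unchanged. Group 1 holds the name.
ANSWER_RE = re.compile(r"(?:by )?(\b(?!by\b)[\w, ]+\S)")

# The four cases from the question, assuming real newlines.
CASES = [
    "\n \n by name name\n \n",
    "\n \n name name\n \n",
    "by name name",
    "name name",
]


def extract_name(text):
    """Return the captured name, or None when the pattern does not match."""
    match = ANSWER_RE.search(text)
    return match.group(1) if match else None


for text in CASES + ["by"]:
    print(repr(text), "->", repr(extract_name(text)))
```

The trailing \S forces the capture to end on a non-whitespace character, which is what trims the spaces and newlines after the name. All four cases give "name name", and the lone by gives None.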
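Answer 1 (score: 0)
I suggest using
re.findall(r'\b(?!by\b)[^\W\d_]+(?: *(?:, *)?[^\W\d_]+)*', s)
See the regex demo. In Python 2, you will need to pass the re.U flag to make all shorthand character classes and word boundaries Unicode-aware. To also match tabs, not just spaces, replace each space in the pattern with [ \t].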
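As an illustration, the pattern and its tab-aware variant can be tried like this. This is a sketch under two assumptions: the \n in the question's cases are real newline characters, and the code runs on Python 3, where str patterns are Unicode-aware without re.U. The names NAME_RE and NAME_TAB_RE are made up for the example.

```python
import re

# Pattern from this answer, unchanged.
NAME_RE = re.compile(r"\b(?!by\b)[^\W\d_]+(?: *(?:, *)?[^\W\d_]+)*")

# Variant with every space replaced by [ \t] so tabs are accepted too.
NAME_TAB_RE = re.compile(r"\b(?!by\b)[^\W\d_]+(?:[ \t]*(?:,[ \t]*)?[^\W\d_]+)*")

# The four cases from the question, assuming real newlines.
CASES = [
    "\n \n by name name\n \n",
    "\n \n name name\n \n",
    "by name name",
    "name name",
]

for text in CASES:
    print(repr(text), "->", NAME_RE.findall(text))

# A tab between the two words splits the match unless the variant is used.
tabbed = "by name\tname"
print(NAME_RE.findall(tabbed))
print(NAME_TAB_RE.findall(tabbed))
```

Each of the four cases yields a single match, "name name". On the tab-separated input the original pattern returns the two words separately, while the [ \t] variant returns them as one match.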
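Details
\b - a word boundary
(?!by\b) - the next word cannot be by
[^\W\d_]+ - one or more letters
(?: *(?:, *)?[^\W\d_]+)* - a non-capturing group that matches 0 or more occurrences of:
    " *" (a space followed by *) - zero or more spaces
    (?:, *)? - an optional sequence of , and 0+ spaces
    [^\W\d_]+ - one or more letters.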
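The same breakdown can be written directly into the pattern with re.VERBOSE. This is only a sketch of an equivalent spelling, not part of the original answer: the name NAME_VERBOSE_RE and the sample input are invented for illustration. Because verbose mode ignores bare whitespace, each literal space is written as [ ].

```python
import re

NAME_VERBOSE_RE = re.compile(
    r"""
    \b              # word boundary
    (?!by\b)        # the next word must not be "by"
    [^\W\d_]+       # one or more letters
    (?:             # non-capturing group, repeated zero or more times:
        [ ]*        #   zero or more spaces
        (?:,[ ]*)?  #   an optional comma followed by zero or more spaces
        [^\W\d_]+   #   one or more letters
    )*
    """,
    re.VERBOSE,
)

# Hypothetical sample with a comma-separated name, assuming real newlines.
sample = "\n \n by Foster, Ally\n \n"
print(NAME_VERBOSE_RE.findall(sample))
```

The optional comma branch is what lets a "Last, First" style name come back as a single match rather than two.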