Question

I need to use the re.findall() function to find all bigrams whose first word is a negation ("never" or "not") in the following text:

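He jests at scars that never felt a wound. JULIET appears above at a window But, soft! what light through yonder window breaks? It is the east, and Juliet is the sun. Arise, fair sun, and kill the envious moon, Who is already sick and pale with grief, That thou her maid art far more fair than she: Be not her maid, since she is envious; Her vestal livery is but sick and green And none but fools do wear it; cast it off. It is my lady, O, it is my love! O, that she knew she were! She speaks yet she says nothing: what of that? Her eye discourses; I will answer it. I am too bold, 'tis not to me she speaks: Two of the fairest stars in all the heaven, Having some business, do entreat her eyes To twinkle in their spheres till they return. What if her eyes were there, they in her head? The brightness of her cheek would shame those stars, As daylight doth a lamp; her eyes in heaven Would through the airy region stream so bright That birds would sing and think it were not night. See, how she leans her cheek upon her hand! O, that I were a glove upon that hand, That I might touch that cheek!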

I can easily find the single words, but I am at a loss when it comes to finding the bigrams.

import re
inp = input("please enter an expression: ")
print (re.findall(r'\b(?:never|not)\b', inp))

['never', 'not', 'not', 'not']

How do I get

['never felt', 'not her', 'not to', 'not night']

Answer 1

If you also want to capture the word that follows not or never, you need to extend the regex like this:

\b(?:never|not)\s+[a-zA-Z]+

Here, \s+ matches one or more whitespace characters, and [a-zA-Z]+ matches an English word made of one or more letters.
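As a small sketch of what those two pieces accept and reject, the snippet below runs the same pattern on a made-up string (the sample text is hypothetical, not from the question). It suggests that \s+ spans repeated spaces and tabs, while [a-zA-Z]+ stops at digits and apostrophes.

```python
import re

pattern = r'\b(?:never|not)\s+[a-zA-Z]+'

# Hypothetical sample: double space, tab, a number, and a contraction.
sample = "never  felt, not\tquite, not 42, not isn't"

matches = re.findall(pattern, sample)

# "not 42" is skipped because [a-zA-Z]+ needs at least one letter,
# and "isn't" is cut at the apostrophe.
print(matches)  # ['never  felt', 'not\tquite', 'not isn']
```

Note that the whitespace is returned exactly as it appears in the input, so matches may need normalizing if the bigrams are compared later.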


Python code demo

import re

s = '''He jests at scars that never felt a wound. JULIET appears above at a window But, soft! what light through yonder window breaks? It is the east, and Juliet is the sun. Arise, fair sun, and kill the envious moon, Who is already sick and pale with grief, That thou her maid art far more fair than she: Be not her maid, since she is envious; Her vestal livery is but sick and green And none but fools do wear it; cast it off. It is my lady, O, it is my love! O, that she knew she were! She speaks yet she says nothing: what of that? Her eye discourses; I will answer it. I am too bold, 'tis not to me she speaks: Two of the fairest stars in all the heaven, Having some business, do entreat her eyes To twinkle in their spheres till they return. What if her eyes were there, they in her head? The brightness of her cheek would shame those stars, As daylight doth a lamp; her eyes in heaven Would through the airy region stream so bright That birds would sing and think it were not night. See, how she leans her cheek upon her hand! O, that I were a glove upon that hand, That I might touch that cheek!'''
print(re.findall(r'\b(?:never|not)\s+[a-zA-Z]+', s))

Prints

['never felt', 'not her', 'not to', 'not night']

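Edit: Since you said you want to discard matches that are followed by a space and the character a, you can use a negative lookahead and extend the current regex like this: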

\b(?:never|not)\s+[a-zA-Z]+\b(?! a\b)

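Here, the \b placed before the negative lookahead prevents the regex from backtracking to a partial word, and the \b after a inside the lookahead makes it reject only the standalone word a, not longer words that start with a such as add or and.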
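To see what each \b contributes, here is a minimal sketch that compares the full pattern with two variants that each drop one boundary. The third sentence of the sample is a hypothetical addition, included only to exercise a following word that starts with a.

```python
import re

# First two sentences are taken from the question's text;
# "Do not run and hide." is a hypothetical addition.
text = "He jests at scars that never felt a wound. Be not her maid. Do not run and hide."

with_both = re.findall(r'\b(?:never|not)\s+[a-zA-Z]+\b(?! a\b)', text)
# Without \b before the lookahead, the word backtracks to "fel".
no_outer_boundary = re.findall(r'\b(?:never|not)\s+[a-zA-Z]+(?! a\b)', text)
# Without \b after a, "not run" is rejected because "and" starts with a.
no_inner_boundary = re.findall(r'\b(?:never|not)\s+[a-zA-Z]+\b(?! a)', text)

print(with_both)          # ['not her', 'not run']
print(no_outer_boundary)  # ['never fel', 'not her', 'not run']
print(no_inner_boundary)  # ['not her']
```

Only the version with both boundaries drops 'never felt' (followed by the word a) while keeping 'not run' (followed by and).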


Answer 2

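import re

x = input()
# \w+ also accepts digits and underscores, unlike [a-zA-Z]+
m = re.findall(r'\b(?:never|not)\b\s+\w+', x)
print(m)
# output (for the text in the question)
# ['never felt', 'not her', 'not to', 'not night']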

re.findall() to find all bigrams containing a negation

2 answers: