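I'd like some help with regular expressions. I'm writing code that takes a word and its meaning and adds them to a dictionary. At the moment, the regex I'm using enforces the following:
1- no space at the start of the sentence.
2- no more than one space after a word.
3- letters from a-z and A-Z are allowed.
4- no space at the end of a sentence.
Code: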
while not re.match("^[a-zA-Z_]+( [a-zA-Z_]+)*$",meaning):
However, I also want it to allow:
1- a "full stop" aka "."
2- "apostrophies" aka '""'
3- and "comma".
So the new regex above should be for the meaning, but for the word I want:
1- no space at the start of the sentence.
2- no more than one space after a word.
3- letters from a-z and A-Z are allowed.
4- no space at the end of a sentence.
5- allow a "full stop" aka "."
6- max two words.
Please give me both regexes. Thanks!
Answer 0 (score: 1)
For the word:
^(?:[a-zA-Z_]+\.?(?: (?!$)|$)){1,2}$
Demo: https://regex101.com/r/gZ3cH8/1
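As a rough illustration of how the word pattern behaves, here is a minimal Python sketch. The helper name `is_valid_word` and the sample strings are assumptions made for this example; only the pattern itself comes from the answer.

```python
import re

# Pattern from the answer: one or two words, each optionally followed by a dot.
WORD_RE = re.compile(r"^(?:[a-zA-Z_]+\.?(?: (?!$)|$)){1,2}$")


def is_valid_word(text):
    """Hypothetical helper: True if text satisfies the word pattern."""
    return WORD_RE.match(text) is not None


if __name__ == "__main__":
    # Sample inputs chosen for illustration only.
    for sample in ["hello", "hello world", "Mr. Smith",
                   " hello", "hello ", "hello  world", "one two three"]:
        print(repr(sample), is_valid_word(sample))
```

Note that the character class also accepts underscores, which the question's original pattern allowed as well.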
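Assumptions and notes:
Explanation:
^(?:...){1,2}$
- ensures that the inner part (replaced here by an ellipsis) occurs exactly once or twice
[a-zA-Z_]+\.?(?: (?!$)|$)
- the inner part, broken down as:
[a-zA-Z_]+
- a word (one or more letters or underscores), followed by
\.?
- an optional dot, and
(?: (?!$)|$)
- a space, but only if the space is not immediately followed by the end of the line; otherwise the end of the line itself
For the meaning:
^(?:"?[a-zA-Z_]+"?(?:,(?!$)|\.)?(?: (?!$)|$))+$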
Demo: https://regex101.com/r/yV2uW9/1
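Assumptions and notes:
The requirement
2- "apostrophies" aka '""'
is self-contradictory. I assume you need to allow double quotes (") rather than apostrophes (').
Explanation:
^(?:...)+$
- ensures that the inner part (replaced here by an ellipsis) occurs one or more times
"?[a-zA-Z_]+"?(?:,(?!$)|\.)?(?: (?!$)|$)
- a word with punctuation, followed by an optional space:
"?
- an optional opening quote
[a-zA-Z_]+
- a word, followed by
"?
- an optional closing quote, and
(?:,(?!$)|\.)?
- an optional comma or dot, where a comma must not be immediately followed by the end of the line
(?: (?!$)|$)
- a space, but only if the space is not immediately followed by the end of the line; otherwise the end of the line itself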
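To show how the meaning pattern could slot into the asker's validation loop, here is a minimal Python sketch. The names `is_valid_meaning` and `ask_meaning`, the prompt strings, and the `read` parameter are all assumptions made for this example; only the pattern comes from the answer.

```python
import re

# Pattern from the answer: words with optional double quotes, commas and dots.
MEANING_RE = re.compile(r'^(?:"?[a-zA-Z_]+"?(?:,(?!$)|\.)?(?: (?!$)|$))+$')


def is_valid_meaning(text):
    """Hypothetical helper: True if text satisfies the meaning pattern."""
    return MEANING_RE.match(text) is not None


def ask_meaning(read=input):
    """Keep asking until the reply is valid, mirroring the asker's while loop.

    `read` is any callable taking a prompt and returning a string
    (defaults to the built-in input).
    """
    meaning = read("Meaning: ")
    while not is_valid_meaning(meaning):
        meaning = read("Invalid meaning, try again: ")
    return meaning


if __name__ == "__main__":
    print(ask_meaning())
```

With this pattern a comma must be followed by a space and another word, and apostrophes (') are rejected, in line with the assumption stated in the answer.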