Question

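I am very new to regular expressions and am really stuck on the following expression. I am looking for regex code that allows the following combinations:

AA1A 1AA
AA 12
A 12
A 1

Requirements:

The string must not start with a digit
Uppercase letters only (A-Z)
Exactly 1 space is required at the predefined position (see the examples above)
Digits 0-9 are allowed

I am currently using the following string:

([A-Z]{1,2}|[A-Z0-9]{1,4})([ ]{1})([0-9A-Z]{1,3})

The problem with this is that it does not allow the AA1A 1AA string..

Any ideas?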

Answer 1

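Based on the specification, the examples, and the fact that you want the first example to match even when it is preceded by a space, it looks like you need a regex like this: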

^[ ]*([A-Z][A-Z0-9]{0,3})[ ]([A-Z0-9]{1,3})$

You can test it here.

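Note that ^ and $ were added to the regex. I have a hunch, though, that you are using the regex in some tool or function that implicitly assumes the regex has to match the whole line. Otherwise your original regex would have matched "AA1A 1AA" in the string " AA1A 1AA".
If that is the case, the ^ and $ are redundant and you can remove them.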
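To illustrate the difference between searching and whole-string matching, here is a minimal Python sketch. It assumes Python's re module stands in for the unknown tool: re.search plays the role of an unanchored match and re.fullmatch the role of an implicitly anchored one.

```python
import re

# The asker's original pattern, copied from the question.
ORIGINAL = r"([A-Z]{1,2}|[A-Z0-9]{1,4})([ ]{1})([0-9A-Z]{1,3})"

text = " AA1A 1AA"  # note the leading space

# re.search looks for the pattern anywhere in the string,
# so the leading space is simply skipped.
found = re.search(ORIGINAL, text)
print(found.group(0) if found else None)  # AA1A 1AA

# re.fullmatch requires the pattern to cover the entire string,
# which behaves like wrapping the pattern in ^...$.
anchored = re.fullmatch(ORIGINAL, text)
print(anchored)  # None
```

If the tool behaves like re.fullmatch, the leading space alone is enough to make the original pattern fail.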

Explanation:

^ :     // Matches the beginning of the string
        // or the beginning of a line if the multiline flag (m) is enabled. 

[ ]* :   // 0 or more spaces

[A-Z] : // an upper case ascii letter

[A-Z0-9]{0,3} : // between 0 and 3 upper case letters or digits

[ ] :   // A character class with a space. Which matches 1 space.   
        // You don't actually need to put a single character in a character class.
        // But here it's done to make the space stand out more.

[A-Z0-9]{1,3} : // Between 1 and 3 upper case letters or digits

$ :     // Matches the end of the string 
        // or the end of a line if the multiline flag (m) is enabled.

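The space in the middle is not put in a capture group (...). What would be the purpose? It is not as if you would later verify that the capture group really contains a space.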
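As a rough check of the anchored regex, the following Python sketch runs it over the examples from the question plus two strings that start with a digit. The helper name parse is made up for this illustration.

```python
import re

# The anchored pattern from this answer.
PATTERN = re.compile(r"^[ ]*([A-Z][A-Z0-9]{0,3})[ ]([A-Z0-9]{1,3})$")

samples = ["AA1A 1AA", "AA 12", "A 12", "A 1", " AA1A 1AA", "11 AB", "11 A"]

def parse(text):
    """Return the two captured parts, or None if the text is rejected."""
    m = PATTERN.match(text)
    return m.groups() if m else None

results = {s: parse(s) for s in samples}
for s, parts in results.items():
    print(repr(s), "->", parts)
```

The strings starting with a letter (with or without a leading space) yield two captured parts, while "11 AB" and "11 A" are rejected because the first character after the optional spaces must be a letter.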

If you want to search for these strings inside a longer string, you can use word boundaries instead.

\b([A-Z][A-Z0-9]{0,3})[ ]([A-Z0-9]{1,3})\b

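\b is a word boundary. It marks the transition between a word character [A-Za-z0-9_] and a non-word character. It is useful for making sure that your word is preceded or followed by whitespace, or sits at the start or end of the line.

For example, if you have a string like "ABC DE", then the regex /[A-Z]{2}/g will match "AB" and "DE". With word boundaries, /\b[A-Z]{2}\b/g will only match "DE" and not a part of a word such as "AB".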
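The same behaviour can be reproduced in Python, where re.findall plays the role of the /g flag. The sentence used in the second half is a made-up example, not taken from the question.

```python
import re

# Without word boundaries, part of "ABC" is matched too.
plain = re.findall(r"[A-Z]{2}", "ABC DE")
print(plain)  # ['AB', 'DE']

# With word boundaries, only the standalone two-letter word matches.
bounded = re.findall(r"\b[A-Z]{2}\b", "ABC DE")
print(bounded)  # ['DE']

# The word-boundary version of the answer's regex, searched in longer text.
WORD_BOUNDED = re.compile(r"\b([A-Z][A-Z0-9]{0,3})[ ]([A-Z0-9]{1,3})\b")
sentence = "codes: AA1A 1AA and A 1 are valid"  # hypothetical input
pairs = WORD_BOUNDED.findall(sentence)
print(pairs)  # [('AA1A', '1AA'), ('A', '1')]
```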

Answer 2

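You just need to refine the first group so that it handles both:

two letters (AA)
two letters followed by two letters or digits (AA1A)

Change from (demo here):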

/([A-Z]{1,2}|[A-Z0-9]{1,4})([ ]{1})([0-9A-Z]{1,3})/g

to

/([A-Z]{1,2}|[A-Z]{2}[A-Z0-9]{2})([ ]{1})([0-9A-Z]{1,3})/g

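The matches change from:

AA1A 1AA  (matched)
AA 12  (matched)
11 AB  (matched)
A 12  (matched)
11 A  (matched)
A 1  (matched)

to:

AA1A 1AA  (matched)
AA 12  (matched)
11 AB  (not matched)
A 12  (matched)
11 A  (not matched)
A 1  (matched)

(Note that 11 AB and 11 A are no longer matched.)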
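The effect of the change can be checked with a short Python sketch that runs both patterns over the six test lines. The helper matched_text is a name invented for this illustration, and pattern.search is assumed to be a fair stand-in for how the demo applies the regex to each line.

```python
import re

OLD = re.compile(r"([A-Z]{1,2}|[A-Z0-9]{1,4})([ ]{1})([0-9A-Z]{1,3})")
NEW = re.compile(r"([A-Z]{1,2}|[A-Z]{2}[A-Z0-9]{2})([ ]{1})([0-9A-Z]{1,3})")

lines = ["AA1A 1AA", "AA 12", "11 AB", "A 12", "11 A", "A 1"]

def matched_text(pattern, line):
    """Return the matched substring, or None when nothing matches."""
    m = pattern.search(line)
    return m.group(0) if m else None

old_hits = [matched_text(OLD, line) for line in lines]
new_hits = [matched_text(NEW, line) for line in lines]

for line, old, new in zip(lines, old_hits, new_hits):
    print(f"{line!r:12} old={old!r:12} new={new!r}")
```

With the old pattern every line matches. With the new one, the two lines that begin with digits produce None, because the first group can no longer start with a digit.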

Regular expression IF / Then expression for required characters
