Capturing incorrectly formatted text patterns with a regular expression

Date: 2019-10-09 16:43:30

Tags: regex powershell

I have a number of empty folders whose names start with "NA -". The correct format for these names should be

  

"NA - FolderName"

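Unfortunately, not everyone follows this naming convention. I am trying to write a regex that captures every variant of the "NA -" prefix that is incorrectly formatted or repeated.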

The regex I have come up with so far is

  

N?A?-((?![0-9.A-MO-Z] [B-Z] |)|.?N?A?-)

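These are the folder names I am testing with. The ones with "Incorrect" in the name have a badly formatted "NA -" prefix, and those are the ones I want to capture: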

NA -IncorrectFolderName1
N A-1. IncorrectFolderName2
N A- 1. IncorrectFolderName3
NA-IncorrectFolderName4
N A -1.IncorrectFolderName5
NA -NA -IncorrectFolderName6
NA - NA -IncorrectFolderName7
N A - N A - IncorrectFolderName8
N A - NA - IncorrectFolderName9

NA - CorrectFolderName1
NA - 1CorrectFolderName2
NA - 1. CorrectFolderName3

See the code here to try it out: https://regex101.com/r/9Bzo43/6

The only incorrect format my regex fails to capture is:

N A- 1. IncorrectFolderName3

The regex should not capture folders that have no prefix or a correctly formatted "NA - " prefix, such as the following:

RegularFolderName1
NA - CorrectFolderName1
NA - 1CorrectFolderName2
NA - 1. CorrectFolderName3

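I have been working on this regex and I am close, but I cannot figure out how to write it so that it finds all of the incorrectly formatted patterns. Any help would be greatly appreciated.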

2 Answers:

Answer 0: (score: 1)

My guess is that maybe

(?:(?:N\s*A\s*)-\s*){2}|N\s+A\s*-\s*(?=\d+\.\s*)|NA-|NA\s+-(?=\S)

or a similar expression built from alternations would be fairly easy to write and debug.
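To check the alternation above against the folder names from the question, a small PowerShell harness along these lines may help. This is only a sketch: the variable names are illustrative, and `-cmatch` is used on the assumption that the prefix should be matched case-sensitively.

```powershell
# Pattern from the answer above, unchanged.
$pattern = '(?:(?:N\s*A\s*)-\s*){2}|N\s+A\s*-\s*(?=\d+\.\s*)|NA-|NA\s+-(?=\S)'

# Sample folder names from the question.
$names = @(
    'NA -IncorrectFolderName1'
    'N A-1. IncorrectFolderName2'
    'N A- 1. IncorrectFolderName3'
    'NA-IncorrectFolderName4'
    'N A -1.IncorrectFolderName5'
    'NA -NA -IncorrectFolderName6'
    'NA - NA -IncorrectFolderName7'
    'N A - N A - IncorrectFolderName8'
    'N A - NA - IncorrectFolderName9'
    'RegularFolderName1'
    'NA - CorrectFolderName1'
    'NA - 1CorrectFolderName2'
    'NA - 1. CorrectFolderName3'
)

# Keep only the names in which the pattern finds a badly formatted prefix.
# -cmatch is the case-sensitive variant of -match.
$flagged = @($names | Where-Object { $_ -cmatch $pattern })

$flagged
```

With these inputs, only the nine "Incorrect" names should be listed. Note that a name such as `N A - Folder` (split "N A", correct spacing around the dash, no leading number) is not matched by any branch, so another alternative may be needed if such names occur.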


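I am not sure what you do or do not want to capture at the end. Any trailing part that you do not want to consume or capture can simply be placed in a positive lookahead (?=), which is a zero-width assertion, for example: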

NA\s+-(?=\S)
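As a quick illustration of what "zero-width" means here, the following sketch compares the matched text with and without the lookahead. The sample name is taken from the question; the variable names are illustrative.

```powershell
$name = 'NA -IncorrectFolderName1'

# With the lookahead, the character after the dash is checked but not consumed.
$withLookahead = [regex]::Match($name, 'NA\s+-(?=\S)').Value

# Without it, that same character becomes part of the match.
$withoutLookahead = [regex]::Match($name, 'NA\s+-\S').Value

'[{0}]' -f $withLookahead      # [NA -]
'[{0}]' -f $withoutLookahead   # [NA -I]
```

Both patterns match the same names; they differ only in how much text ends up in the match, which matters when the match is later replaced or extracted.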

RegEx Circuit

jex.im visualizes regular expressions.


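If you wish to simplify, modify, or explore the expression, it is explained in the top-right panel of regex101.com. If you like, you can also watch there how it matches against some sample inputs.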


Answer 1: (score: 0)

Using your examples, this works:

$foldernames =  'NA -IncorrectFolderName1',
                'N A-1. IncorrectFolderName2',
                'N A- 1. IncorrectFolderName3',
                'NA-IncorrectFolderName4',
                'N A -1.IncorrectFolderName5',
                'NA -NA -IncorrectFolderName6',
                'NA - NA -IncorrectFolderName7',
                'N A - N A - IncorrectFolderName8',
                'N A - NA - IncorrectFolderName9',
                'RegularFolderName1',
                'NA - CorrectFolderName1',
                'NA - 1CorrectFolderName2',
                'NA - 1. CorrectFolderName3'

$newNames = $foldernames | ForEach-Object { $_ -replace '^(?:(N\s*A\s*-\s*))+(.+)', 'NA - $2' }

$newNames

Result:

NA - IncorrectFolderName1
NA - 1. IncorrectFolderName2
NA - 1. IncorrectFolderName3
NA - IncorrectFolderName4
NA - 1.IncorrectFolderName5
NA - IncorrectFolderName6
NA - IncorrectFolderName7
NA - IncorrectFolderName8
NA - IncorrectFolderName9
RegularFolderName1
NA - CorrectFolderName1
NA - 1CorrectFolderName2
NA - 1. CorrectFolderName3
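Because correctly formatted names come out of the replacement unchanged, the same pattern can also be used to detect the badly formatted ones: a name needs fixing exactly when normalizing it changes it. The sketch below assumes that approach; the variable names and the `C:\Temp\Folders` path in the commented-out part are placeholders, not something from the question.

```powershell
# Same pattern and replacement as in the answer above.
$pattern     = '^(?:(N\s*A\s*-\s*))+(.+)'
$replacement = 'NA - $2'

$names = @(
    'NA -IncorrectFolderName1'
    'N A-1. IncorrectFolderName2'
    'N A- 1. IncorrectFolderName3'
    'NA-IncorrectFolderName4'
    'N A -1.IncorrectFolderName5'
    'NA -NA -IncorrectFolderName6'
    'NA - NA -IncorrectFolderName7'
    'N A - N A - IncorrectFolderName8'
    'N A - NA - IncorrectFolderName9'
    'RegularFolderName1'
    'NA - CorrectFolderName1'
    'NA - 1CorrectFolderName2'
    'NA - 1. CorrectFolderName3'
)

# A name is considered badly formatted when the normalized form differs.
# -cne compares case-sensitively, so a change in letter case also counts.
$toFix = @($names | Where-Object { $_ -cne ($_ -replace $pattern, $replacement) })

$toFix

# For real folders, something like the following could preview the renames.
# The path is a placeholder, and -WhatIf only reports what would be renamed.
# Get-ChildItem -Path 'C:\Temp\Folders' -Directory |
#     Where-Object { $_.Name -cne ($_.Name -replace $pattern, $replacement) } |
#     Rename-Item -NewName { $_.Name -replace $pattern, $replacement } -WhatIf
```

With the sample names, `$toFix` should contain the nine "Incorrect" entries and none of the others.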

Regex details:

(?:              Match the regular expression below
   (             Match the regular expression below and capture its match into backreference number 1
      N          Match the character “N” literally
      \s         Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
         *       Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
      A          Match the character “A” literally
      \s         Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
         *       Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
      -          Match the character “-” literally
      \s         Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
         *       Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   )
)+               Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(                Match the regular expression below and capture its match into backreference number 2
   .             Match any single character that is not a line break character
      +          Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
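The breakdown above does not list the leading `^`, which anchors the match at the start of the name. It also shows that capture group 1 is never used in the replacement, so the pattern could probably be simplified. The following sketch drops that inner group and uses a named group instead; the group name `rest` is an arbitrary choice, and the comparison only covers a few of the sample names.

```powershell
$original   = '^(?:(N\s*A\s*-\s*))+(.+)'
$simplified = '^(?:N\s*A\s*-\s*)+(?<rest>.+)'

$samples = @(
    'N A - NA - IncorrectFolderName9'
    'NA -IncorrectFolderName1'
    'NA - 1. CorrectFolderName3'
    'RegularFolderName1'
)

# Run both patterns on each sample and report whether the results agree.
$results = foreach ($s in $samples) {
    $a = $s -replace $original,   'NA - $2'
    $b = $s -replace $simplified, 'NA - ${rest}'
    '{0} -> {1} (same result: {2})' -f $s, $b, ($a -ceq $b)
}

$results
```

The replacement strings are single-quoted so that PowerShell passes `$2` and `${rest}` to the regex engine instead of expanding them as variables.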