Using multiple regex capture groups to split a string

Time: 2014-12-02 20:53:40

Tags: regex capture-group

I have a file that looks like this...

"1234567123456","V","0","0","BLAH","BLAH","BLAH","BLAH"
"1234567123456","D","TEST1 "
"1234567123456","D","TEST 2~TEST3"
"1234567123456","R","TEST4~TEST5"
"1234567123457","V","0","0","BLAH","BLAH","BLAH","BLAH"
"1234567123457","D","TEST 6"
"1234567123457","D","TEST7"
"1234567123457","R","TEST 8~TEST9~TEST,10"

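What I am trying to do is split the D and R lines, where ~ is used as the delimiter: each ~-separated value should end up on its own line, with the ID and record type repeated. So the end result would be...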

"1234567123456","V","0","0","BLAH","BLAH","BLAH","BLAH"
"1234567123456","D","TEST1 "
"1234567123456","D","TEST 2"
"1234567123456","D","TEST3"
"1234567123456","R","TEST4"
"1234567123456","R","TEST5"
"1234567123457","V","0","0","BLAH","BLAH","BLAH","BLAH"
"1234567123457","D","TEST 6"
"1234567123457","D","TEST7"
"1234567123457","R","TEST 8"
"1234567123457","R","TEST9"
"1234567123457","R","TEST,10"

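I am using regular expressions in applications such as TextPad and Notepad++. I have not figured out how to use a regex written like /.+/g, because these applications do not accept the forward-slash delimiters, so I do not think I can use things like the global modifier. I currently have the following regex...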

//In a program like Textpad/Notepad++
<FIND> "(.{13})","D","([^~]*)~(.*)
<REPLACE> "\1","D","\2"\n"\1","D","\3

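Now, if I run a find and replace with the above parameters a few times, it works fine (for D lines only). The problem is that the number of lines to be generated is unknown. For example...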

"1234567123456","D","TEST1~TEST2~TEST3~TEST4~TEST5"
"1234567123457","D","TEST1~TEST2~TEST3"
"1234567123458","D","TEST1~TEST2"
"1234567123459","D","TEST1~TEST2~TEST3~TEST4"
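
For illustration, the "run it a few times" approach can be automated outside the editor. The following is only a minimal sketch in Python, not something TextPad or Notepad++ does by itself: it reapplies the same find/replace until no line changes, which handles any number of ~ separators. The (D|R) group and the function name split_until_stable are assumptions added for this sketch; the rest mirrors the FIND and REPLACE expressions above.

```python
import re

# FIND pattern from the question, with two assumed changes for this sketch:
# "D" is generalised to (D|R), and [^~\n] keeps a match on a single line.
FIND = re.compile(r'^"(.{13})","(D|R)","([^~\n]*)~(.*)$', re.MULTILINE)

# Same shape as the REPLACE string above; \4 still carries the closing quote.
REPLACE = r'"\1","\2","\3"\n"\1","\2","\4'


def split_until_stable(text):
    """Reapply the replacement until a pass changes nothing."""
    while True:
        new_text, count = FIND.subn(REPLACE, text)
        if count == 0:
            return new_text
        text = new_text


if __name__ == "__main__":
    sample = '"1234567123456","D","TEST1~TEST2~TEST3~TEST4~TEST5"'
    print(split_until_stable(sample))
```

Each pass peels one value off the front of every line that still contains a ~, so a line with N values needs N - 1 passes.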

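I was hoping to use multiple capture groups to accomplish this. I found this page, which discusses the common mistake of confusing repeating a capture group with capturing a repeated group. I need to capture a repeated group. For some reason, I cannot get mine to work. Does anyone have any ideas?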
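
To make the distinction concrete, here is a small sketch using Python's re module. It shows that module's behaviour only; I am assuming, not confirming, that the engines in TextPad and Notepad++ treat repeated groups the same way. The sample value and variable names are made up for the example.

```python
import re

value = "TEST1~TEST2~TEST3"

# Repeating a capture group: group 1 is overwritten on every iteration,
# so only the last piece is still available after the match.
repeated_group = re.fullmatch(r'(?:~?([^~]+))+', value)
print(repeated_group.group(1))

# Capturing a repeated group: the group wraps the whole repetition,
# so it holds the entire run, but still as one string, not as pieces.
captured_repeat = re.fullmatch(r'((?:~?[^~]+)+)', value)
print(captured_repeat.group(1))

# Neither form yields one backreference per piece; splitting does.
pieces = value.split("~")
print(pieces)
```

Either way, a single replacement string has a fixed number of backreferences, so by itself it cannot emit a variable number of output lines.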

Note: if I could also get rid of leading and trailing spaces, e.g. "1234567123456","D"," TEST1 " ending up as "1234567123456","D","TEST1", that would be even better, but it is not necessary.
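
If a script is an option instead of an editor, the split and the optional trimming can be done without any regex. This is a hedged sketch in Python using the standard csv module. The function name expand_rows and its parameters are hypothetical, and it assumes that D and R records always have exactly three fields, as in the samples above.

```python
import csv
import io


def expand_rows(text, delimiter="~", strip=True):
    """Write one output row per delimiter-separated value on D and R rows.

    Assumption: D/R rows have exactly three fields (id, type, values).
    All other rows are passed through unchanged.
    """
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    # QUOTE_ALL reproduces the fully quoted style of the input file.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in reader:
        if len(row) == 3 and row[1] in ("D", "R"):
            for part in row[2].split(delimiter):
                value = part.strip() if strip else part
                writer.writerow([row[0], row[1], value])
        else:
            writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = '"1234567123457","R","TEST 8~TEST9~TEST,10"\n'
    print(expand_rows(sample), end="")
```

Passing strip=False keeps the original spacing, for example the trailing space in "TEST1 ".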


0 answers:

No answers