JavaScript regex for comma-separated text

Time: 2016-09-24 13:02:45

Tags: javascript regex regex-negation regex-lookarounds

I have this string:

remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820

I want to match and extract the comma-separated strings.

The result should be:

MATCH 1 
'remote:City|Vestavia Hills,AL' 
MATCH 2 
'remote:Citystate|Vestavia Hills' 
MATCH 3 
'395b5231539390675a7abe0751fc4820' 
MATCH 4 
'remote:City|Vestavia Hills,AL' 
MATCH 5 
'remote:Citystate|Vestavia Hills' 
MATCH 6 
'395b5231539390675a7abe0751fc4820'

I have this regex:

(remote:[a-zA-Z]+\|[^\,]+|[a-f0-9]{32})

But cities that carry a state such as "AL" (separated from the city by a comma) are split incorrectly.
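To see the problem concretely, here is a minimal sketch that runs the regex above against the sample string. The g flag and the variable names are assumptions; the question does not show how the pattern is applied.

```javascript
// Input string from the question.
var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// The original pattern; the g flag is assumed so that match() returns every match.
var original = /(remote:[a-zA-Z]+\|[^\,]+|[a-f0-9]{32})/g;
var found = str.match(original);

// [^\,]+ stops at the first comma, so ",AL" never becomes part of a match.
console.log(found[0]); // remote:City|Vestavia Hills
console.log(found);
```

The state is cut off because the character class excludes every comma, including the one that belongs inside the item.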

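Possible solution:

I am considering something like this - remote:[a-zA-Z]+\|.* - where the match ends when the comma is followed by either another item of the same form (remote:[a-zA-Z]+\|.*) or an md5 hash ([a-f0-9]{32},?).

Here is my regex tester link: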

https://regex101.com/r/rP8iJ2/1

4 answers:

Answer 0: (score: 1)

You can tweak your regex into this lookahead-based regex:

/(?:^|,)(.+?(?=,(?:[a-f0-9]{32}|remote:)|$))/igm

This gives you 6 matches, with the wanted text in captured group #1 of each.

Updated RegEx Demo

(?:^|,)                 # Match line start or comma
(                       # captured group #1 start
   .+?                  # match 1 or more of any character (lazy)
   (?=                  # lookahead start
      ,                 # match comma followed by
      (?:               # non-capturing group start
         [a-f0-9]{32}   # match hex digit 32 times
         |              # OR
         remote:        # match literal "remote:"
      )                 # non-capturing group end
      |                 # OR
      $                 # line end
   )                    # lookahead end
)                       # capturing group #1 end
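
As a usage sketch for the pattern explained above, the following loop collects captured group #1 from every match. The variable names and the exec() loop are assumptions for illustration; the answer itself only provides the regex.

```javascript
// Input string from the question.
var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// Pattern from this answer, unchanged.
var re = /(?:^|,)(.+?(?=,(?:[a-f0-9]{32}|remote:)|$))/igm;

// With the g flag, each exec() call resumes at re.lastIndex and
// returns null once no further match exists.
var results = [];
var m;
while ((m = re.exec(str)) !== null) {
  results.push(m[1]); // group #1 excludes the leading comma
}
console.log(results);
```

Reading group #1 rather than the whole match matters here, because the whole match includes the comma consumed by (?:^|,).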

Answer 1: (score: 1)

([a-f0-9]{32}|remote:[^|]+\|[^,]+(?:,[A-Z]{2})?),?

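This one is easier to understand: I gave the group a special optional suffix, so the comma inside an item can only be followed by 2 uppercase letters.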

https://regex101.com/r/rP8iJ2/3
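
A minimal sketch of how this pattern could be applied in JavaScript. The g flag, the exec() loop, and the variable names are assumptions; the answer shows only the pattern.

```javascript
// Input string from the question.
var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// Pattern from this answer; the g flag is added (assumption) to iterate over all matches.
var re = /([a-f0-9]{32}|remote:[^|]+\|[^,]+(?:,[A-Z]{2})?),?/g;

var items = [];
var m;
while ((m = re.exec(str)) !== null) {
  items.push(m[1]); // group 1 leaves out the trailing separator comma
}
console.log(items);
```

This relies on the state always being exactly two uppercase letters; an item containing any other comma would still be split.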

Answer 2: (score: 1)

With a single regex, you can do the following:

var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820",
    arr = str.match(/(r.+?|[\da-f]{32})(?=,?(remote|[\da-f]{32}|$))/g);
console.log(arr);

Answer 3: (score: 0)

One option is to use JavaScript's split:

var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";
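// Split on the literal "remote"; aux[0] is the empty string before the first "remote".
var aux = str.split("remote");
var res = [];
for (var i = 1; i < aux.length; i++) {
	// Put back the "remote" prefix that split() removed.
	res.push("remote" + aux[i]);
}
// Note: each entry keeps everything up to the next "remote", so the hashes
// and separating commas stay attached (4 entries, not the 6 requested).
console.log(res);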
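
The split approach above keeps each hash attached to the preceding item. As a hedged variation that is not part of this answer, split() can instead be given a regex that matches only the commas followed by "remote:" or by 32 hex characters. The variable names are assumptions for illustration.

```javascript
// Input string from the question.
var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// Split only on commas that start a new item. The lookahead consumes nothing,
// so "remote:" and the hash stay in the resulting pieces.
var parts = str.split(/,(?=remote:|[a-f0-9]{32})/);
console.log(parts);
```

The comma before "AL" is left alone because "AL" is neither "remote:" nor a run of lowercase hex digits.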