I have this string:
remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820
I want to match and extract the comma-separated fields from it.
The result should be:
MATCH 1
'remote:City|Vestavia Hills,AL'
MATCH 2
'remote:Citystate|Vestavia Hills'
MATCH 3
'395b5231539390675a7abe0751fc4820'
MATCH 4
'remote:City|Vestavia Hills,AL'
MATCH 5
'remote:Citystate|Vestavia Hills'
MATCH 6
'395b5231539390675a7abe0751fc4820'
I have this regular expression:
(remote:[a-zA-Z]+\|[^\,]+|[a-f0-9]{32})
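But cities that carry the state "AL" (which follows the city after a comma) are incorrectly split apart.
Possible solution:
I am thinking of doing something like remote:[a-zA-Z]+\|.* and ending the match where the comma is followed either by the same pattern (remote:[a-zA-Z]+\|.*) or by an MD5 hash ([a-f0-9]{32},?).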
Answer 0 (score: 1)
You can tweak your regex into this lookahead-based one:
/(?:^|,)(.+?(?=,(?:[a-f0-9]{32}|remote:)|$))/igm
This gives you 6 matches, with the wanted field in captured group #1 of each match.
(?:^|,) # Match line start or comma
( # captured group #1 start
.+? # match 1 or more of any character (lazy)
(?= # lookahead start
, # match comma followed by
(?: # non-capturing group start
[a-f0-9]{32} # match hex digit 32 times
| # OR
remote: # match literal "remote:"
) # non-capturing group end
| # OR
$ # line end
) # lookahead end
) # capturing group #1 end
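As a minimal sketch of how the pattern above could be applied in JavaScript (the variable names input, pattern and fields are made up for illustration, and String.prototype.matchAll assumes an ES2020 runtime), the fields can be read from capture group 1 of each match:

```javascript
// Sample input copied from the question.
const input = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// The lookahead-based pattern from this answer, unchanged.
const pattern = /(?:^|,)(.+?(?=,(?:[a-f0-9]{32}|remote:)|$))/igm;

// A full match may start with the separating comma,
// so keep only capture group 1 of every match.
const fields = Array.from(input.matchAll(pattern), (m) => m[1]);

console.log(fields.length); // 6
console.log(fields);
```

Reading m[0] instead of m[1] would include the leading comma for every field except the first, which is why the capture group matters here.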
Answer 1 (score: 1)
([a-f0-9]{32}|remote:[^|]+\|[^,]+(?:,[A-Z]{2})?),?
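This one is easier to understand: I gave the group a special optional suffix, so the comma inside a field can only be followed by 2 uppercase letters.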
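A rough usage sketch for this pattern in JavaScript, assuming the g flag is added so that every field is found (the names input, pattern and parts are illustrative only):

```javascript
// Sample input copied from the question.
const input = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// The pattern from this answer; the g flag is an addition for repeated matching.
const pattern = /([a-f0-9]{32}|remote:[^|]+\|[^,]+(?:,[A-Z]{2})?),?/g;

// The trailing ",?" consumes the separator, so the field itself is in group 1.
const parts = Array.from(input.matchAll(pattern), (m) => m[1]);

console.log(parts.length); // 6
console.log(parts);
```

This relies on the state code being exactly two uppercase ASCII letters. Neither "remote:" nor a lowercase hex hash can start with two uppercase letters, so the optional suffix does not swallow the beginning of the next field in this input.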
Answer 2 (score: 1)
With a single regex, you can do the following:
var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820",
arr = str.match(/(r.+?|[\da-f]{32})(?=,?(remote|[\da-f]{32}|$))/g);
console.log(arr);
Answer 3 (score: 0)
One option is to use JavaScript's split:
var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";
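// Splitting on the bare word "remote" leaves trailing commas and keeps each hash
// glued to the preceding field. Split instead on every comma that is immediately
// followed by "remote:" or a 32-character hex hash, so the comma inside
// "Vestavia Hills,AL" is left alone.
var res = str.split(/,(?=remote:|[a-f0-9]{32})/);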
console.log(res);