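I want to split a string into substrings of consecutive characters that share some property: specifically, being alphanumeric (though I'm interested in a general solution).
E.g. "string#example[is-like="html"].selectors"
would match [string, #, example, [, is, -, like, =", html, "]., selectors]
Any idea how to do this with a regex? Thanks!
Edit: I'll be using PHP's regex engine via preg_match_all.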
Answer 0 (score: 2)
\w+|\W+
One or more consecutive word characters OR one or more consecutive non-word characters.
Output:
Array
(
[0] => string
[1] => #
[2] => example
[3] => [
[4] => is
[5] => -
[6] => like
[7] => ="
[8] => html
[9] => "].
[10] => selectors
)
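For readers who want to try the pattern outside PHP, here is a minimal Python sketch of the same idea. It assumes Python's `re.findall` as a stand-in for `preg_match_all`, and the function name `split_runs` is made up for illustration. Note that `\w` also matches the underscore, so "word character" is slightly broader than "alphanumeric".

```python
import re

def split_runs(text):
    # Each match is either a run of word characters (\w+)
    # or a run of non-word characters (\W+); together they cover the whole string.
    return re.findall(r"\w+|\W+", text)

parts = split_runs('string#example[is-like="html"].selectors')
print(parts)
# ['string', '#', 'example', '[', 'is', '-', 'like', '="', 'html', '"].', 'selectors']
```

Because every character is either a word or a non-word character, joining the pieces gives back the original string.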
Answer 1 (score: 1)
Use a word boundary anchor, e.g. in C#:
splitArray = Regex.Split(subjectString, @"\b");
If you want to avoid empty matches at the start/end of the string, combine it with lookaround assertions:
splitArray = Regex.Split(subjectString, @"(?<!^)\b(?!$)");
Explanation:
(?<!^) # Assert we're not at the start of the string
\b # Match a position between an alnum and a non-alnum character
(?!$) # Assert we're not at the end of the string, either
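The same split can be sketched in Python for comparison. This assumes Python 3.7 or later, where `re.split` can split on zero-width matches; the helper name `split_on_word_boundaries` is hypothetical.

```python
import re

# Word boundary that is neither at the very start nor at the very end of the string.
INNER_BOUNDARY = re.compile(r"(?<!^)\b(?!$)")

def split_on_word_boundaries(text):
    # Splitting on a zero-width match keeps every character in the output pieces.
    return INNER_BOUNDARY.split(text)

print(split_on_word_boundaries('string#example[is-like="html"].selectors'))
# ['string', '#', 'example', '[', 'is', '-', 'like', '="', 'html', '"].', 'selectors']
```

Without the two lookarounds, a plain `\b` split on this input would produce an empty string at each end, because the input starts and ends with a word character.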
A general solution looks like this:
Suppose you want to split between digits (\d) and non-digits (\D). Then you can use
splitArray = Regex.Split(subjectString, @"(?<=\d)(?=\D)|(?<=\D)(?=\d)");
Explanation:
(?<=\d) # Assert that the previous character is a digit
(?=\D) # and the next character is a non-digit.
| # Or:
(?<=\D) # Assert that the previous character is a non-digit
(?=\d) # and the next character is a digit.
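As a rough illustration of the general approach, the lookaround pattern above can be tried in Python as well. This is a sketch assuming Python 3.7+ (for zero-width splits); `split_digit_runs` is an illustrative name, not part of any library. To split on a different property, swap `\d` and `\D` for another character class and its negation.

```python
import re

# A position where a digit is followed by a non-digit, or the other way round.
DIGIT_BOUNDARY = re.compile(r"(?<=\d)(?=\D)|(?<=\D)(?=\d)")

def split_digit_runs(text):
    # Lookarounds consume no characters, so nothing is lost at the split points.
    return DIGIT_BOUNDARY.split(text)

print(split_digit_runs("abc123def45"))
# ['abc', '123', 'def', '45']
```

Since both alternatives require a character on each side, the pattern can never match at the start or end of the string, so no extra guards against empty pieces are needed.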