I need to split the string below using spaces as the delimiter, but any spaces inside quotes should be preserved.
research library "not available" author:"Bernard Shaw"
into
research
library
"not available"
author:"Bernard Shaw"
I want to do this in C#, and I have this regex from another SO post: @"(?<="")|\w[\w\s]*(?="")|\w+|""[\w\s]*"""
It splits the string into
research
library
"not available"
author
"Bernard Shaw"
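Unfortunately, that does not meet my exact requirement: author:"Bernard Shaw" should stay together as a single token.
I'm looking for any regular expression that would do the trick.
Any help is appreciated.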
Answer 0: (score: 27)
As long as there are no escaped quotes inside the quoted strings, the following should work:
splitArray = Regex.Split(subjectString, "(?<=^[^\"]*(?:\"[^\"]*\"[^\"]*)*) (?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");
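This regex splits on a space character only if it is both preceded and followed by an even number of quotes.
Here is the regex explained, without all those escaped quotes: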
(?<= # Assert that it's possible to match this before the current position (positive lookbehind):
^ # The start of the string
[^"]* # Any number of non-quote characters
(?: # Match the following group...
"[^"]* # a quote, followed by any number of non-quote characters
"[^"]* # the same
)* # ...zero or more times (so 0, 2, 4, ... quotes will match)
) # End of lookbehind assertion.
[ ] # Match a space
(?= # Assert that it's possible to match this after the current position (positive lookahead):
(?: # Match the following group...
[^"]*" # see above
[^"]*" # see above
)* # ...zero or more times.
[^"]* # Match any number of non-quote characters
$ # Match the end of the string
) # End of lookahead assertion
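To see the split in action, here is a minimal sketch that applies the regex above to the string from the question. The class name `QuoteAwareSplit` and the method name `Split` are hypothetical and chosen only for illustration. As the answer notes, it assumes the quoted strings contain no escaped quotes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, not part of the original answer.
public static class QuoteAwareSplit
{
    // Matches a space only when an even number of quotes precedes it
    // and an even number of quotes follows it.
    private const string Pattern =
        "(?<=^[^\"]*(?:\"[^\"]*\"[^\"]*)*) (?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    public static string[] Split(string subjectString)
    {
        // The pattern has no capturing groups, so Regex.Split returns
        // only the pieces between the matched spaces.
        return Regex.Split(subjectString, Pattern);
    }

    public static void Main()
    {
        string input = "research library \"not available\" author:\"Bernard Shaw\"";
        foreach (string part in Split(input))
        {
            Console.WriteLine(part);
        }
    }
}
```

For the input from the question this should print the four desired tokens, one per line. The variable-length lookbehind relies on the .NET regex engine; many other engines reject it.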
Answer 1: (score: 3)
Here you go:
C#:
Regex.Matches(subject, @"([^\s]*""[^""]+""[^\s]*)|\w+")
Regex:
([^\s]*\"[^\"]+\"[^\s]*)|\w+
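A short sketch of how this matching approach might be used, collecting each match into a list. The class name `QuoteAwareMatch` and the method name `Tokens` are hypothetical. One caveat to keep in mind: because the fallback alternative is `\w+`, an unquoted token containing punctuation, such as foo-bar, would come back as two separate matches.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, not part of the original answer.
public static class QuoteAwareMatch
{
    public static List<string> Tokens(string subject)
    {
        var tokens = new List<string>();
        // First alternative: a run of non-space characters containing a quoted
        // section (spaces allowed inside the quotes). Second: a plain word.
        foreach (Match m in Regex.Matches(subject, @"([^\s]*""[^""]+""[^\s]*)|\w+"))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }

    public static void Main()
    {
        string input = "research library \"not available\" author:\"Bernard Shaw\"";
        foreach (string token in Tokens(input))
        {
            Console.WriteLine(token);
        }
    }
}
```

Unlike the split-based answer, this one extracts the tokens directly instead of locating the separators, so it needs no lookbehind.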