I need to split several strings in an array on connecting words, i.e. on, in, from, etc.
string sampleString = "what was total sales for pencils from Japan in 1999";
Desired result:
what was total sales
for pencils
from Japan
in 1999
I know how to split a string on a single word, but not on several words:
string[] stringArray = sampleString.Split(new string[] {"of"}, StringSplitOptions.None);
Any suggestions?
Answer 0: (score: 5)
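For this particular scenario, you can do this with a regular expression.
You have to use something called a lookahead pattern, because otherwise the words you split on are removed from the result.
Here is a small LINQPad program that demonstrates: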
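void Main()
{
    string sampleString = "what was total sales for pencils from Japan in 1999";
    // The second \b sits inside the lookahead, after the word list, so only
    // whole words match ("for" matches, "fortune" does not).
    Regex.Split(sampleString, @"\b(?=(?:of|for|in|from)\b)").Dump();
}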
Output:
what was total sales
for pencils
from Japan
in 1999
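To make the point about the lookahead concrete, here is a minimal sketch that splits the same string twice: once with a pattern that consumes the word, once with a lookahead. The class and method names (`LookaheadDemo`, `SplitConsuming`, `SplitKeeping`) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Consuming pattern: the matched word itself is the separator,
    // so Regex.Split drops it from the result.
    public static string[] SplitConsuming(string input) =>
        Regex.Split(input, @"\b(?:of|for|in|from)\b");

    // Lookahead pattern: the match is zero-length, so nothing is removed
    // and each word stays at the start of the piece that follows it.
    public static string[] SplitKeeping(string input) =>
        Regex.Split(input, @"\b(?=(?:of|for|in|from)\b)");
}
```

With the sample string, `SplitConsuming` should return pieces such as " pencils " with the prepositions gone, while `SplitKeeping` returns "for pencils " and so on. Note that the pieces keep their surrounding spaces; call `Trim()` on each one if that matters.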
However, as I said in the comments, it will be tripped up by place names that contain any of the words you split on, so:
string sampleString = "what was total sales for pencils from the Isle of Islay in 1999";
Regex.Split(sampleString, @"\b(?=(?:of|for|in|from)\b)").Dump();
Output:
what was total sales
for pencils
from the Isle
of Islay
in 1999
The regular expression can be rewritten like this to make it more expressive for future maintenance:
Regex.Split(sampleString, @"
    \b          # Must be a word boundary here,
                # so we don't split inside a word like 'before'
    (?=         # Lookahead group: has to match, but is zero-length (nothing is consumed)
        (?:     # List of words, separated by the OR operator, |
            of
            |for
            |in
            |from
        )
        \b      # Word boundary after the word, so 'fortune' or 'India' do not match
    )", RegexOptions.IgnorePatternWhitespace).Dump();
You may also want to add RegexOptions.IgnoreCase to the options, so that it matches "Of", "OF", and so on.
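As a rough sketch of how the pieces above could be combined, the helper below builds the lookahead pattern from a list of words, applies RegexOptions.IgnoreCase, and trims each piece. The names `PrepositionSplitter` and `SplitBefore` are hypothetical, and the trimming and removal of empty pieces are my own assumptions rather than something the original answer does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PrepositionSplitter
{
    // Hypothetical helper: splits 'input' in front of each whole word in 'words',
    // ignoring case. Assumes 'words' is non-empty and contains plain words.
    public static string[] SplitBefore(string input, IEnumerable<string> words)
    {
        // Escape each word so any regex metacharacters are taken literally.
        string alternation = string.Join("|", words.Select(w => Regex.Escape(w)));

        // \b(?=(?:w1|w2|...)\b): zero-length match at the start of a whole listed word.
        string pattern = @"\b(?=(?:" + alternation + @")\b)";

        return Regex.Split(input, pattern, RegexOptions.IgnoreCase)
                    .Select(piece => piece.Trim())         // drop the spaces left around each piece
                    .Where(piece => piece.Length > 0)      // drop the empty piece when input starts with a listed word
                    .ToArray();
    }
}
```

Called with "Sales OF pencils For fortune tellers IN India" and the words of, for, in, from, this should split before "OF", "For" and "IN", but not before "fortune" or "India", because the word boundary inside the lookahead requires the whole word to match. It does nothing about the place-name problem described above.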