I'm new to regex. I have this string:
new.TITLE['kinds.of'].food
or
new.TITLE['deep thought'].food
I want to retrieve these tokens:
new, TITLE, kinds.of, food.
or (for the second example):
new, TITLE, deep thought, food.
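I can't simply split on '.', because the quoted part may itself contain a dot.
I need a regular expression match to get the values.
How can I do this?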
Answer 0 (score: 1)
When working with tokens, a parser (in this case an FSM, a finite state machine) should do the job:
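// Requires: using System.Collections.Generic; (and System.Linq for ToArray() in the test)
private static IEnumerable<string> ParseIt(string value) {
    int lastIndex = 0;          // start of the token currently being scanned
    bool inApostroph = false;   // state: are we inside a '...' quoted section?

    for (int i = 0; i < value.Length; ++i) {
        char ch = value[i];

        // An apostrophe toggles the quoted state
        if (ch == '\'') {
            inApostroph = !inApostroph;
            continue;
        }

        // Inside quotes, '.', '[' and ']' are ordinary characters
        if (inApostroph)
            continue;

        if (ch == '.' || ch == ']' || ch == '[') {
            if (i - lastIndex > 0) {
                if (value[lastIndex] != '\'')
                    yield return value.Substring(lastIndex, i - lastIndex);
                else {
                    // Strip the outer quotes first, then unescape '' to ';
                    // doing it in the other order fails on an empty token ('')
                    yield return value.Substring(lastIndex + 1, i - lastIndex - 2).Replace("''", "'");
                }
            }
            lastIndex = i + 1;
        }
    }

    // Trailing token after the last delimiter
    if (lastIndex < value.Length)
        yield return value.Substring(lastIndex);
}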
Test:
string test1 = @"new.TITLE['kinds.of'].food";
string test2 = @"new.TITLE['deep thought'].food";
string[] result1 = ParseIt(test1).ToArray();
string[] result2 = ParseIt(test2).ToArray();
Console.WriteLine(string.Join(Environment.NewLine, result1));
Console.WriteLine(string.Join(Environment.NewLine, result2));
Result:
new
TITLE
kinds.of
food
new
TITLE
deep thought
food
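
Since the question asked for a regular expression specifically, here is a minimal sketch of a regex-based alternative. It assumes the same input shape as the two examples (dot-separated names, with optional bracketed sections quoted in apostrophes, and '' standing for a literal apostrophe inside quotes). The names TokenRegex, TokenPattern and Tokenize are made up for illustration; only Regex, Match and Groups come from System.Text.RegularExpressions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenRegex
{
    // Alternative 1: a quoted run '...' where '' is an escaped apostrophe
    //                (captured without the outer quotes in group 1).
    // Alternative 2: a bare run of characters other than . [ ] '
    //                (captured in group 2).
    // Delimiters match neither alternative, so Matches() simply skips them.
    private static readonly Regex TokenPattern =
        new Regex(@"'((?:[^']|'')*)'|([^.\[\]']+)");

    public static IEnumerable<string> Tokenize(string value)
    {
        foreach (Match m in TokenPattern.Matches(value))
        {
            if (m.Groups[1].Success)
                // Quoted token: unescape doubled apostrophes
                yield return m.Groups[1].Value.Replace("''", "'");
            else
                yield return m.Groups[2].Value;
        }
    }
}
```

Calling TokenRegex.Tokenize(test1) and TokenRegex.Tokenize(test2) should give the same tokens as ParseIt above. The regex is shorter, but it silently skips malformed input (for example an unclosed quote), so the hand-written parser may be the better choice if you need to report errors.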