When searching mail in Google, we use a syntax like
from:devcoder hasattachments:true mySearchString on:11-aug
or
mySearchString from:devcoder on:11-aug anotherSearchKeyword
After parsing, I should get key-value pairs such as (from, devcoder) and (on, 11-aug). What is the best way to implement this parsing in C#?
Answer 0 (score: 19)
To LINQ-ify Jason's answer:
string s = "from:devcoder hasattachments:true mySearchString on:11-aug";
var keyValuePairs = s.Split(' ')
    .Select(x => x.Split(':'))
    .Where(x => x.Length == 2)
    .ToDictionary(x => x.First(), x => x.Last());
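Two edge cases are worth keeping in mind with the one-liner above: ToDictionary throws an ArgumentException if a key appears twice (for example two from: terms), and a value that itself contains a colon (such as time:10:30) splits into three parts and is silently dropped by the Length == 2 filter. The following is only a sketch of one way around both, assuming you are happy to get back an ILookup instead of a dictionary; the names QueryLookup and ToKeyValueLookup are made up for illustration.

```csharp
using System;
using System.Linq;

public static class QueryLookup
{
    // Hypothetical helper (not part of the answer above).
    // Groups values by key, so repeated keys are kept instead of throwing.
    public static ILookup<string, string> ToKeyValueLookup(string query)
    {
        return query
            // Ignore runs of spaces between terms.
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            // Split on the first colon only, so "time:10:30" yields "time" and "10:30".
            .Select(term => term.Split(new[] { ':' }, 2))
            // Terms without a colon (plain search words) are skipped here.
            .Where(parts => parts.Length == 2)
            .ToLookup(parts => parts[0], parts => parts[1]);
    }
}
```

Indexing the result with a key that was never seen (lookup["nope"]) returns an empty sequence rather than throwing, which tends to be convenient for optional search operators.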
Answer 1 (score: 5)
Split on spaces, then split each resulting component on ':' and proceed from there. Roughly:
string s = "from:devcoder hasattachments:true mySearchString on:11-aug";
var components = s.Split(' ');
var blocks = components.Select(component => component.Split(':'));
foreach(var block in blocks) {
    if(block.Length == 1) {
        Console.WriteLine("Found {0}", block[0]);
    }
    else {
        Console.WriteLine(
            "Found key-value pair key = {0}, value = {1}",
            block[0],
            block[1]
        );
    }
}
Output:
Found key-value pair key = from, value = devcoder
Found key-value pair key = hasattachments, value = true
Found mySearchString
Found key-value pair key = on, value = 11-aug
Output for the second string:
Found mySearchString
Found key-value pair key = from, value = devcoder
Found key-value pair key = on, value = 11-aug
Found anotherSearchKeyword
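Answer 2 (score: 5)
Here is a regex-based approach I have used in the past; it supports prefixes and quoted strings.
A more correct, robust, and performant approach would involve writing a simple parser, but in most usage scenarios the time and effort needed to implement and test a parser would be far out of proportion to the gain.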
private static readonly Regex searchTermRegex = new Regex(
@"^(
\s*
(?<term>
((?<prefix>[a-zA-Z][a-zA-Z0-9-_]*):)?
(?<termString>
(?<quotedTerm>
(?<quote>['""])
((\\\k<quote>)|((?!\k<quote>).))*
\k<quote>?
)
|(?<simpleTerm>[^\s]+)
)
)
\s*
)*$",
RegexOptions.Compiled | RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture
);
private static void FindTerms(string s) {
    Console.WriteLine("[" + s + "]");
    Match match = searchTermRegex.Match(s);
    foreach(Capture term in match.Groups["term"].Captures) {
        Console.WriteLine("term: " + term.Value);
        // Find the prefix capture (if any) that lies inside this term.
        Capture prefix = null;
        foreach(Capture prefixMatch in match.Groups["prefix"].Captures)
            if(prefixMatch.Index >= term.Index && prefixMatch.Index <= term.Index + term.Length) {
                prefix = prefixMatch;
                break;
            }
        if(null != prefix)
            Console.WriteLine("prefix: " + prefix.Value);
        // Find the termString capture that lies inside this term.
        Capture termString = null;
        foreach(Capture termStringMatch in match.Groups["termString"].Captures)
            if(termStringMatch.Index >= term.Index && termStringMatch.Index <= term.Index + term.Length) {
                termString = termStringMatch;
                break;
            }
        Console.WriteLine("termString: " + termString.Value);
    }
    Console.WriteLine();
}
public static void Main (string[] args)
{
    FindTerms(@"two terms");
    FindTerms(@"prefix:value");
    FindTerms(@"some:""quoted term""");
    FindTerms(@"firstname:Jack ""the Ripper""");
    FindTerms(@"'quoted term\'s escaped quotes'");
    FindTerms(@"""unterminated quoted string");
}
Output:
[two terms]
term: two
termString: two
term: terms
termString: terms
[prefix:value]
term: prefix:value
prefix: prefix
termString: value
[some:"quoted term"]
term: some:"quoted term"
prefix: some
termString: "quoted term"
[firstname:Jack "the Ripper"]
term: firstname:Jack
prefix: firstname
termString: Jack
term: "the Ripper"
termString: "the Ripper"
['quoted term\'s escaped quotes']
term: 'quoted term\'s escaped quotes'
termString: 'quoted term\'s escaped quotes'
["unterminated quoted string]
term: "unterminated quoted string
termString: "unterminated quoted string
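For comparison, here is a rough sketch of what the "simple parser" alternative mentioned in this answer could look like as a hand-written scanner. It is not the answer author's code and does not reproduce the regex exactly: it returns quoted values with the quotes stripped and escapes resolved, it lets a backslash escape any character inside quotes, and it allows a prefix to start with a digit. The names SearchTermScanner and Scan are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SearchTermScanner
{
    // Hypothetical helper: returns one (prefix, value) pair per term.
    // The Key is null when the term has no "prefix:" part.
    public static List<KeyValuePair<string, string>> Scan(string input)
    {
        var terms = new List<KeyValuePair<string, string>>();
        int i = 0;
        while (i < input.Length)
        {
            if (char.IsWhiteSpace(input[i])) { i++; continue; }

            // Optional prefix: letters, digits, '-' or '_' followed by ':'.
            string prefix = null;
            int j = i;
            while (j < input.Length &&
                   (char.IsLetterOrDigit(input[j]) || input[j] == '-' || input[j] == '_'))
                j++;
            if (j > i && j < input.Length && input[j] == ':')
            {
                prefix = input.Substring(i, j - i);
                i = j + 1;
            }

            var value = new StringBuilder();
            if (i < input.Length && (input[i] == '"' || input[i] == '\''))
            {
                char quote = input[i++];
                while (i < input.Length && input[i] != quote)
                {
                    // A backslash makes the next character literal, e.g. \' inside '...'.
                    if (input[i] == '\\' && i + 1 < input.Length) i++;
                    value.Append(input[i++]);
                }
                if (i < input.Length) i++; // skip the closing quote if there is one
            }
            else
            {
                // Unquoted value: everything up to the next whitespace.
                while (i < input.Length && !char.IsWhiteSpace(input[i]))
                    value.Append(input[i++]);
            }
            terms.Add(new KeyValuePair<string, string>(prefix, value.ToString()));
        }
        return terms;
    }
}
```

Called on firstname:Jack "the Ripper", this yields two entries: one with Key "firstname" and Value "Jack", and one with a null Key and Value "the Ripper". An unterminated quote simply runs to the end of the input, mirroring the lenient behaviour of the regex.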
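Answer 3 (score: 1)
First Split() on spaces, which gives you an array containing all the search terms. Then loop through them to find the ones that Contains() a colon (:), and Split() those again on the colon.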
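The steps described in this answer might be sketched roughly as follows. This is just one possible reading of it: the SearchQueryParser class, the Parse method, and the freeText parameter are names chosen for illustration, and the choice to let a later duplicate key overwrite an earlier one is an assumption, not something the answer specifies.

```csharp
using System;
using System.Collections.Generic;

public static class SearchQueryParser
{
    // Hypothetical helper: returns the key/value pairs and appends every
    // term without a colon to the caller-supplied freeText list.
    public static Dictionary<string, string> Parse(string query, List<string> freeText)
    {
        var pairs = new Dictionary<string, string>();

        // Step 1: split on spaces to get the individual search terms.
        string[] terms = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string term in terms)
        {
            // Step 2: only terms that contain a colon are key/value pairs.
            if (term.Contains(":"))
            {
                // Step 3: split on the first colon only, so the value may contain colons.
                string[] parts = term.Split(new[] { ':' }, 2);
                pairs[parts[0]] = parts[1]; // assumption: the last duplicate key wins
            }
            else
            {
                freeText.Add(term);
            }
        }
        return pairs;
    }
}
```

For the second query in the question, Parse returns from and on as keys, while mySearchString and anotherSearchKeyword end up in the freeText list, in their original order.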