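I have some search queries like these:
    George AND NOT Washington OR Abraham
    Dog OR Cat AND NOT Wolf
For these searches, I want to get back results for George or Abraham but not Washington, and so on.
Basically, I want to take the string and submit a contextual search to my full-text catalog stored procedure.
I assume I should use regular expressions, but I am not very familiar with Regex in C#.
I found this article: http://support.microsoft.com/kb/246800, which I think is what I need to do, but I was hoping for some help with the implementation.
Suppose you take a string as a parameter and want to return a string: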
    string input = "George Washington AND NOT Martha OR Dog";

    private string interpretSearchQuery(string input)
    {
        // HALP!

        /* replace ' AND ' | ' AND NOT ' with
         * " AND "
         * " AND NOT "
         *
         * replace ' OR ' | ' OR NOT ' with
         * " OR "
         * " OR NOT "
         *
         * add " to beginning of string and " to end of string
         */

        // Desired result for the sample input above:
        return "\"George Washington\" AND NOT \"Martha\" OR \"Dog\"";
    }
Answer 0 (score: 4)
I would parse your string using postfix notation (also known as reverse Polish notation).
**Postfix algorithm**
The algorithm for evaluating any postfix expression is fairly straightforward:
    While there are input tokens left
        Read the next token from input.
        If the token is a value
            Push it onto the stack.
        Otherwise, the token is an operator (operator here includes both operators and functions).
            It is known a priori that the operator takes n arguments.
            If there are fewer than n values on the stack
                (Error) The user has not input sufficient values in the expression.
            Else, pop the top n values from the stack.
            Evaluate the operator, with the values as arguments.
            Push the returned results, if any, back onto the stack.
    If there is only one value in the stack
        That value is the result of the calculation.
    If there are more values in the stack
        (Error) The user input has too many values.
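The steps above can be sketched in C# roughly as follows. This is only a minimal illustration on plain booleans, not a search-query parser: the class name `PostfixDemo`, the method `Evaluate`, and the token set (`true`/`false`, `&`, `|`, unary `!`) are assumptions made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class PostfixDemo
{
    // Evaluates a postfix expression over booleans.
    // Values are "true"/"false"; "&" and "|" take two arguments, "!" takes one.
    public static bool Evaluate(IEnumerable<string> tokens)
    {
        var stack = new Stack<bool>();
        foreach (string token in tokens)
        {
            if (bool.TryParse(token, out bool value))
            {
                stack.Push(value);                 // the token is a value
                continue;
            }

            if (token != "!" && token != "&" && token != "|")
                throw new FormatException("Unknown token: " + token);

            int arity = token == "!" ? 1 : 2;      // known a priori per operator
            if (stack.Count < arity)
                throw new FormatException("Not enough values for operator " + token);

            if (arity == 1)
            {
                stack.Push(!stack.Pop());
            }
            else
            {
                bool right = stack.Pop();
                bool left = stack.Pop();
                stack.Push(token == "&" ? left & right : left | right);
            }
        }

        if (stack.Count != 1)
            throw new FormatException("Expression must leave exactly one value on the stack.");
        return stack.Pop();
    }
}
```

For example, `PostfixDemo.Evaluate(new[] { "true", "false", "&", "true", "|" })` computes `(true & false) | true`.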
So taking your input string:
    'George Washington AND NOT Martha OR Dog'
and simplifying it to:
A = George
B = Washington
C = Martha
D = Dog
& = AND
! = NOT
| = OR
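We would get the postfix notation
    AB&C!D|
which is then evaluated with the stack algorithm above. (For the stack to end with a single value, `!` has to be read as the binary AND NOT operator; with a strictly unary NOT the postfix form would be AB&C!&D|.)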
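As a rough sketch of how a postfix string such as AB&C!D| could be turned back into a search condition, the same stack idea works with query fragments instead of values. The names `PostfixQueryBuilder` and `Build` are hypothetical, and the sketch assumes `!` is the binary AND NOT, since that is the reading under which this particular string leaves one value on the stack.

```csharp
using System;
using System.Collections.Generic;

public static class PostfixQueryBuilder
{
    // Walks the postfix string one symbol at a time.
    // Letters found in `terms` are values; '&', '|' and '!' are binary operators
    // (assumption: '!' stands for AND NOT).
    public static string Build(string postfix, IDictionary<char, string> terms)
    {
        var stack = new Stack<string>();
        foreach (char symbol in postfix)
        {
            if (terms.TryGetValue(symbol, out string term))
            {
                stack.Push("\"" + term + "\"");    // quote each search term
                continue;
            }

            string keyword;
            switch (symbol)
            {
                case '&': keyword = "AND"; break;
                case '|': keyword = "OR"; break;
                case '!': keyword = "AND NOT"; break;
                default: throw new FormatException("Unknown symbol: " + symbol);
            }

            if (stack.Count < 2)
                throw new FormatException("Not enough values for " + keyword);

            string right = stack.Pop();
            string left = stack.Pop();
            stack.Push("(" + left + " " + keyword + " " + right + ")");
        }

        if (stack.Count != 1)
            throw new FormatException("Expression must leave exactly one value on the stack.");
        return stack.Pop();
    }
}
```

With `A = George`, `B = Washington`, `C = Martha`, `D = Dog`, calling `Build("AB&C!D|", terms)` yields `((("George" AND "Washington") AND NOT "Martha") OR "Dog")`. Note that this follows the answer's simplification and joins George and Washington with AND, rather than producing the phrase "George Washington" that the question asked for.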
Answer 1 (score: 3)
This might get you started... I would refactor the crap out of this to make it more robust.
    string input = "George Washington AND NOT Martha OR Dog";

    private string interpretSearchQuery(string input)
    {
        StringBuilder builder = new StringBuilder();
        var tokens = input.Split( ' ' );
        bool quoteOpen = false;

        foreach( string token in tokens )
        {
            if( !quoteOpen && !IsSpecial( token ) )
            {
                // First word of a term: open a quote.
                builder.AppendFormat( " \"{0}", token );
                quoteOpen = true;
            }
            else if( quoteOpen && IsSpecial( token ))
            {
                // Operator after a term: close the quote first.
                builder.AppendFormat( "\" {0}", token );
                quoteOpen = false;
            }
            else
            {
                // Either another word inside an open quote,
                // or an operator following an operator (e.g. AND NOT).
                builder.AppendFormat( " {0}", token );
            }
        }

        if( quoteOpen )
        {
            builder.Append( "\"" );
        }

        // Wrap the whole condition in single quotes.
        return "'" + builder.ToString().Trim() + "'";
    }

    public static bool IsSpecial( string token )
    {
        // Case-insensitive match against the operator keywords.
        return string.Compare( token, "AND", true ) == 0 ||
            string.Compare( token, "OR", true ) == 0 ||
            string.Compare( token, "NOT", true ) == 0;
    }
Answer 2 (score: 0)
Here is the solution I came up with. The only problem is that malformed search queries are not parsed properly and fail:
    private string interpretSearchTerm(string searchTerm)
    {
        string term = "";

        /* replace ' AND ' | ' AND NOT ' with
         * " AND "
         * " AND NOT "
         *
         * replace ' OR ' | ' OR NOT ' with
         * " OR "
         * " OR NOT "
         *
         * add " to beginning of string and " to end of string
         */
        if (searchTerm.IndexOf("AND") > -1
            || searchTerm.IndexOf("OR") > -1
            || searchTerm.IndexOf("AND NOT") > -1
            || searchTerm.IndexOf("OR NOT") > -1)
        {
            // The replacements carry no outer spaces, so " AND " cannot
            // re-match inside text already produced for " AND NOT ".
            term = searchTerm.Replace(" AND NOT ", "\"AND NOT\"")
                .Replace(" AND ", "\"AND\"")
                .Replace(" OR NOT ", "\"OR NOT\"")
                .Replace(" OR ", "\"OR\"");
            term = "\"" + term + "\"";
            return term;
        }
        else if (searchTerm.IndexOf("\"") > -1) return searchTerm;
        else return "\"" + searchTerm + "\"";
    }
I am now going to implement the postfix algorithm that GalacticJello suggested. I will post it when I get it working.
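Since the question asked about Regex, here is one possible single-pass alternative to the chained `Replace` calls. It is only a sketch: the names `SearchQueryFormatter` and `Format` are made up for illustration, it recognizes only upper-case operators surrounded by whitespace, and it does not validate malformed queries.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SearchQueryFormatter
{
    // Longer alternatives come first so "AND NOT" is not split as "AND" + "NOT".
    private static readonly Regex OperatorPattern =
        new Regex(@"\s+(AND NOT|OR NOT|AND|OR)\s+");

    public static string Format(string searchTerm)
    {
        // Regex.Split includes captured groups in its result, so the array
        // alternates: term, operator, term, operator, ...
        string[] parts = OperatorPattern.Split(searchTerm.Trim());

        var pieces = parts.Select((part, index) =>
            index % 2 == 1
                ? part                                  // operator: keep as-is
                : "\"" + part.Trim('"') + "\"");        // term: wrap in one pair of quotes

        return string.Join(" ", pieces);
    }
}
```

For the sample input, `SearchQueryFormatter.Format("George Washington AND NOT Martha OR Dog")` returns `"George Washington" AND NOT "Martha" OR "Dog"`. Splitting once avoids the ordering concerns that come with running several replacements over the same string.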