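I have been stuck here for a while, trying to solve what looks like incorrect NEsper behavior with regular expressions. I wrote a simple project that reproduces the problem; it is available on github.
In short, NEsper lets me filter messages (events) through a set of SQL-like rules. If an event matches a rule, NEsper fires an alert. In my application I need to use a regular expression in the rule, and that does not seem to work.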
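Problem
I tried both ways of creating statements, createPattern and createEPL, but neither fires a match event, even though the .NET Regex class matches the same regular expression against the same input. If I pass the matching value ("127.0.0.5") to the statement instead of the regular expression ("\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"), the event fires successfully.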
INPUT
127.0.0.5

==RULE FAIL==
every (Id123=TestDummy(Value regexp '\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b'))
// and I want this to pass

==RULE PASS==
every (Id123=TestDummy(Value regexp '127.0.0.5'))
Question
Can anyone help me with NEsper regular expression matching? Or perhaps point out a silly mistake in my code.
Code
This is my NEsper demo wrapper class:
public class NesperAdapter
{
    public MatchEventSubscrtiber Subscriber { get; set; }
    internal EPServiceProvider Engine { get; private set; }

    public NesperAdapter()
    {
        // This call internally depends on log4net and
        // will throw an error if log4net cannot be loaded
        EPServiceProviderManager.PurgeDefaultProvider();

        // config
        var configuration = new Configuration();
        configuration.AddEventType("TestDummy", typeof(TestDummy).FullName);
        configuration.EngineDefaults.Threading.IsInternalTimerEnabled = false;
        configuration.EngineDefaults.Logging.IsEnableExecutionDebug = false;
        configuration.EngineDefaults.Logging.IsEnableTimerDebug = false;

        // engine
        Engine = EPServiceProviderManager.GetDefaultProvider(configuration);
        Engine.EPRuntime.SendEvent(new TimerControlEvent(TimerControlEvent.ClockTypeEnum.CLOCK_EXTERNAL));
        Engine.Initialize();
        Engine.EPRuntime.UnmatchedEvent += OnUnmatchedEvent;
    }

    public void AddStatementFromRegExp(string regExp)
    {
        // Same pattern as the rules shown above: every (Id123=TestDummy(...))
        const string pattern = "every (Id123=TestDummy(Value regexp '{0}'))";
        string formattedPattern = String.Format(pattern, regExp);
        EPStatement statement = Engine.EPAdministrator.CreatePattern(formattedPattern);

        // this is the subscription
        Subscriber = new MatchEventSubscrtiber();
        statement.Subscriber = Subscriber;
    }

    internal void OnUnmatchedEvent(object sender, UnmatchedEventArgs e)
    {
        Console.WriteLine(@"Unmatched event");
        Console.WriteLine(e.Event);
    }

    public void SendEvent(object someEvent)
    {
        Engine.EPRuntime.SendEvent(someEvent);
    }
}
Then the subscriber and the dummy event type:
public class MatchEventSubscrtiber
{
    public bool HasEventFired { get; set; }

    public MatchEventSubscrtiber()
    {
        HasEventFired = false;
    }

    public void Update(IDictionary<string, object> rows)
    {
        Console.WriteLine("Match event fired");
        Console.WriteLine(rows);
        HasEventFired = true;
    }
}

public class TestDummy
{
    public string Value { get; set; }
}
And the NUnit test. If you comment out the nesper.AddStatementFromRegExp(regexp); line and uncomment the //nesper.AddStatementFromRegExp(input); line, the test passes. But I need a regular expression.
// Match any IP address
[TestFixture(@"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b", "127.0.0.5")]
public class WhenValidRegexpPassedAndRuleCreatedAndPropagated
{
    private NesperAdapter nesper;

    // Setup
    public WhenValidRegexpPassedAndRuleCreatedAndPropagated(string regexp, string input)
    {
        // check that it is a valid regexp in .NET
        var r = new Regex(regexp);
        var match = r.Match(input);
        Assert.IsTrue(match.Success, "Regexp validation failed in .NET");

        // create and start engine
        nesper = new NesperAdapter();

        // Add a rule; this fails with a correct regexp and a matching input
        // PROBLEM IS HERE
        nesper.AddStatementFromRegExp(regexp);
        // PROBLEM IS HERE

        // This works, but it is just the input matching itself
        //nesper.AddStatementFromRegExp(input);

        var oneEvent = new TestDummy
        {
            Value = input
        };
        nesper.SendEvent(oneEvent);
    }

    [Test]
    public void ThenNesperFiresMatchEvent()
    {
        // wait until NEsper has processed the event
        Thread.Sleep(100);

        // check whether the subscriber has received the event
        Assert.IsTrue(nesper.Subscriber.HasEventFired,
            "Event didn't fire");
    }
}
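Answer 0 (score: 1)
After debugging this issue for a while, I found that NEsper incorrectly handles the
WHERE regexp 'foobar'
clause.
So, if I have
SELECT * FROM MyType WHERE PropertyA regexp 'some valid regexp'
NEsper performs string formatting and validation on 'some valid regexp' and, in doing so, removes important (and valid) symbols from the regexp. Below is how I fixed it for myself. I am not sure whether this is the recommended approach.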
Reason: I think it is up to the user to decide how the regexp is built; that should not be part of the framework.
// Inside this method
public object Evaluate(EventBean[] eventsPerStream, bool isNewData, ExprEvaluatorContext exprEvaluatorContext){...}

// Find two occurrences of
_pattern = new Regex(String.Format("^{0}$", patternText));

// And change to
_pattern = new Regex(patternText);
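To make the effect of this first change concrete, here is a minimal sketch that uses only the .NET Regex class, not NEsper itself. The class and method names (AnchoringDemo, IsFullMatch, IsPartialMatch) and the "host=127.0.0.5" input are made up for illustration; the two Regex constructions mirror the "find" and "change to" lines above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchoringDemo
{
    // Mirrors the original line: the pattern is wrapped in ^...$,
    // so it has to cover the whole input.
    public static bool IsFullMatch(string patternText, string input)
    {
        return new Regex(String.Format("^{0}$", patternText)).IsMatch(input);
    }

    // Mirrors the changed line: the pattern may match anywhere in the input.
    public static bool IsPartialMatch(string patternText, string input)
    {
        return new Regex(patternText).IsMatch(input);
    }

    public static void Main()
    {
        const string ip = @"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b";

        Console.WriteLine(IsFullMatch(ip, "127.0.0.5"));         // True
        Console.WriteLine(IsPartialMatch(ip, "127.0.0.5"));      // True
        Console.WriteLine(IsFullMatch(ip, "host=127.0.0.5"));    // False
        Console.WriteLine(IsPartialMatch(ip, "host=127.0.0.5")); // True
    }
}
```

Note that for the exact input from the question both forms match, so this change on its own would not be expected to fix the failing test. It only matters when the property value contains more than the IP address.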
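Reason: requireUnescape stays enabled for all strings but is switched off for regexp, because unescaping breaks valid regular expressions by removing some valid symbols from them.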
// Inside this method
public static Object Parse(ITree node){...}

// Find one occurrence of
case EsperEPL2GrammarParser.STRING_TYPE:
{
    return StringValue.ParseString(node.Text);
}

// And change to
case EsperEPL2GrammarParser.STRING_TYPE:
{
    // Unescape by default; skip it when the parent node is a regexp
    bool requireUnescape = true;
    if (node.Parent != null)
    {
        if (!String.IsNullOrEmpty(node.Parent.Text))
        {
            if (node.Parent.Text == "regexp")
            {
                requireUnescape = false;
            }
        }
    }
    return StringValue.ParseString(node.Text, requireUnescape);
}
Reason: unescape all strings except regexp values.
// Inside this method
public static String ParseString(String value){...}

// Change from
public static String ParseString(String value)
{
    if ((value.StartsWith("\"")) & (value.EndsWith("\"")) || (value.StartsWith("'")) & (value.EndsWith("'")))
    {
        if (value.Length > 1)
        {
            if (value.IndexOf('\\') != -1)
            {
                return Unescape(value.Substring(1, value.Length - 2));
            }
            return value.Substring(1, value.Length - 2);
        }
    }
    throw new ArgumentException("String value of '" + value + "' cannot be parsed");
}

// Change to
public static String ParseString(String value, bool requireUnescape = true)
{
    if ((value.StartsWith("\"")) & (value.EndsWith("\"")) || (value.StartsWith("'")) & (value.EndsWith("'")))
    {
        if (value.Length > 1)
        {
            if (requireUnescape)
            {
                if (value.IndexOf('\\') != -1)
                {
                    return Unescape(value.Substring(1, value.Length - 2));
                }
            }
            return value.Substring(1, value.Length - 2);
        }
    }
    throw new ArgumentException("String value of '" + value + "' cannot be parsed");
}
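As a rough illustration of why skipping the unescape step matters for regexp values, the sketch below shows what happens to the IP pattern from the question if its backslashes are lost before the pattern reaches the Regex class. DropBackslashes is a hypothetical stand-in written for this example; I have not reproduced NEsper's actual Unescape, which may treat individual escape sequences differently.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnescapeDemo
{
    // Hypothetical stand-in for an unescape step: it simply drops every
    // backslash. This is an assumption, not NEsper's real Unescape.
    public static string DropBackslashes(string text)
    {
        return text.Replace("\\", "");
    }

    public static void Main()
    {
        const string ip = @"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b";
        string mangled = DropBackslashes(ip);

        Console.WriteLine(mangled);                             // bd{1,3}.d{1,3}.d{1,3}.d{1,3}b
        Console.WriteLine(Regex.IsMatch("127.0.0.5", ip));      // True
        Console.WriteLine(Regex.IsMatch("127.0.0.5", mangled)); // False
    }
}
```

Without the backslashes, \b and \d turn into the literal letters b and d, so the pattern can no longer match a dotted numeric address. That is consistent with the behavior described in the question, where the literal value '127.0.0.5' fires the event but the regular expression does not.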