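I am writing a syslog server, and my program receives messages in RFC 5424 format.
The program must parse each message and store its values.
I have a regular expression, but it fails to parse the message.
Something is wrong with the regex, and I am new to regular expressions.
Any help is appreciated.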
public static void Main()
{
string RFC5424Format = @"(\<(?<PRI>\d+)\>(?<VERSION>\d+)?)? \ * (?<TIMESTAMP> ( (?<YEAR>\d+) - (?<MONTH>\d+) - (?<DAY>\d+) ) T+ (?<HOUR>\d+): (?<MINUTE>\d+): (?<SECOND>\d+) (\.(?<MILLISECONDS>\d+))? (?<OFFSET>Z|(\+|\-)\d+:\d+)? ) \ (?<HOSTNAME>[\w!-~]+) \ (?<APPNAME>[\w!-~]+) \ (?<PROCID>[\w!-~]+) \ (?<MSGID>[\w!-~]+) \ (?<SD>-|(\[.*\])) \ ?(?<MESSAGE>.*)?";
Regex rfc5424 = new Regex("^" + RFC5424Format + "$", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.IgnorePatternWhitespace);
string input = "< 38 > 1 2018 - 03 - 01T16: 05:51.799465 + 05:30 AAEINBLR07229L Source_UDP - -\n ??? MessageContent_Via_UDP - 5424";
Match m = rfc5424.Match(input);
if (m.Success)
{
Console.WriteLine("Regex is fine");
}
else
{
Console.WriteLine("Problem in Regex");
}
}
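Answer 0 (score: 1)
I ran into this problem only recently. According to RFC 5424, a syslog message should have the following format: HEADER SP STRUCTURED-DATA [SP MSG], where SP is a space character and the square brackets mean the data is optional. That said, I found it easier to break the message into three separate regex patterns and then combine them when instantiating the Regex object that RFC 5424 messages are matched against.
Here is my sample class. I hope it helps.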
using System;
using System.Text;
using System.Text.RegularExpressions;
public class SyslogMessage
{
// VERSION is a non-zero digit followed by up to two digits; \S already covers both "-" and word characters.
private static readonly string _SyslogMsgHeaderPattern = @"\<(?<PRIVAL>\d{1,3})\>(?<VERSION>[1-9]\d{0,2}) (?<TIMESTAMP>\S+) (?<HOSTNAME>\S{1,255}) (?<APPNAME>\S{1,48}) (?<PROCID>\S{1,128}) (?<MSGID>\S{1,32})";
private static readonly string _SyslogMsgStructuredDataPattern = @"(?<STRUCTUREDDATA>-|\[[^\[\=\x22\]\x20]{1,32}( ([^\[\=\x22\]\x20]{1,32}=\x22.+\x22))?\])";
private static readonly string _SyslogMsgMessagePattern = @"( (?<MESSAGE>.+))?";
private static Regex _Expression = new Regex($@"^{_SyslogMsgHeaderPattern} {_SyslogMsgStructuredDataPattern}{_SyslogMsgMessagePattern}$", RegexOptions.None, new TimeSpan(0, 0, 5));
public int Prival { get; private set; }
public int Version { get; private set; }
public DateTime TimeStamp { get; private set; }
public string HostName { get; private set; }
public string AppName { get; private set; }
public string ProcId { get; private set; }
public string MessageId { get; private set; }
public string StructuredData { get; private set; }
public string Message { get; private set; }
public string RawMessage { get; private set; }
/// <summary>
/// Parses a Syslog message in RFC 5424 format.
/// </summary>
/// <exception cref="FormatException"></exception>
/// <exception cref="OverflowException"></exception>
/// <exception cref="ArgumentNullException"></exception>
/// <exception cref="InvalidOperationException"></exception>
public static SyslogMessage Parse(string rawMessage)
{
if (string.IsNullOrWhiteSpace(rawMessage)) { throw new ArgumentNullException(nameof(rawMessage)); }
var match = _Expression.Match(rawMessage);
if (match.Success)
{
return new SyslogMessage
{
Prival = Convert.ToInt32(match.Groups["PRIVAL"].Value),
Version = Convert.ToInt32(match.Groups["VERSION"].Value),
TimeStamp = Convert.ToDateTime(match.Groups["TIMESTAMP"].Value),
HostName = match.Groups["HOSTNAME"].Value,
AppName = match.Groups["APPNAME"].Value,
ProcId = match.Groups["PROCID"].Value,
MessageId = match.Groups["MSGID"].Value,
StructuredData = match.Groups["STRUCTUREDDATA"].Value,
Message = match.Groups["MESSAGE"].Value,
RawMessage = rawMessage
};
}
else { throw new InvalidOperationException("Invalid message."); }
}
public override string ToString()
{
// Plain {Prival} and {Version}: the "###" custom format renders 0 as an empty string.
var message = new StringBuilder($@"<{Prival}>{Version} {TimeStamp.ToString("yyyy-MM-ddTHH:mm:ss.fffK")} {HostName} {AppName} {ProcId} {MessageId} {StructuredData}");
if (!string.IsNullOrWhiteSpace(Message))
{
message.Append($" {Message}");
}
return message.ToString();
}
}
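To make the "three patterns combined into one" idea easier to try in isolation, here is a minimal standalone sketch. It is not the class above: the class name `Rfc5424Sketch`, the method `MatchMessage`, the simplified sub-patterns, and the sample message are all assumptions made for illustration. In particular, the structured-data pattern is deliberately loose and does not handle an escaped `\]` inside a parameter value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Rfc5424Sketch
{
    // Simplified stand-ins for the three parts: HEADER SP STRUCTURED-DATA [SP MSG].
    private const string Header =
        @"<(?<PRIVAL>\d{1,3})>(?<VERSION>[1-9]\d{0,2}) (?<TIMESTAMP>\S+) " +
        @"(?<HOSTNAME>\S{1,255}) (?<APPNAME>\S{1,48}) (?<PROCID>\S{1,128}) (?<MSGID>\S{1,32})";

    // Either the nil value "-" or one or more [ ... ] elements (loose approximation).
    private const string StructuredData = @"(?<STRUCTUREDDATA>-|(\[[^\]]+\])+)";

    // The message part is optional and, if present, preceded by one space.
    private const string Message = @"( (?<MESSAGE>.+))?";

    // Singleline lets "." match newlines so a multi-line MSG is captured whole.
    private static readonly Regex Expression = new Regex(
        $"^{Header} {StructuredData}{Message}$",
        RegexOptions.Singleline | RegexOptions.CultureInvariant,
        TimeSpan.FromSeconds(5));

    public static Match MatchMessage(string raw) => Expression.Match(raw);

    public static void Main()
    {
        // Hypothetical sample message in RFC 5424 layout.
        string raw = "<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - 'su root' failed for lonvick on /dev/pts/8";
        Match m = MatchMessage(raw);

        Console.WriteLine(m.Success ? "Matched" : "No match");
        foreach (string name in new[] { "PRIVAL", "VERSION", "TIMESTAMP", "HOSTNAME", "APPNAME", "PROCID", "MSGID", "STRUCTUREDDATA", "MESSAGE" })
        {
            Console.WriteLine($"{name} = {m.Groups[name].Value}");
        }
    }
}
```

Keeping the three parts as separate constants means each one can be tested on its own before they are joined, which is usually where a long pattern goes wrong.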
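Answer 1 (score: 0)
Regular expressions always take a lot of thought and testing to get right. I don't have time to solve this completely, but I can tell you that you are nearly on the right track. I rewrote and tested part of the regex (without anchors) and include it here for reference: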
\< *(?<PRI>\d+) *\> *(?<VERSION>\d+)? *(?<YEAR>\d+) - (?<MONTH>\d+) - (?<DAY>\d+)T(?<HOUR>\d+): *(?<MINUTE>\d+):(?<SECOND>\d+)\.(?<MILLISECONDS>\d+) *\+ *(?<OFFSET>\d+:\d+) *(?<HOSTNAME>\b\w+\b) *(?<SOURCE>\b\w+\b)
I always find https://www.regexpal.com/ helpful for debugging regex problems. Take it slowly, one step at a time. Let me know how it works out!
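If you would rather check the partial pattern from C# than from a web tool, something like the following sketch may help. The class name `PartialRegexCheck` and the method `Run` are made up for illustration; the pattern and the input string are copied from this thread. Note that the pattern relies on literal spaces, so it is assumed here that `RegexOptions.IgnorePatternWhitespace` is not set.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PartialRegexCheck
{
    // Partial, unanchored pattern quoted from this answer; spaces in it are literal.
    public const string Pattern =
        @"\< *(?<PRI>\d+) *\> *(?<VERSION>\d+)? *(?<YEAR>\d+) - (?<MONTH>\d+) - (?<DAY>\d+)T(?<HOUR>\d+): *(?<MINUTE>\d+):(?<SECOND>\d+)\.(?<MILLISECONDS>\d+) *\+ *(?<OFFSET>\d+:\d+) *(?<HOSTNAME>\b\w+\b) *(?<SOURCE>\b\w+\b)";

    // Input string exactly as posted in the question.
    public const string Input =
        "< 38 > 1 2018 - 03 - 01T16: 05:51.799465 + 05:30 AAEINBLR07229L Source_UDP - -\n ??? MessageContent_Via_UDP - 5424";

    // A match timeout guards against runaway backtracking while experimenting.
    public static Match Run() =>
        Regex.Match(Input, Pattern, RegexOptions.None, TimeSpan.FromSeconds(5));

    public static void Main()
    {
        Match m = Run();
        Console.WriteLine(m.Success ? "Matched" : "No match");

        // Print each named group so a wrong capture is easy to spot.
        foreach (string name in new[] { "PRI", "VERSION", "YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND", "MILLISECONDS", "OFFSET", "HOSTNAME", "SOURCE" })
        {
            Console.WriteLine($"{name} = {m.Groups[name].Value}");
        }
    }
}
```

From there, one way to proceed is to append one field at a time to the pattern and rerun, so the first failing addition points at the part that needs work.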