在.NET中以编程方式解析日志文件

时间:2010-07-14 21:21:02

标签: .net parsing logging log4net text-processing

我们有大量(读取:50,000)相对较小(读取低于500K,通常低于50K)的日志文件,这些日志文件是使用我们客户端应用程序中的log4net创建的。典型的日志如下:

Start Painless log
Framework:8.1.7.0
Application:8.1.7.0
2010-05-05 19:26:07,678 [Login ] INFO  Application.App.OnShowLoginMessage(194) - Validating Credentials...
2010-05-05 19:26:08,686 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Checking for Application Updates...
2010-05-05 19:26:08,830 [1     ] INFO  Framework.Globals.InstanceStartup(132) - Application Startup
2010-05-05 19:26:09,293 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Purchase History Data>:True
2010-05-05 19:26:09,293 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Shopping Assistant>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Shopping List>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Teeth>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Scanner>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Value Comparison>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Lotus Notes CRM>:True
2010-05-05 19:26:09,295 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Salesforce.com>:False
2010-05-05 19:26:09,295 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Lotus Notes Mail>:True
2010-05-05 19:26:09,295 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Sales Leads>:True
2010-05-05 19:26:09,295 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Configurator>:True
2010-05-05 19:26:09,297 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Validating Database...
2010-05-05 19:26:10,342 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Validating Database...
2010-05-05 19:26:10,489 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Global Handlers...
2010-05-05 19:26:10,495 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Starting Main Window...
2010-05-05 19:26:10,496 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Initializing Components...
2010-05-05 19:26:11,145 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Restoring Location...
2010-05-05 19:26:11,164 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Plug Ins...
2010-05-05 19:26:11,169 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Panels...Order Manager
2010-05-05 19:26:11,181 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Orders...
2010-05-05 19:26:11,274 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Done Loading 1 Order
2010-05-05 19:26:11,314 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Panels...All Products
2010-05-05 19:26:11,471 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Purchase History Data
2010-05-05 19:26:11,545 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Shopping List
2010-05-05 19:26:11,597 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Teeth
2010-05-05 19:26:11,768 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Scanner
2010-05-05 19:26:11,810 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Value Comparison
2010-05-05 19:26:11,840 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Sales Leads
2010-05-05 19:26:11,922 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...(Done!)
2010-05-05 19:26:11,923 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Adding Handlers...
2010-05-05 19:26:11,925 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Main Window Handlers...
2010-05-05 19:26:11,932 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Main Menu...
2010-05-05 19:26:11,949 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Initialization Complete.
2010-05-05 19:26:13,662 [1     ] INFO  Framework.ProductSearch.Search(342) - User entered term: <>

我希望能够解析这些日志服务器端(无论是上传还是每晚),以提取异常(始终记录为ERROR或FATAL)或其他特定日志消息,如:

2010-05-05 20:05:24,951 [1     ] INFO  Framework.ProductSearch.Search(342) - User entered term: <kavo>

获取“kavo”字词,以便我们了解人们真正搜索的内容。

我尝试过手工解析文本(使用String.Split()和类似的方法)但是我觉得我正在重新发明轮子。

是否有一个很好的库可以进行这种日志提取?

4 个答案:

答案 0 :(得分:4)

您可以使用Microsoft的Log ParserDownload here)。该工具还公开了一个COM接口,允许您以编程方式访问结果。 This blog post有一个小的(部分)指令,用于2.1版本。

所以现在不要觉得它正在重新发明轮子,你可能觉得它过度设计了。 ;)

答案 1 :(得分:1)

我们最终只是输出XML作为我们的日志格式,解析起来变得微不足道。

答案 2 :(得分:0)

这看起来像是.NET regular expressions

的典型案例

这样您就可以搜索“用户输入的术语:”等模式。您可以在之后提取组匹配

答案 3 :(得分:0)

log4net可以不读取它自己的格式模式吗?一旦它重新读取格式,你就不能解析%message%section吗?

一个更好的问题也许是为什么你不只是在进行日志调用的同时将该搜索词写入数据库,或者通过log4net创建另一个日志文件,只记录%message%和你写入该日志的所有内容是搜索词吗?如果你能避免它,我真的不认为你想要解析日志......至少可以继续前进。