Trouble parsing a log when a particular field's value may be preceded by a space

Time: 2013-03-04 07:42:57

Tags: c#

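I want to compute the total "Elapsed Time" for the "GSA Search" entries, but the format is inconsistent. In some lines it is "Elapsed Time:97ms" and in others it is "Elapsed Time: 97ms", with a space after the colon. How can I handle both cases?

Here is my log file format: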

WX Search = Server:nomos-scanner.corp.com User:vibsharm appGUID: wx Elapsed Time: 975ms SaveSearchID:361
WX Search = Server:nomos-scanner.corp.com User:vibsharm appGUID: wx Elapsed Time: 875ms SaveSearchID:361
GSA Search = Server:nomos-scanner.corp.com User:gulanand appGUID: wx Elapsed Time:890ms SaveSearchID:361
GSA Search = Server:nomos-scanner.corp.com User:vibsharm appGUID: wx Elapsed Time:887ms SaveSearchID:361
GSA Search = Server:nomos-scanner.corp.com User: gulanand appGUID: wx Elapsed Time: 875.5ms SaveSearchID:361
GSA Search = Server:nomos-scanner.corp.com User:vibsharm appGUID: wx Elapsed Time:877.6ms SaveSearchID:361

My code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.IO;
using System.Linq.Expressions;

namespace ConsoleApplication5
{
    class Program
    {
        public static void Main(string[] args)
        {
            string searchKeyword = "WX GSA Search";
            string fileName = @"C:\Users\karan\Desktop\Sample log file.txt";
            string[] textLines = File.ReadAllLines(fileName);

            List<string> results = new List<string>();

            foreach (string line in textLines)
            {
                if (line.Contains(searchKeyword))
                {
                    results.Add(line);
                }
            }
            var elapsedTime = results.SelectMany(line => line.ToLower().Split(' '))
            .Where(line => line.StartsWith("time"))
            .Select(timeLine => decimal.Parse(timeLine.Split(':')[1].Replace("ms", String.Empty)))
            .Average(time => time);
            Console.WriteLine(elapsedTime);
            // keep screen from going away
            // when run from VS.NET
            Console.ReadLine();
        }
    }
}

2 answers:

Answer 0: (score: 0)

        string x = @"Elapsed Time: 97ms";
        string label = "Elapsed Time:";
        // Start just past the label so only the number (and an optional space) remains
        int startIndex = x.LastIndexOf(label) + label.Length;
        int endIndex = x.LastIndexOf("ms");
        string valueSubString = x.Substring(startIndex, endIndex - startIndex);
        // Trim removes the optional space after the colon
        decimal value = decimal.Parse(valueSubString.Trim());
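
Another way to accept both spellings, sketched here as an assumption rather than as part of the answer above, is a regular expression in which the whitespace after the colon is optional (`\s*`). It also avoids the problem in the question's code, where splitting on spaces separates "Time:" from "975ms". The names `ElapsedTimeRegex` and `AverageElapsedMs` are hypothetical; the sample lines come from the log excerpt in the question.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class ElapsedTimeRegex
{
    // \s* matches zero or more whitespace characters, so the space after the
    // colon is optional; the capture group holds an integer or decimal value.
    private static readonly Regex Pattern =
        new Regex(@"Elapsed Time:\s*(\d+(?:\.\d+)?)ms", RegexOptions.IgnoreCase);

    // Hypothetical helper: averages the elapsed time over lines that start
    // with the given prefix; returns 0 when no line matches.
    public static decimal AverageElapsedMs(string[] lines, string prefix)
    {
        return lines
            .Where(l => l.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .Select(l => Pattern.Match(l))
            .Where(m => m.Success)
            .Select(m => decimal.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture))
            .DefaultIfEmpty(0m)
            .Average();
    }

    public static void Main()
    {
        string[] lines =
        {
            "WX Search = Server:nomos-scanner.corp.com User:vibsharm appGUID: wx Elapsed Time: 975ms SaveSearchID:361",
            "GSA Search = Server:nomos-scanner.corp.com User:gulanand appGUID: wx Elapsed Time:890ms SaveSearchID:361",
            "GSA Search = Server:nomos-scanner.corp.com User: gulanand appGUID: wx Elapsed Time: 875.5ms SaveSearchID:361"
        };

        // Only the two "GSA Search" lines contribute to the average.
        Console.WriteLine(AverageElapsedMs(lines, "GSA Search"));
    }
}
```

Parsing with `CultureInfo.InvariantCulture` is a deliberate choice in this sketch: the log uses "." as the decimal separator, and the default `decimal.Parse` follows the current culture, which may expect ",".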

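Answer 1: (score: 0)

You can speed this up by reducing string allocations, like this:

(Ideally I would use a full finite-state-machine parser, but reading line by line also works):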

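Decimal totalMs = 0;
Int32 totalRecords = 0;
const String label = "Elapsed Time:";

// StreamReader (not StringReader) reads the file one line at a time
using(StreamReader rdr = new StreamReader(fileName, Encoding.UTF8)) {

    String line;
    while( (line = rdr.ReadLine()) != null ) {

        // Only lines that begin with "GSA Search" are counted
        if( !line.StartsWith("GSA Search", StringComparison.OrdinalIgnoreCase ) ) continue;

        Int32 idxLabel = line.IndexOf(label, StringComparison.OrdinalIgnoreCase );
        if( idxLabel < 0 ) continue; // no elapsed-time field on this line

        Int32 idxStart = idxLabel + label.Length;
        Int32 idxEnd   = line.IndexOf("ms", idxStart, StringComparison.OrdinalIgnoreCase );
        if( idxEnd < 0 ) continue; // no "ms" suffix after the label

        Decimal lineMs = Decimal.Parse( line.Substring( idxStart, idxEnd - idxStart ) );
        totalMs += lineMs;
        totalRecords++;
    }
}
// Avoid dividing by zero when no line matched
Decimal average = totalRecords > 0 ? totalMs / totalRecords : 0;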

Calling .Trim is unnecessary because Decimal.Parse allows leading and trailing whitespace, but if you want to be more robust, consider using Decimal.TryParse.
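
A minimal sketch of the Decimal.TryParse variant follows, under a few assumptions: the helper name `TryGetElapsedMs` and the class `ElapsedTimeParser` are hypothetical, malformed lines are simply skipped, and the invariant culture is used so that "." is always read as the decimal separator.

```csharp
using System;
using System.Globalization;

public static class ElapsedTimeParser
{
    // Hypothetical helper: extracts the number between "Elapsed Time:" and
    // "ms"; returns false instead of throwing when the line is malformed.
    public static bool TryGetElapsedMs(string line, out decimal ms)
    {
        ms = 0m;
        const string label = "Elapsed Time:";

        int labelIdx = line.IndexOf(label, StringComparison.OrdinalIgnoreCase);
        if (labelIdx < 0) return false;

        int start = labelIdx + label.Length;
        int end = line.IndexOf("ms", start, StringComparison.OrdinalIgnoreCase);
        if (end < 0) return false;

        // NumberStyles.Number permits leading and trailing whitespace, so both
        // "Elapsed Time:890ms" and "Elapsed Time: 875.5ms" are accepted.
        return decimal.TryParse(line.Substring(start, end - start),
                                NumberStyles.Number,
                                CultureInfo.InvariantCulture,
                                out ms);
    }

    public static void Main()
    {
        string[] lines =
        {
            "GSA Search = Server:nomos-scanner.corp.com User:gulanand appGUID: wx Elapsed Time:890ms SaveSearchID:361",
            "GSA Search = Server:nomos-scanner.corp.com User: gulanand appGUID: wx Elapsed Time: 875.5ms SaveSearchID:361",
            "GSA Search = a line with no timing field"
        };

        decimal total = 0m;
        int count = 0;
        foreach (string line in lines)
        {
            decimal ms;
            if (TryGetElapsedMs(line, out ms))
            {
                total += ms;
                count++;
            }
        }

        Console.WriteLine(count > 0
            ? (total / count).ToString(CultureInfo.InvariantCulture)
            : "no records");
    }
}
```

Skipping bad lines rather than throwing is a judgment call; if a malformed entry should stop the run, the `false` branch is the place to log or raise an error.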