Question

While trying to solve another SO question, I came up with the following algorithm, which I thought was quite optimized. However, when I ran DotNetBenchmark on all the solutions, I was very surprised that my code averaged a whopping ~387 ms, compared with the ~20-30 ms that some of the other answers achieved.

[MethodImpl(MethodImplOptions.AggressiveInlining)]
int CalcMe(string input)
{
    // I used Marc Gravell's input generation method
    var operands = input.Split(' ');
    var j = 1; // operators index
    var result = int.Parse(operands[0]); // output

    // i = numbers index
    for (int i = 2; i < operands.Length; i += 2)
    {
        switch (operands[j])
        {
            case "+":
                result += int.Parse(operands[i]);
                break;
            case "-":
                result -= int.Parse(operands[i]);
                break;
            case "*":
                result *= int.Parse(operands[i]);
                break;
            case "/":
                try
                {
                    result /= int.Parse(operands[i]);
                    break;
                }
                catch
                {
                    break; // division by 0.
                }
            default:
                throw new Exception("Unknown Operator");
        }
        j += 2; // next operator
    }
    return result;
}

Just by extracting String.Split() to the caller, the Main() method, I lowered the execution time to 110 ms, but that still does not solve the mystery, since all the other answers handle the input directly.

I am just trying to understand this, perhaps to change the way I think about optimization. I could not see any keyword that only I use: switch, for and int.Parse() are in pretty much every other solution.

EDIT 1: Test input generation

The input generation is copied from Marc's answer to the original question, as below:

static string GenerateInput()
{
    Random rand = new Random(12345);
    StringBuilder input = new StringBuilder();
    string operators = "+-*/";
    var lastOperator = '+';
    for (int i = 0; i < 1000000; i++)
    {
        var @operator = operators[rand.Next(0, 4)];
        input.Append(rand.Next(lastOperator == '/' ? 1 : 0, 100) + " " + @operator + " ");
        lastOperator = @operator;
    }
    input.Append(rand.Next(0, 100));
    return input.ToString();
}

Answer 1

[MethodImpl(MethodImplOptions.AggressiveInlining)]

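This accomplishes next to nothing here. Inlining is for telling the compiler to copy and paste the method body into its call sites to avoid the overhead of method calls. In most cases the JIT is smart enough to know when to do that on its own.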

var operands = input.Split(' ');

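This makes the runtime walk the entire string, search for the separators, split it into substrings and fill an array, which can take a long time.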
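One way to avoid that cost is to walk the token boundaries by index instead of materializing an array of substrings. The sketch below is illustrative only: `TokenWalk` and `ForEachToken` are hypothetical names (not from the question or the framework), and it assumes tokens are separated by single spaces.

```csharp
using System;

static class TokenWalk
{
    // Calls visit(start, length) for every space-separated token in input
    // and returns the number of tokens. No substrings are allocated.
    public static int ForEachToken(string input, Action<int, int> visit)
    {
        int count = 0;
        int start = 0;
        while (start <= input.Length)
        {
            int end = input.IndexOf(' ', start);
            if (end < 0) end = input.Length;   // last token runs to the end
            if (end > start)                   // skip empty tokens
            {
                visit(start, end - start);
                count++;
            }
            start = end + 1;
        }
        return count;
    }
}
```

For the input "12 + 3" the callback receives the ranges (0, 2), (3, 1) and (5, 1), so the caller can read the characters in place rather than from freshly allocated strings.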

switch (operands[j])

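Switching on a string can also have an impact, since it has to call Equals on the cases. If you are looking at performance, you want to switch on a simple type (char, for example).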

int.Parse

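This actually does a lot of work: it sets up a stack-allocated number buffer, handles number styles and culture info, and even runs unsafe pointer code. You can see the parsing code here: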

https://referencesource.microsoft.com/#mscorlib/system/number.cs,698

Or, in case the link goes down:

[System.Security.SecuritySafeCritical]  // auto-generated
internal unsafe static Int32 ParseInt32(String s, NumberStyles style, NumberFormatInfo info) {

    Byte * numberBufferBytes = stackalloc Byte[NumberBuffer.NumberBufferBytes];
    NumberBuffer number = new NumberBuffer(numberBufferBytes);
    Int32 i = 0;

    StringToNumber(s, style, ref number, info, false);

    if ((style & NumberStyles.AllowHexSpecifier) != 0) {
        if (!HexNumberToInt32(ref number, ref i)) { 
            throw new OverflowException(Environment.GetResourceString("Overflow_Int32"));
        }
    }
    else {
        if (!NumberToInt32(ref number, ref i)) {
            throw new OverflowException(Environment.GetResourceString("Overflow_Int32"));
        }
    }
    return i;           
}

[System.Security.SecuritySafeCritical]  // auto-generated
private unsafe static void StringToNumber(String str, NumberStyles options, ref NumberBuffer number, NumberFormatInfo info, Boolean parseDecimal) {

    if (str == null) {
        throw new ArgumentNullException("String");
    }
    Contract.EndContractBlock();
    Contract.Assert(info != null, "");
    fixed (char* stringPointer = str) {
        char * p = stringPointer;
        if (!ParseNumber(ref p, options, ref number, null, info , parseDecimal) 
                || (p - stringPointer < str.Length && !TrailingZeros(str, (int)(p - stringPointer)))) {
            throw new FormatException(Environment.GetResourceString("Format_InvalidString"));
        }
    }
}
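For input that is known to contain only non-negative decimal numbers, most of that generality can be skipped. A minimal sketch, assuming ASCII digits only and no sign, whitespace or culture handling; `FastParse.ParseNonNegative` is a hypothetical helper, not a framework API.

```csharp
using System;

static class FastParse
{
    // Parses the digits in s[start .. start + length) as a non-negative int.
    // Assumption: the range holds only '0'..'9'; anything else is rejected.
    public static int ParseNonNegative(string s, int start, int length)
    {
        int value = 0;
        for (int i = start; i < start + length; i++)
        {
            char c = s[i];
            if (c < '0' || c > '9')
                throw new FormatException("Unexpected character at index " + i);
            // checked: overflow throws instead of silently wrapping around
            value = checked(value * 10 + (c - '0'));
        }
        return value;
    }
}
```

Because it works on a range of the original string, it can be combined with index-based scanning so that no intermediate string is created per number.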

Answer 2

I think comparing strings is much more complicated than comparing chars.

The key difference is below:

switch (operands[j])
{
    case "+":
        ...

switch (cOperator)
{
    case '+':
       ...

Answer 3

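Interesting problem! I was interested in implementing this myself, to check what I could come up with and how it compares to the other implementations. I did it in F#, but since both F# and C# are strongly typed CLR languages, and the insights gained below are (arguably) independent of C#, I hope you will agree that the following is not quite off-topic.

First I needed a few functions for creating a suitable expression string (adapted from your posting), measuring time, and running a bunch of functions with the generated string: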

module Testbed =
    let private mkTestCase (n : int) =
        let next (r : System.Random) i = r.Next (0, i)
        let r = System.Random ()
        let s = System.Text.StringBuilder n
        let ops = "+-*/"
        (s.Append (next r 100), {1 .. n})
        ||> Seq.fold (fun s _ ->
            let nx = next r 100
            let op = ops.[next r (if nx = 0 then 3 else 4)]
            s.Append (" " + string op + " " + string nx))
        |> string

    let private stopwatch n f =
        let mutable r = Unchecked.defaultof<_>
        let sw = System.Diagnostics.Stopwatch ()
        sw.Start ()
        for i = 1 to n do r <- f ()
        sw.Stop ()
        (r, sw.ElapsedMilliseconds / int64 n)

    let runtests tests =
        let s, t = stopwatch 100 (fun () -> mkTestCase 1000000)
        stdout.Write ("MKTESTCASE\nTime: {0}ms\n", t)
        tests |> List.iter (fun (name : string, f) ->
            let r, t = stopwatch 100 (fun () -> f s)
            let w = "{0} ({1} chars)\nResult: {2}\nTime: {3}ms\n"
            stdout.Write (w, name, s.Length, r, t))

For a string of 1 million operations (around 4.9 million chars), the mkTestCase function ran in 317ms on my laptop.
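For readers following along in C#, the stopwatch helper above could be approximated as shown below. This is only a rough counterpart: `Timing.Average` is a hypothetical name, it assumes C# 7 or later for the tuple return type, and like the F# version it does no warm-up run, so JIT compilation time is included in the average.

```csharp
using System;
using System.Diagnostics;

static class Timing
{
    // Runs f the given number of times and returns the last result
    // together with the average elapsed time in whole milliseconds.
    public static (T Result, long AverageMs) Average<T>(int runs, Func<T> f)
    {
        T result = default(T);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
            result = f();
        sw.Stop();
        return (result, sw.ElapsedMilliseconds / runs);
    }
}
```

A call might look like `var (r, ms) = Timing.Average(100, () => CalcMe(input));`, where `CalcMe` and `input` come from the question.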

Next, I translated your function to F#:

module MethodsToTest =
    let calc_MBD1 (s : string) =
        let inline runop f a b =
            match f with
            | "+" -> a + b
            | "-" -> a - b
            | "*" -> a * b
            | "/" -> a / b
            | _ -> failwith "illegal op"
        let rec loop (ops : string []) r i j =
            if i >= ops.Length then r else
                let n = int ops.[i]
                loop ops (runop ops.[j] r n) (i + 2) (j + 2)
        let ops = s.Split ' '
        loop ops (int ops.[0]) 2 1

This ran in 488ms on my laptop.

Next I wanted to check whether string matching is really that much slower than character matching:

    let calc_MBD2 (s : string) =
        let inline runop f a b =
            match f with
            | '+' -> a + b
            | '-' -> a - b
            | '*' -> a * b
            | '/' -> a / b
            | _ -> failwith "illegal op"
        let rec loop (ops : string []) r i j =
            if i >= ops.Length then r else
                let n = int ops.[i]
                loop ops (runop ops.[j].[0] r n) (i + 2) (j + 2)
        let ops = s.Split ' '
        loop ops (int ops.[0]) 2 1

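Common wisdom would be that character matching should be significantly faster, given that it involves only a primitive comparison instead of calculating a hash, but the above ran in 482ms on my laptop, so the difference between a primitive character comparison and comparing hashes of strings of length 1 is almost negligible.

Finally, I checked whether hand-rolling the number parsing would yield a significant saving: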

    let calc_MBD3 (s : string) =
        let inline getnum (c : char) = int c - 48
        let parse (s : string) =
            let rec ploop r i =
                if i >= s.Length then r else
                    let c = s.[i]
                    let n = if c >= '0' && c <= '9'
                            then 10 * r + getnum c else r
                    ploop n (i + 1)
            ploop 0 0
        let inline runop f a b =
            match f with
            | '+' -> a + b
            | '-' -> a - b
            | '*' -> a * b
            | '/' -> a / b
            | _ -> failwith "illegal op"
        let rec loop (ops : string []) r i j =
            if i >= ops.Length then r else
                let n = parse ops.[i]
                loop ops (runop ops.[j].[0] r n) (i + 2) (j + 2)
        let ops = s.Split ' '
        loop ops (parse ops.[0]) 2 1

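This ran in 361ms on my laptop, so the saving was significant, but the function is still an order of magnitude slower than my own creation (see below), leading to the conclusion that the initial string splitting takes the bulk of the time.

Just for comparison's sake, I also translated the OP's function from the posting you referenced to F#: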

    let calc_OP (s : string) =
        let operate r op x =
            match op with
            | '+' -> r + x
            | '-' -> r - x
            | '*' -> r * x
            | '/' -> r / x
            | _ -> failwith "illegal op"
        let rec loop c n r =
            if n = -1 then
                operate r s.[c + 1] (int (s.Substring (c + 3)))
            else
                operate r s.[c + 1] (int (s.Substring (c + 3, n - (c + 2))))
                |> loop n (s.IndexOf (' ', n + 4))
        let c = s.IndexOf ' '
        loop c (s.IndexOf (' ', c + 4)) (int (s.Substring (0, c)))

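This ran in 238ms on my laptop, so it seems that using substrings is not as slow as splitting the string, but it is still far from optimal.

Finally, here is my own implementation of an expression interpreter. It takes into account that the fastest way of processing is to go through the input manually, character by character, iterating over the string only once, and that heap allocation (by way of creating new objects, such as strings or arrays) should be avoided inside the loop as much as possible: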

    let calc_Dumetrulo (s : string) =
        let inline getnum (c : char) = int c - 48
        let inline isnum c = c >= '0' && c <= '9'
        let inline isop c =
            c = '+' || c = '-' || c = '*' || c = '/'
        let inline runop f a b =
            match f with
            | '+' -> a + b
            | '-' -> a - b
            | '*' -> a * b
            | '/' -> a / b
            | _ -> failwith "illegal op"
        let rec parse i f a c =
            if i >= s.Length then
                if c = -1 then a else runop f a c
            else
                let k, j = s.[i], i + 1
                if isnum k then
                    let n = if c = -1 then 0 else c
                    parse j f a (10 * n + getnum k)
                elif isop k then parse j k a c
                elif c = -1 then parse j f a c
                else parse j f (runop f a c) -1
        parse 0 '+' 0 -1

This ran in 28ms on my laptop. You can express it the same way in C#, except for the tail recursion, which should be expressed as a for or while loop:

    static int RunOp(char op, int a, int b)
    {
        switch (op)
        {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            case '/': return a / b;
            default: throw new ArgumentOutOfRangeException("op");
        }
    }

    static int Calc_Dumetrulo(string s)
    {
        int a = 0, c = -1;
        char op = '+';
        for (int i = 0; i < s.Length; i++)
        {
            char k = s[i];
            if (k >= '0' && k <= '9')
                c = (c == -1 ? 0 : 10 * c) + ((int)k - 48);
            else if (k == '+' || k == '-' || k == '*' || k == '/')
                op = k;
            else if (c == -1) continue;
            else
            {
                a = RunOp(op, a, c);
                c = -1;
            }
        }
        if (c != -1) a = RunOp(op, a, c);
        return a;
    }
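When swapping in a faster evaluator it is worth confirming that it still agrees with the simple Split-based approach on small inputs. The sketch below is one possible way to do that; `Check`, `Reference` and `Agree` are hypothetical names introduced here, and `Reference` is a compact variant of the question's approach without its division-by-zero handling.

```csharp
using System;

static class Check
{
    // Slow Split-based evaluator, used only as a reference result.
    // Operators are applied strictly left to right, with no precedence.
    public static int Reference(string input)
    {
        string[] t = input.Split(' ');
        int result = int.Parse(t[0]);
        for (int i = 2; i < t.Length; i += 2)
        {
            int n = int.Parse(t[i]);
            switch (t[i - 1][0])
            {
                case '+': result += n; break;
                case '-': result -= n; break;
                case '*': result *= n; break;
                case '/': result /= n; break;
                default: throw new ArgumentException("Unknown operator");
            }
        }
        return result;
    }

    // True when both evaluators return the same value for every case.
    public static bool Agree(Func<string, int> a, Func<string, int> b,
                             params string[] cases)
    {
        foreach (string c in cases)
            if (a(c) != b(c)) return false;
        return true;
    }
}
```

For example, `Check.Agree(Check.Reference, Calc_Dumetrulo, "1 + 2 * 3", "8 / 2 - 1")` compares the two on inputs where left-to-right evaluation matters: "1 + 2 * 3" evaluates to 9 here, not 7.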

Why is this code taking so long?
