Why does Java's String.indexOf() outperform the same logic implemented in a user-defined class?

Time: 2014-06-17 05:20:54

Tags: java string performance

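I have a question about the performance of Java's String.indexOf(String subString).

I wrote a class that compares calling String.indexOf(String subString) against a copy of the internal indexOf() taken from the String source code and called with exactly the same parameters.

Calling String.indexOf() directly appears to be about 4 times faster, even though its call stack is 2 frames deeper.

My JVM is JDK 1.7.0_40 64-bit (HotSpot on Windows). My machine runs Windows with an i7-4600U CPU and 16 GB of RAM.

Here is the code: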

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

public class TestIndexOf implements Runnable {

  final static String s0 = "This is my search string, it is pretty long so can test the speed of the search";
  final static String s1 = "speed of the search";
  final static char[] c0 = s0.toCharArray();
  final static char[] c1 = s1.toCharArray();
  final static byte[] b0 = s0.getBytes();
  final static byte[] b1 = s1.getBytes();

  static AtomicBoolean EXIT = new AtomicBoolean(false);
  static AtomicLong TOTAL = new AtomicLong(0);

  @Override
  public void run() {
    long count = 0;
    try {
      for (;;) {
        // Case 1, search as byte[]
        int idx = indexOf(b0, 0, b0.length, b1, 0, b1.length, 0);
        // Case 2, search as char[]
        // int idx = indexOf(c0, 0, c0.length, c1, 0, c1.length, 0);
        // Case 3, search as String (using String.indexOf())
        // int idx = s0.indexOf(s1);
        if (idx >= 0) {
          count ++;
        }
        if (EXIT.get()) {
          break;
        }
      }
      TOTAL.addAndGet(count);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }

  /* byte version of indexOf, modified from Java JDK source */
  static int indexOf(byte[] source, int sourceOffset, int sourceCount,
      byte[] target, int targetOffset, int targetCount,
      int fromIndex) {
    if (fromIndex >= sourceCount) {
      return (targetCount == 0 ? sourceCount : -1);
    }
    if (fromIndex < 0) {
      fromIndex = 0;
    }
    if (targetCount == 0) {
      return fromIndex;
    }

    byte first  = target[targetOffset];
    int max = sourceOffset + (sourceCount - targetCount);

    for (int i = sourceOffset + fromIndex; i <= max; i++) {
      /* Look for first character. */
      if (source[i] != first) {
        while (++i <= max && source[i] != first) {
          ;
        }
      }

      /* Found first character, now look at the rest of v2 */
      if (i <= max) {
        int j = i + 1;
        int end = j + targetCount - 1;
        for (int k = targetOffset + 1; j < end && source[j] ==
            target[k]; j++, k++) {
          ;
        }

        if (j == end) {
          /* Found whole string. */
          return i - sourceOffset;
        }
      }
    }
    return -1;
  }

  /* char version of indexOf, directly copied from JDK's String class */
  static int indexOf(char[] source, int sourceOffset, int sourceCount,
      char[] target, int targetOffset, int targetCount,
      int fromIndex) {
    if (fromIndex >= sourceCount) {
      return (targetCount == 0 ? sourceCount : -1);
    }
    if (fromIndex < 0) {
      fromIndex = 0;
    }
    if (targetCount == 0) {
      return fromIndex;
    }

    char first  = target[targetOffset];
    int max = sourceOffset + (sourceCount - targetCount);

    for (int i = sourceOffset + fromIndex; i <= max; i++) {
      /* Look for first character. */
      if (source[i] != first) {
        while (++i <= max && source[i] != first) {
          ;
        }
      }

      /* Found first character, now look at the rest of v2 */
      if (i <= max) {
        int j = i + 1;
        int end = j + targetCount - 1;
        for (int k = targetOffset + 1; j < end && source[j] ==
            target[k]; j++, k++) {
          ;
        }

        if (j == end) {
          /* Found whole string. */
          return i - sourceOffset;
        }
      }
    }
    return -1;
  }


  public static void main(String[] args) throws Exception {
    int threads = 4;
    ExecutorService executorService = Executors.newFixedThreadPool(threads);
    for(int i=0; i<threads; i++) {
      executorService.execute(new TestIndexOf());
    }
    Thread.sleep(10000);
    EXIT.set(true);
    System.out.println("STOPPED");
    Thread.sleep(1000);
    System.out.println("Count = " + TOTAL.get());
    System.exit(0);
  }
}

These are the results I got (2 samples per case, each a 10-second run with 4 threads):

byte[]: 224848726, 225011695

char[]: 224707442, 224707442

String: 898161092, 897897572

What is the magic in String.indexOf()? Is it getting hardware acceleration? :P

1 Answer:

Answer 0: (score: 5)

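The JVM applies special optimizations to certain methods in the standard library (HotSpot treats them as intrinsics). One of these replaces calls to String.indexOf with efficient inline assembly, so the JIT does not compile the Java source of the method the way it compiles your copy. It can even take advantage of SSE4.2 instructions. This is most likely what causes the difference.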

For details, see: src/share/vm/opto/library_call.cpp
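To see the effect in isolation, here is a minimal single-threaded sketch. It is not the asker's benchmark: the class name IndexOfCheck and the helpers naiveIndexOf, timeNaive and timeBuiltin are hypothetical names made up for this illustration, and the search loop is a simplified rewrite rather than the JDK source. It first checks that the hand-written search and String.indexOf() agree on the result, then times both over several rounds so that the JIT has a chance to warm up. The absolute numbers will vary by JVM and CPU, so none are quoted here.

```java
public class IndexOfCheck {

    // Hypothetical helper: plain-Java substring search over char[] with the
    // same result contract as String.indexOf(String) for fromIndex = 0.
    static int naiveIndexOf(char[] source, char[] target) {
        if (target.length == 0) {
            return 0;
        }
        int max = source.length - target.length;
        for (int i = 0; i <= max; i++) {
            int k = 0;
            while (k < target.length && source[i + k] == target[k]) {
                k++;
            }
            if (k == target.length) {
                return i; // whole target matched starting at i
            }
        }
        return -1;
    }

    // Times the hand-written search; the sum is consumed so the JIT cannot drop the loop.
    static long timeNaive(char[] source, char[] target, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int n = 0; n < iterations; n++) {
            sink += naiveIndexOf(source, target);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink);
        }
        return elapsed;
    }

    // Times String.indexOf(String), the call the JVM may replace with an intrinsic.
    static long timeBuiltin(String source, String target, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int n = 0; n < iterations; n++) {
            sink += source.indexOf(target);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String s0 = "This is my search string, it is pretty long so can test the speed of the search";
        String s1 = "speed of the search";
        char[] c0 = s0.toCharArray();
        char[] c1 = s1.toCharArray();

        // Both implementations must agree before timing means anything.
        if (naiveIndexOf(c0, c1) != s0.indexOf(s1)) {
            throw new IllegalStateException("results differ");
        }

        int iterations = 2000000;
        // Early rounds include JIT warm-up; compare the later ones.
        for (int round = 0; round < 5; round++) {
            long naive = timeNaive(c0, c1, iterations);
            long builtin = timeBuiltin(s0, s1, iterations);
            System.out.println("round " + round
                    + ": naive " + (naive / 1000000) + " ms"
                    + ", String.indexOf " + (builtin / 1000000) + " ms");
        }
    }
}
```

If the later rounds show String.indexOf() well ahead even though both methods return the same index, that is consistent with the JVM substituting its own implementation for the library call. For measurements you intend to rely on, a harness such as JMH is a safer choice than hand-rolled timing loops.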