I have been investigating why DataInputStream.readByte() is so slow, and found something interesting that I cannot explain. I am using jdk1.7.0_40 on Windows 7 64-bit.
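Suppose we have a huge byte array and want to read data from it. Let's compare 4 ways of reading the array byte by byte:
1. a plain loop over the array (testLoop in the code below)
2. ByteArrayInputStream -> DataInputStream
3. ByteArrayInputStream -> our own DataInputStream implementation (MyDataInputStream)
4. ByteArrayInputStream read through a copy of the readByte() method from DataInputStream

I found the following results (after many iterations of the test loop):
DataInputStream took approx. 2555898090 ns
MyDataInputStream took approx. 2630664298 ns
the readByte() copy took approx. 309265568 ns

In other words, there is a strange optimization problem: the same operation takes roughly 10 times longer when it goes through an object's method call than through the "native" implementation.
Question: why?
For reference, the test code: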
@Test
public void testBytes1() throws IOException {
    byte[] bytes = new byte[1_000_000_000];
    Random r = new Random();
    for (int i = 0; i < bytes.length; i++)
        bytes[i] = (byte) r.nextInt();
    do {
        System.out.println();
        bytes[r.nextInt(1_000_000_000)] = (byte) r.nextInt();
        testLoop(bytes);
        testDis(bytes);
        testMyDis(bytes);
        testViaMethod(bytes);
    } while (true);
}

private void testDis(byte[] bytes) throws IOException {
    long time1 = System.nanoTime();
    long c = 0;
    try (ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
         DataInputStream dis = new DataInputStream(bais)) {
        for (int i = 0; i < bytes.length; i++) {
            c += dis.readByte();
        }
    }
    long time2 = System.nanoTime();
    System.out.println("Dis: \t\t\t\t" + (time2 - time1) + "\t\t\t\t" + c);
}

private void testMyDis(byte[] bytes) throws IOException {
    long time1 = System.nanoTime();
    long c = 0;
    try (ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
         MyDataInputStream dis = new MyDataInputStream(bais)) {
        for (int i = 0; i < bytes.length; i++) {
            c += dis.readByte();
        }
    }
    long time2 = System.nanoTime();
    System.out.println("My Dis: \t\t\t" + (time2 - time1) + "\t\t\t\t" + c);
}

private void testViaMethod(byte[] bytes) throws IOException {
    long time1 = System.nanoTime();
    long c = 0;
    try (ByteArrayInputStream bais = new ByteArrayInputStream(bytes)) {
        for (int i = 0; i < bytes.length; i++) {
            c += readByte(bais);
        }
    }
    long time2 = System.nanoTime();
    System.out.println("Via method: \t\t" + (time2 - time1) + "\t\t\t\t" + c);
}

private void testLoop(byte[] bytes) {
    long time1 = System.nanoTime();
    long c = 0;
    for (int i = 0; i < bytes.length; i++) {
        c += bytes[i];
    }
    long time2 = System.nanoTime();
    System.out.println("Loop: \t\t\t\t" + (time2 - time1) + "\t\t\t\t" + c);
}

public final byte readByte(InputStream in) throws IOException {
    int ch = in.read();
    if (ch < 0)
        throw new EOFException();
    return (byte)(ch);
}

static class MyDataInputStream implements Closeable {
    InputStream in;

    MyDataInputStream(InputStream in) {
        this.in = in;
    }

    public final byte readByte() throws IOException {
        int ch = in.read();
        if (ch < 0)
            throw new EOFException();
        return (byte)(ch);
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
P.S. An update for those who doubt my results: here is the printout, produced with -XX:+PrintCompilation -verbose:gc -XX:CICompilerCount=1
37 1 java.lang.String::hashCode (55 bytes)
41 2 java.lang.String::charAt (29 bytes)
43 3 java.lang.String::indexOf (70 bytes)
49 4 java.lang.AbstractStringBuilder::ensureCapacityInternal (16 bytes)
52 5 java.lang.AbstractStringBuilder::append (29 bytes)
237 6 java.util.Random::nextInt (7 bytes)
237 9 n sun.misc.Unsafe::compareAndSwapLong (native)
238 7 java.util.concurrent.atomic.AtomicLong::get (5 bytes)
238 8 java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes)
239 10 java.util.Random::next (47 bytes)
239 11 % fias.TestArrays::testBytes1 @ 15 (77 bytes)
9645 11 % fias.TestArrays::testBytes1 @ -2 (77 bytes) made not entrant
9646 12 % fias.TestArrays::testLoop @ 10 (77 bytes)
9964 12 % fias.TestArrays::testLoop @ -2 (77 bytes) made not entrant
Loop: 318726397 -500090432
9965 13 java.io.DataInputStream::readByte (23 bytes)
9966 14 s java.io.ByteArrayInputStream::read (36 bytes)
9967 15 % ! fias.TestArrays::testDis @ 37 (279 bytes)
Dis: 2684374258 -500090432
12651 16 fias.TestArrays$MyDataInputStream::readByte (23 bytes)
12652 17 % ! fias.TestArrays::testMyDis @ 37 (279 bytes)
My Dis: 2675570541 -500090432
15327 18 fias.TestArrays::readByte (20 bytes)
15328 19 % ! fias.TestArrays::testViaMethod @ 23 (179 bytes)
Via method: 2367507141 -500090432
17694 20 fias.TestArrays::testLoop (77 bytes)
17699 21 % fias.TestArrays::testLoop @ 10 (77 bytes)
Loop: 374525891 -500090567
18069 22 ! fias.TestArrays::testDis (279 bytes)
Dis: 2674626125 -500090567
20745 23 ! fias.TestArrays::testMyDis (279 bytes)
My Dis: 2671418683 -500090567
23417 24 ! fias.TestArrays::testViaMethod (179 bytes)
Via method: 2359181776 -500090567
Loop: 315081855 -500090663
Dis: 2558738649 -500090663
My Dis: 2627056034 -500090663
Via method: 311692727 -500090663
Loop: 317813286 -500090778
Dis: 2565161726 -500090778
My Dis: 2630665760 -500090778
Via method: 314594434 -500090778
Loop: 313695660 -500090797
Dis: 2568251556 -500090797
My Dis: 2635236578 -500090797
Via method: 311882312 -500090797
Loop: 316781686 -500090929
Dis: 2563535623 -500090929
My Dis: 2638487613 -500090929
Via method: 313170789 -500090929
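Answer 0 (score: 3)
Surprisingly, the reason is the try-with-resources statement on MyDataInputStream / DataInputStream. If we move the initialization inside the try block, the performance matches the loop / method call variants: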
private void testMyDis(byte[] bytes) throws IOException {
    final long time1 = System.nanoTime();
    long c = 0;
    try (ByteArrayInputStream bais = new ByteArrayInputStream(bytes)) {
        final MyDataInputStream dis = new MyDataInputStream(bais);
        for (int i = 0; i < bytes.length; i++) {
            c += dis.readByte();
        }
    }
    final long time2 = System.nanoTime();
    System.out.println("My Dis: \t\t\t" + (time2 - time1) + "\t\t\t\t" + c);
}
I think that with this unnecessary resource the JIT is unable to apply Range Check Elimination.
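To try the two variants side by side, here is a minimal self-contained sketch. The class name TryResourceComparison, the method names, the smaller array size and the fixed random seed are my own choices for illustration and are not part of the question's test. The timings it prints will depend on the JVM version and flags, so treat it as a way to check the effect locally rather than as proof of the cause.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Random;

class TryResourceComparison {

    // Same wrapper as MyDataInputStream in the question.
    static final class MyDataInputStream implements Closeable {
        private final InputStream in;

        MyDataInputStream(InputStream in) {
            this.in = in;
        }

        byte readByte() throws IOException {
            int ch = in.read();
            if (ch < 0)
                throw new EOFException();
            return (byte) ch;
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }

    // Variant from the question: the wrapper is a second managed resource.
    static long sumWithTwoResources(byte[] bytes) throws IOException {
        long c = 0;
        try (ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
             MyDataInputStream dis = new MyDataInputStream(bais)) {
            for (int i = 0; i < bytes.length; i++) {
                c += dis.readByte();
            }
        }
        return c;
    }

    // Variant from this answer: the wrapper is a plain local inside the block.
    static long sumWithLocalWrapper(byte[] bytes) throws IOException {
        long c = 0;
        try (ByteArrayInputStream bais = new ByteArrayInputStream(bytes)) {
            MyDataInputStream dis = new MyDataInputStream(bais);
            for (int i = 0; i < bytes.length; i++) {
                c += dis.readByte();
            }
        }
        return c;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a smaller array than the question's, so it fits a default heap.
        byte[] bytes = new byte[100_000_000];
        new Random(42).nextBytes(bytes);
        // Several rounds, so that later rounds run compiled code.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            long a = sumWithTwoResources(bytes);
            long t1 = System.nanoTime();
            long b = sumWithLocalWrapper(bytes);
            long t2 = System.nanoTime();
            System.out.println("two resources: " + (t1 - t0) + " ns, local wrapper: "
                    + (t2 - t1) + " ns, sums equal: " + (a == b));
        }
    }
}
```

Both methods return the same sum, so any difference in the printed times comes from how the loop is compiled, not from the work done.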
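Answer 1 (score: -1)
The answer has been in the tests all along. The extra cost comes from the function calls. We are usually encouraged to write short, clean functions rather than long ones, on the assumption that a function call costs very little. But a call still costs more than a direct memory access.
In this case, for testLoop we can estimate the cost of a memory read at ~3 ns (including the integer arithmetic, such as i++ and c +=). The other variants add 2 extra layers of function calls, at about 15 ns per call. Realistically, we can still say that function calls are very fast.
The catch is that each run makes 2 000 000 000 function calls, which is a really big number.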
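To put such estimates on a concrete footing, the per-iteration cost can be computed by dividing an elapsed time by the iteration count. Below is a small sketch that uses the first-round timings from the question's printout; the class and method names are made up for illustration. On those figures the plain loop comes to roughly 0.3 ns per iteration and the stream variants to roughly 2.4 to 2.7 ns, so the per-call numbers quoted above are best read as loose estimates.

```java
import java.util.Locale;

class PerIterationCost {

    // The question's array has 1_000_000_000 elements, one read per element.
    static final long ITERATIONS = 1_000_000_000L;

    // Average cost of one loop iteration, in nanoseconds.
    static double nanosPerIteration(long elapsedNanos) {
        return (double) elapsedNanos / ITERATIONS;
    }

    public static void main(String[] args) {
        // First-round timings copied from the question's printout.
        String[] labels = {"Loop", "Dis", "My Dis", "Via method"};
        long[] elapsed = {318726397L, 2684374258L, 2675570541L, 2367507141L};
        for (int i = 0; i < labels.length; i++) {
            System.out.println(String.format(Locale.ROOT, "%-12s %.2f ns/iteration",
                    labels[i], nanosPerIteration(elapsed[i])));
        }
    }
}
```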
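Here is another test case that demonstrates the cost of function calls: use no stream at all and just read the bytes through additional function calls.
Add the following function: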
public final long getByte(long c, byte value, int dep) {
    if (dep > 0) {
        return getByte(c, value, dep - 1);
    }
    return c + value;
}
Then call it in testLoop, like this:
c = getByte( c, bytes[i], 2);
The cost of the loop then rises to the same level:
Loop: 4044010718 -499870245
Dis: 5182272442 -499870245
My Dis: 5228065271 -499870245
Via method: 655108198 -499870245
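For anyone who wants to reproduce this experiment outside the original test class, here is a minimal self-contained sketch. The class name CallDepthLoop, the helper names, the array size and the seed are assumptions of mine. Also note that getByte is static here, whereas the snippet above is an instance method, and that may change what the JIT does with it.

```java
import java.util.Random;

class CallDepthLoop {

    // getByte from this answer: recurses dep times, then adds value to c.
    static long getByte(long c, byte value, int dep) {
        if (dep > 0) {
            return getByte(c, value, dep - 1);
        }
        return c + value;
    }

    // Baseline: direct array access, as in the question's testLoop.
    static long sumDirect(byte[] bytes) {
        long c = 0;
        for (int i = 0; i < bytes.length; i++) {
            c += bytes[i];
        }
        return c;
    }

    // Same sum, but every element goes through dep + 1 calls to getByte.
    static long sumViaCalls(byte[] bytes, int dep) {
        long c = 0;
        for (int i = 0; i < bytes.length; i++) {
            c = getByte(c, bytes[i], dep);
        }
        return c;
    }

    public static void main(String[] args) {
        // Assumption: a smaller array than the question's, so it fits a default heap.
        byte[] bytes = new byte[100_000_000];
        new Random(42).nextBytes(bytes);
        // Several rounds, so that later rounds run compiled code.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            long a = sumDirect(bytes);
            long t1 = System.nanoTime();
            long b = sumViaCalls(bytes, 2);
            long t2 = System.nanoTime();
            System.out.println("direct: " + (t1 - t0) + " ns, via calls: "
                    + (t2 - t1) + " ns, sums equal: " + (a == b));
        }
    }
}
```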