我想分配一个长度为N的新数组,并通过重复给定数组来填充它。界面如下所示:
<T> T[] repeat(T[] array, int n);
澄清我的意思这里有一个小例子:
String a = {"a", "b", "c"};
// b = {"a", "b", "c", "a", "b", "c", "a", "b", "c", "a"}
String b = repeat(a, 10);
大多数程序员都会提出以下解决方案(为简化数组生成,选择了特定类型):
public String[] repeat(String[] array, int n) {
String[] repeated = new String[n];
for (int i = 0; i < n; i++) {
repeated[i] = array[i % array.length];
}
return repeated;
}
有更快的方法吗?
答案 0 :(得分:13)
我想出了这个通用的解决方案:
public static <T> T[] repeat(T[] arr, int newLength) {
T[] dup = Arrays.copyOf(arr, newLength);
for (long last = arr.length; last != 0 && last < newLength; last <<= 1) {
System.arraycopy(dup, 0, dup, (int) last, (int) (Math.min(last << 1, newLength) - last));
}
return dup;
}
System.arraycopy
是本地电话。因此它非常快,但并不意味着它是最快的方式。
每个其他解决方案都按元素复制数组元素。我的解决方案复制更大的块。每次迭代都会复制数组中的现有元素,这意味着循环最多只能运行 log2(n)次。
TEST_ARRAY = { "a", "b", "c", "d", "e", "f" }
NEW_LENGTH = 10000
以下是重现结果的基准代码:
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@Measurement(iterations = 10, timeUnit = TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
@Threads(1)
@Warmup(iterations = 5, timeUnit = TimeUnit.NANOSECONDS)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class MyBenchmark {
private static final String[] TEST_ARRAY = { "a", "b", "c", "d", "e", "f" };
private static final int NEW_LENGTH = 10_000;
@Benchmark
public String[] testNativeCall() {
String[] dup = Arrays.copyOf(TEST_ARRAY, NEW_LENGTH);
for (int last = TEST_ARRAY.length; last != 0 && last < NEW_LENGTH; last <<= 1) {
System.arraycopy(dup, 0, dup, last, Math.min(last << 1, NEW_LENGTH) - last);
}
return dup;
}
@Benchmark
public String[] testLoopModulo() {
String[] arr = new String[NEW_LENGTH];
for (int i = 0; i < NEW_LENGTH; i++) {
arr[i] = arr[i % TEST_ARRAY.length];
}
return arr;
}
@Benchmark
public String[] testArrayList() {
List<String> initialLetters = Arrays.asList(TEST_ARRAY);
List<String> results = new ArrayList<>();
int indexOfLetterToAdd = 0;
for (int i = 0; i < 10000; i++) {
results.add(initialLetters.get(indexOfLetterToAdd++));
if (indexOfLetterToAdd == initialLetters.size()) {
indexOfLetterToAdd = 0;
}
}
return results.toArray(new String[results.size()]);
}
@Benchmark
public String[] testLoopReset() {
String result[] = new String[NEW_LENGTH];
for (int i = 0, j = 0; i < NEW_LENGTH && j < TEST_ARRAY.length; i++, j++) {
result[i] = TEST_ARRAY[j];
if (j == TEST_ARRAY.length - 1) {
j = -1;
}
}
return result;
}
@Benchmark
public String[] testStream() {
String[] result = Stream.iterate(TEST_ARRAY, x -> x).flatMap(x -> Stream.of(TEST_ARRAY)).limit(NEW_LENGTH)
.toArray(String[]::new);
return result;
}
}
Benchmark Mode Cnt Score Error Units
MyBenchmark.testNativeCall avgt 30 4154,553 ± 11,242 ns/op
MyBenchmark.testLoopModulo avgt 30 19273,717 ± 235,547 ns/op
MyBenchmark.testArrayList avgt 30 71079,139 ± 2686,136 ns/op
MyBenchmark.testLoopReset avgt 30 18307,368 ± 202,520 ns/op
MyBenchmark.testStream avgt 30 68898,278 ± 2488,104 ns/op
正如您所看到的,本机调用方法是重复数组的最快方法。
此外,我被要求用各种输入对这些方法进行基准测试。
// Array size not fixed anymore - filled with random elements
SIZE = { 100, 1000, 100000, 1000000 }
NEW_LENGTH = { 100, 1000, 100000, 1000000 }
这意味着有SIZE x NEW_LENGTH
次测试,结果如下:
Benchmark (NEW_LENGTH) (SIZE) Mode Cnt Score Error Units
MyBenchmark.testArrayList 100 100 avgt 30 706,274 ± 6,787 ns/op
MyBenchmark.testArrayList 100 1000 avgt 30 692,586 ± 15,076 ns/op
MyBenchmark.testArrayList 100 100000 avgt 30 685,214 ± 6,747 ns/op
MyBenchmark.testArrayList 100 1000000 avgt 30 685,333 ± 5,493 ns/op
MyBenchmark.testArrayList 1000 100 avgt 30 7170,897 ± 63,221 ns/op
MyBenchmark.testArrayList 1000 1000 avgt 30 7180,612 ± 93,280 ns/op
MyBenchmark.testArrayList 1000 100000 avgt 30 6818,585 ± 197,859 ns/op
MyBenchmark.testArrayList 1000 1000000 avgt 30 6810,614 ± 139,456 ns/op
MyBenchmark.testArrayList 100000 100 avgt 30 597614,173 ± 6446,318 ns/op
MyBenchmark.testArrayList 100000 1000 avgt 30 580696,750 ± 5141,845 ns/op
MyBenchmark.testArrayList 100000 100000 avgt 30 598657,608 ± 5126,519 ns/op
MyBenchmark.testArrayList 100000 1000000 avgt 30 595529,027 ± 4981,095 ns/op
MyBenchmark.testArrayList 1000000 100 avgt 30 6836746,484 ± 38848,467 ns/op
MyBenchmark.testArrayList 1000000 1000 avgt 30 6745066,786 ± 57971,469 ns/op
MyBenchmark.testArrayList 1000000 100000 avgt 30 7130391,072 ± 50583,914 ns/op
MyBenchmark.testArrayList 1000000 1000000 avgt 30 8791342,042 ± 172323,938 ns/op
MyBenchmark.testLoopModulo 100 100 avgt 30 301,252 ± 1,195 ns/op
MyBenchmark.testLoopModulo 100 1000 avgt 30 301,988 ± 2,056 ns/op
MyBenchmark.testLoopModulo 100 100000 avgt 30 299,892 ± 1,776 ns/op
MyBenchmark.testLoopModulo 100 1000000 avgt 30 300,468 ± 2,569 ns/op
MyBenchmark.testLoopModulo 1000 100 avgt 30 3277,018 ± 14,880 ns/op
MyBenchmark.testLoopModulo 1000 1000 avgt 30 3275,648 ± 21,742 ns/op
MyBenchmark.testLoopModulo 1000 100000 avgt 30 3258,570 ± 27,360 ns/op
MyBenchmark.testLoopModulo 1000 1000000 avgt 30 3259,617 ± 28,747 ns/op
MyBenchmark.testLoopModulo 100000 100 avgt 30 321483,331 ± 4320,938 ns/op
MyBenchmark.testLoopModulo 100000 1000 avgt 30 326319,662 ± 2419,602 ns/op
MyBenchmark.testLoopModulo 100000 100000 avgt 30 327027,966 ± 3174,011 ns/op
MyBenchmark.testLoopModulo 100000 1000000 avgt 30 319201,057 ± 4472,220 ns/op
MyBenchmark.testLoopModulo 1000000 100 avgt 30 3053122,364 ± 31814,342 ns/op
MyBenchmark.testLoopModulo 1000000 1000 avgt 30 3134151,676 ± 108227,023 ns/op
MyBenchmark.testLoopModulo 1000000 100000 avgt 30 3220082,188 ± 43925,401 ns/op
MyBenchmark.testLoopModulo 1000000 1000000 avgt 30 3204777,236 ± 25365,542 ns/op
MyBenchmark.testLoopReset 100 100 avgt 30 159,828 ± 1,107 ns/op
MyBenchmark.testLoopReset 100 1000 avgt 30 125,461 ± 0,881 ns/op
MyBenchmark.testLoopReset 100 100000 avgt 30 129,912 ± 7,801 ns/op
MyBenchmark.testLoopReset 100 1000000 avgt 30 134,503 ± 7,602 ns/op
MyBenchmark.testLoopReset 1000 100 avgt 30 1809,207 ± 93,642 ns/op
MyBenchmark.testLoopReset 1000 1000 avgt 30 1728,705 ± 70,808 ns/op
MyBenchmark.testLoopReset 1000 100000 avgt 30 1354,887 ± 9,631 ns/op
MyBenchmark.testLoopReset 1000 1000000 avgt 30 1350,327 ± 15,886 ns/op
MyBenchmark.testLoopReset 100000 100 avgt 30 159680,209 ± 2477,183 ns/op
MyBenchmark.testLoopReset 100000 1000 avgt 30 162030,985 ± 1949,660 ns/op
MyBenchmark.testLoopReset 100000 100000 avgt 30 149299,890 ± 1516,486 ns/op
MyBenchmark.testLoopReset 100000 1000000 avgt 30 136059,242 ± 3090,410 ns/op
MyBenchmark.testLoopReset 1000000 100 avgt 30 1407369,992 ± 12979,717 ns/op
MyBenchmark.testLoopReset 1000000 1000 avgt 30 1447001,173 ± 14979,769 ns/op
MyBenchmark.testLoopReset 1000000 100000 avgt 30 1463913,706 ± 12564,617 ns/op
MyBenchmark.testLoopReset 1000000 1000000 avgt 30 1404701,860 ± 21587,436 ns/op
MyBenchmark.testNativeCall 100 100 avgt 30 58,306 ± 0,669 ns/op
MyBenchmark.testNativeCall 100 1000 avgt 30 57,441 ± 0,590 ns/op
MyBenchmark.testNativeCall 100 100000 avgt 30 57,595 ± 0,386 ns/op
MyBenchmark.testNativeCall 100 1000000 avgt 30 60,196 ± 1,995 ns/op
MyBenchmark.testNativeCall 1000 100 avgt 30 450,808 ± 8,259 ns/op
MyBenchmark.testNativeCall 1000 1000 avgt 30 558,079 ± 5,724 ns/op
MyBenchmark.testNativeCall 1000 100000 avgt 30 557,246 ± 4,873 ns/op
MyBenchmark.testNativeCall 1000 1000000 avgt 30 565,005 ± 9,696 ns/op
MyBenchmark.testNativeCall 100000 100 avgt 30 73074,811 ± 3332,432 ns/op
MyBenchmark.testNativeCall 100000 1000 avgt 30 70970,603 ± 2693,394 ns/op
MyBenchmark.testNativeCall 100000 100000 avgt 30 69907,864 ± 2945,072 ns/op
MyBenchmark.testNativeCall 100000 1000000 avgt 30 74041,205 ± 2599,841 ns/op
MyBenchmark.testNativeCall 1000000 100 avgt 30 790679,353 ± 15672,480 ns/op
MyBenchmark.testNativeCall 1000000 1000 avgt 30 812660,137 ± 25490,999 ns/op
MyBenchmark.testNativeCall 1000000 100000 avgt 30 838094,181 ± 12374,194 ns/op
MyBenchmark.testNativeCall 1000000 1000000 avgt 30 925567,535 ± 19091,943 ns/op
MyBenchmark.testStream 100 100 avgt 30 810,262 ± 54,519 ns/op
MyBenchmark.testStream 100 1000 avgt 30 1344,998 ± 14,792 ns/op
MyBenchmark.testStream 100 100000 avgt 30 159901,562 ± 3453,210 ns/op
MyBenchmark.testStream 100 1000000 avgt 30 1407506,571 ± 419985,287 ns/op
MyBenchmark.testStream 1000 100 avgt 30 6464,099 ± 169,665 ns/op
MyBenchmark.testStream 1000 1000 avgt 30 5869,457 ± 260,297 ns/op
MyBenchmark.testStream 1000 100000 avgt 30 165394,656 ± 4943,362 ns/op
MyBenchmark.testStream 1000 1000000 avgt 30 1352900,959 ± 412849,634 ns/op
MyBenchmark.testStream 100000 100 avgt 30 423531,274 ± 3944,801 ns/op
MyBenchmark.testStream 100000 1000 avgt 30 391727,181 ± 5341,826 ns/op
MyBenchmark.testStream 100000 100000 avgt 30 427462,700 ± 7517,953 ns/op
MyBenchmark.testStream 100000 1000000 avgt 30 981304,769 ± 10206,849 ns/op
MyBenchmark.testStream 1000000 100 avgt 30 4528465,859 ± 72959,405 ns/op
MyBenchmark.testStream 1000000 1000 avgt 30 4121720,516 ± 60283,781 ns/op
MyBenchmark.testStream 1000000 100000 avgt 30 5920334,609 ± 63051,631 ns/op
MyBenchmark.testStream 1000000 1000000 avgt 30 6227476,270 ± 84066,493 ns/op
正如预期的那样,本机调用总是领先(大约比循环版本快2倍)。
答案 1 :(得分:2)
您不需要在每个循环上Min
,您可以将其提取到之后。您还可以在每个循环中将偏移量加倍两次。
然而,我的速度似乎并不快,有时速度更快,有时也不快,但它可以替代Min的呼叫。
Benchmark Mode Samples Score Score error Units
c.e.MyBenchmark.testNativeCall avgt 30 4027.890 26.544 ns/op
c.e.MyBenchmark.westonsRepeat avgt 30 4035.817 49.168 ns/op
代码:
@Benchmark
public String[] westonsRepeat() {
String[] sourceArray = TEST_ARRAY;
int newLength = NEW_LENGTH;
int offset = sourceArray.length;
String[] dup = Arrays.copyOf(sourceArray, newLength);
if (offset == 0 || offset >= newLength) return dup;
int next = offset << 1;
while (next < newLength) {
System.arraycopy(dup, 0, dup, offset, offset);
offset = next;
next <<= 1;
}
System.arraycopy(dup, 0, dup, offset, newLength - offset);
return dup;
}
我尝试的另一件事是计算我可以安全地加倍的次数,而有些事情是我认为有趣的表现的死胡同:
int timesCanDouble = (int)(log2(newLength) - log2(offset));
=>
int timesCanDouble = (int)(log2(newLength/offset));
=>
int timesCanDouble = 31 - numberOfLeadingZeros(newLength/offset));