Question

I wrote a class (a greedy strategy). At first I used a sort, which takes O(n log n):

Collections.sort(array, new SortingObjectsWithProbabilityField());

Then I used the insert method of a binary search tree, which takes O(h), where h is the height of the tree.

For different values of n, the running times are:

n,running time
17,515428
33,783340
65,540572
129,1285080
257,2052216
513,4299709

I don't think this is right, because the running time should increase almost every time n increases.

This is the code that measures the running time:

// randList and timeForGreedy are assumed to be declared elsewhere (not shown).
int exponent = -1;

// The compound assignment truncates to int, so n takes the values 2, 3, 5, 9, 17, 33, ...
for (int n = 2; n < 1000; n += Math.pow(2, exponent)) {
    for (int j = 1; j <= 3; j++) {
        Random rand = new Random();
        for (int i = 0; i < n; i++) {
            Element e = new Element(rand.nextInt(100) + 1, rand.nextInt(100) + 1, 0);
            // Bump the digit whenever it collides with an earlier element.
            for (int k = 0; k < i; k++) {
                if (e.getDigit() == randList.get(k).getDigit()) {
                    e.setDigit(e.getDigit() + 1);
                }
            }
            randList.add(e);
        }

        // Normalize the probabilities so that they sum to 1.
        double sum = 0.0;
        for (int i = 0; i < randList.size(); i++) {
            sum += randList.get(i).getProbability();
        }
        for (Element i : randList) {
            i.setProbability(i.getProbability() / sum);
        }

        // Time only the construction of GreedyVersion.
        long t2 = System.nanoTime();
        GreedyVersion greedy = new GreedyVersion((ArrayList<Element>) randList);
        long t3 = System.nanoTime();
        timeForGreedy = timeForGreedy + t3 - t2;
    }
    // Average over the three runs.
    System.out.println(n + "," + "," + timeForGreedy / 3);
    exponent++;
}

Thanks

Answer 1

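Your data appears to fit O(n log n) roughly, judging from a plot of the data against a fitted curve. Note that the curve looks almost linear, because log n is tiny compared with n and grows very slowly. For example, at the largest value, n = 513, log2 n is only about 9.003.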
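To see why an n log n curve looks almost straight over this range, you can check how much n log2 n grows each time n doubles. The sketch below is purely illustrative (the class and method names are made up): a linear function would grow by a factor of exactly 2, while n log2 n grows by 2(1 + 1/log2 n), which creeps toward 2 as n gets larger.

```java
// Illustrative only: the class and method names are made up for this sketch.
class DoublingRatio {
    // Log base 2, via the change-of-base rule.
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Growth factor of n*log2(n) when n doubles; equals 2 * (1 + 1/log2(n)).
    static double ratio(int n) {
        return (2.0 * n * log2(2.0 * n)) / (n * log2(n));
    }

    public static void main(String[] args) {
        for (int n = 16; n <= 512; n *= 2) {
            System.out.printf("n=%d  log2(n)=%.3f  f(2n)/f(n)=%.3f%n",
                    n, log2(n), ratio(n));
        }
    }
}
```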

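There are ways to get more accurate timings, which may make the curve fit the data points better. For example, use a larger sample of random inputs (I suggest at least 10, or 100 if possible) and run several iterations per data set (5 is an acceptable number) to smooth out timer inaccuracy. You can time all the iterations for the same n with a single start/stop timer and then divide by the number of runs to get a more accurate data point. Just make sure you generate all the data sets first and store them, and only then run them.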
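A minimal sketch of that procedure might look like the following. It is not your code: `TimingSketch` and its methods are hypothetical, `Collections.sort` on a list of integers stands in for whatever you are actually measuring, and the untimed warm-up pass is an extra precaution of mine against JIT compilation landing inside the measurements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical harness: Collections.sort stands in for the code under test.
class TimingSketch {
    // Build all inputs up front so generation never happens inside the timed region.
    static List<List<Integer>> makeDatasets(int n, int samples, long seed) {
        Random rand = new Random(seed);
        List<List<Integer>> datasets = new ArrayList<>();
        for (int s = 0; s < samples; s++) {
            List<Integer> data = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                data.add(rand.nextInt(100) + 1);
            }
            datasets.add(data);
        }
        return datasets;
    }

    // Average nanoseconds per run, using one start/stop pair for all runs.
    static double averageNanos(List<List<Integer>> datasets, int iterations) {
        // Sorting mutates its argument, so copy every input for every iteration first.
        List<List<Integer>> work = new ArrayList<>();
        for (int it = 0; it < iterations; it++) {
            for (List<Integer> d : datasets) {
                work.add(new ArrayList<>(d));
            }
        }
        long start = System.nanoTime();
        for (List<Integer> w : work) {
            Collections.sort(w);
        }
        long stop = System.nanoTime();
        return (double) (stop - start) / work.size();
    }

    public static void main(String[] args) {
        // Untimed warm-up pass (an assumption of this sketch, not part of the advice above).
        averageNanos(makeDatasets(512, 10, 0L), 5);
        for (int n = 16; n <= 512; n *= 2) {
            double avg = averageNanos(makeDatasets(n, 10, 42L), 5);
            System.out.println(n + "," + avg);
        }
    }
}
```

Copying the inputs before starting the timer keeps both the data generation and the copying out of the measured interval.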

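Sampling n at powers of 2 is a good choice. You may want to subtract 1 so that the values are exact powers of 2, not that it will make any real difference.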
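If you do want exact powers of 2, a shift-based loop is one simple way to produce them. This is only a sketch; `SampleSizes` and `powersOfTwoBelow` are names I made up, and the bound of 1000 is taken from the loop in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for generating sample sizes.
class SampleSizes {
    // Returns 2, 4, 8, ... for every power of two below limit.
    // Assumes limit is at most 2^30 so that the shift cannot overflow.
    static List<Integer> powersOfTwoBelow(int limit) {
        List<Integer> sizes = new ArrayList<>();
        for (int n = 2; n < limit; n <<= 1) {
            sizes.add(n);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(powersOfTwoBelow(1000));
    }
}
```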

For reference, this is the gnuplot script used to generate the graph:

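set terminal png
set output 'graph.png'
# x is n (up to 513); y is the running time (up to about 4.3 million)
set xrange [0:600]
set yrange [0:5000000]
f1(x) = a1*x*log(x)/log(2)
a1 = 1000
# Fit first so that the plotted curve uses the fitted value of a1
fit f1(x) 'time.dat' via a1
plot 'time.dat' title 'Actual runtimes', \
    f1(x) title 'Fitted curve: O(nlogn)'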

Answer 2

Relating asymptotic complexity to running time is not easy. If the sample is too small, many other things will affect your timings.

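To get more accurate times, you should run your algorithm K times per instance (for example, K runs for n = 17, K runs for n = 33, and so on) and take the average time as the sample point (for example, with K = 100).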

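That said, it looks right. You can plot n log(n) against your times and you will see that, although the scales differ, they grow at a similar rate. Still, there are too few sample points to be sure...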
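One quick way to make that comparison without plotting is to divide each measured time by n log2 n: if the times really grow like n log n, the quotient should settle toward a constant as n grows. The sketch below uses the six data points from the question; `NLogNCheck` and `ratio` are hypothetical names. With so few points and such small inputs, expect the quotient to be noisy, especially at the smallest n.

```java
// Hypothetical check: measured time divided by n*log2(n).
class NLogNCheck {
    // Log base 2, via the change-of-base rule.
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Roughly constant across n if the time grows like n log n.
    static double ratio(int n, long nanos) {
        return nanos / (n * log2(n));
    }

    public static void main(String[] args) {
        // Data points copied from the question.
        int[] ns = {17, 33, 65, 129, 257, 513};
        long[] times = {515428L, 783340L, 540572L, 1285080L, 2052216L, 4299709L};
        for (int i = 0; i < ns.length; i++) {
            System.out.printf("%d,%.1f%n", ns[i], ratio(ns[i], times[i]));
        }
    }
}
```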

Does the running time match O(n log n)?
