Question

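I'm trying to solve this exercise http://main.edu.pl/en/archive/amppz/2014/dzi and I have no idea how to improve the performance of my code. The problem occurs when the program has to handle more than 500,000 unique numbers (up to 2,000,000, as in the description). It then takes 1-8 s to loop over all of these numbers. The tests I use are from http://main.edu.pl/en/user.phtml?op=tests&c=52014&task=1263, and I run them with the command program.exe < data.in > result.out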

Description: You are given a sequence of n integers a_1, a_2, ..., a_n. You should determine the number of ordered pairs (i, j) such that i, j ∈ {1, ..., n}, i != j and a_i is a divisor of a_j.
The first line of input contains one integer n (1 <= n <= 2000000). The second line contains a sequence of n integers a_1, a_2, ..., a_n (1 <= a_i <= 2000000).
The first and only line of output should contain one integer, denoting the number of pairs sought.
For the input data:
5
2 4 5 2 6
the correct answer is:
6
Explanation: There are 6 pairs: (1, 2) = 4/2, (1, 4) = 2/2, (1, 5) = 6/2, (4, 1) = 2/2, (4, 2) = 4/2, (4, 5) = 6/2.

For example:
- 2M numbers in total, 635k unique: 345 million iterations in total
- 2M numbers in total, 2M unique: 1,885 million iterations in total
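The second figure agrees with a rough estimate. Assuming that all N = 2,000,000 values are distinct and that the inner loop of the code below runs about sqrt(i) times for each value i, the total is approximately:

```latex
\sum_{i=1}^{N} \sqrt{i} \;\approx\; \int_{0}^{N} \sqrt{x}\,dx \;=\; \tfrac{2}{3} N^{3/2}
\;=\; \tfrac{2}{3}\,\bigl(2 \times 10^{6}\bigr)^{3/2} \;\approx\; 1.886 \times 10^{9}
```

This is only an approximation (the loop bound is the truncated square root), but it explains why the run time grows so quickly with the number of distinct values.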

#include <cstdio>     // scanf, printf
#include <iostream>
#include <cmath>      // sqrt
#include <algorithm>

#include <ctime>      // clock, CLOCKS_PER_SEC


#define COUNT_SAME(count) (((count) - 1) * (count)) // ordered pairs among `count` equal values


int main(int argc, char **argv) {
    // C stdio and iostreams stay synchronized (the default), because scanf/printf are mixed with std::cout below.

    int n; // Total numbers
    scanf("%d", &n);

    clock_t start, finish;
    double  duration;

    int minVal = 2000000;
    long long *countVect = new long long[2000001](); // 1-2,000,000; counts occurrences, zero-initialized by ()

    unsigned long long counter = 0;
    unsigned long long operations = 0;

    int tmp;
    int duplicates = 0;

    for (int i = 0; i < n; i++) {
        scanf("%d", &tmp);

        if (countVect[tmp] > 0) { // Not best way, but works
            ++countVect[tmp];
            ++duplicates;
        } else {
            if (minVal > tmp)
                minVal = tmp;

            countVect[tmp] = 1;
        }
    }

    start = clock();

    int valueJ;
    int sqrtValue, valueIJ;
    int j;

    for (int i = 2000000; i > 0; --i) {
        if (countVect[i] > 0) { // Not every value occurs in the input
            if (countVect[i] > 1) 
                counter += COUNT_SAME(countVect[i]); // Sum same values

            sqrtValue = sqrt(i);

            for (j = 1; j <= sqrtValue; ++j) { // start at 1: i / j can occur in the input even when j < minVal
                if (i % j == 0) {
                    valueIJ = i / j;

                    if (valueIJ != i && countVect[valueIJ] > 0 && valueIJ > sqrtValue)
                        counter += countVect[i] * countVect[valueIJ];

                    if (i != j && countVect[j] > 0)
                        counter += countVect[i] * countVect[j];
                }

                ++operations;
            }
        }
    }

    finish = clock();
    duration = (double)(finish - start) / CLOCKS_PER_SEC;
    printf("Loops time: %2.3f", duration);
    std::cout << "s\n";
    std::cout << "\n\nCounter: " << counter << "\n";
    std::cout << "Total operations: " << operations;

    std::cout << "\nDuplicates: " << duplicates << "/" << n;
    return 0;
}

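I know I shouldn't sort the array at the beginning, but I have no idea how to do it in a better way.

Any tips would be great, thanks!

Here is the improved algorithm: 2M unique numbers in under 0.5 s. Thanks to @PJTraill!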

#include <cstdio>     // scanf, printf
#include <iostream>
#include <cmath>
#include <algorithm>

#include <ctime>      // clock, CLOCKS_PER_SEC


#define COUNT_SAME(count) (((count) - 1) * (count)) // ordered pairs among `count` equal values


int main(int argc, char **argv) {
    // C stdio and iostreams stay synchronized (the default), because scanf/printf are mixed with std::cout below.

    int n; // Total numbers
    scanf("%d", &n);

    clock_t start, finish;
    double  duration;

    int maxVal = 0;
    long long *countVect = new long long[2000001](); // 1-2,000,000; counts occurrences, zero-initialized by ()

    unsigned long long counter = 0;
    unsigned long long operations = 0;

    int tmp;
    int duplicates = 0;

    for (int i = 0; i < n; i++) {
        scanf("%d", &tmp);

        if (countVect[tmp] > 0) { // Not best way, but works
            ++countVect[tmp];
            ++duplicates;
        } else {
            if (maxVal < tmp)
                maxVal = tmp;

            countVect[tmp] = 1;
        }
    }

    start = clock();

    int j;
    int jCounter = 1;

    for (int i = 0; i <= maxVal; ++i) {
        if (countVect[i] > 0) { // Not every value occurs in the input
            if (countVect[i] > 1)
                counter += COUNT_SAME(countVect[i]); // Sum same values

            j = i * ++jCounter;

            while (j <= maxVal) {
                if (countVect[j] > 0)
                    counter += countVect[i] * countVect[j];

                j = i * ++jCounter;
                ++operations;
            }

            jCounter = 1;
        }
    }

    finish = clock();
    duration = (double)(finish - start) / CLOCKS_PER_SEC;
    printf("Loops time: %2.3f", duration);
    std::cout << "s\n";
    std::cout << "\n\nCounter: " << counter << "\n";
    std::cout << "Total operations: " << operations;

    std::cout << "\nDuplicates: " << duplicates << "/" << n;
    return 0;
}
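
A rough estimate of why this version is so much faster, assuming every value from 1 to M = 2,000,000 occurs in the input: for each i the while loop visits about M/i - 1 multiples, so the total follows the harmonic series (γ ≈ 0.5772 is the Euler-Mascheroni constant):

```latex
\sum_{i=1}^{M} \left( \frac{M}{i} - 1 \right) \;\approx\; M \left( \ln M + \gamma - 1 \right)
\;\approx\; 2 \times 10^{6} \times 14.09 \;\approx\; 2.8 \times 10^{7}
```

That is tens of millions of inner iterations instead of nearly two billion for the square-root version.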

Answer 1

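I expect the following to work a lot faster than the OP's algorithm (optional optimisations are given in parentheses):

(The types of values and frequencies should be 32-bit unsigned, and counts 64-bit; promote before computing a count if your language does not do so.)
Read the number of values, N.
Read each value v, adding 1 to its frequency freq[v]; there is no need to store the value itself.
- (freq[MAX] (or MAX + 1) can be statically allocated, which probably gives the best initialisation to all 0.)
Compute the number of pairs that include 1 from freq[1] and the number of values.
For each i in 2..MAX (with freq[i] > 0):
- Compute the number of pairs (i, i) from freq[i].
- For each multiple m of i in 2i..MAX:
  - (Use m as the loop counter and increment it by i, rather than multiplying.)
  - Compute the number of pairs (i, m) from freq[i] and freq[m].
- (If freq[i] = 1, one can omit the (i, i) calculation and run a variant of the loop optimised for freq[i] = 1.)
(One can run the previous (outer) loop over 2..MAX/2, and then over MAX/2 + 1..MAX omitting the processing of multiples.)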

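The number of unordered pairs (i, i) is C(freq[i], 2) = (freq[i] * (freq[i] - 1)) / 2; since the exercise counts ordered pairs, each of them is counted twice, giving freq[i] * (freq[i] - 1).
For i ≠ j, the number of pairs (i, j) = freq[i] * freq[j].

This avoids sorting, sqrt and division.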
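
The steps above can be sketched in C++ roughly as follows. This is a minimal sketch, not the answer author's code: the names countDivisorPairs and maxValue are my own, the value 1 is handled by the general loop instead of being special-cased, and equal values are counted as ordered pairs, f * (f - 1), because the sample answer of 6 treats (1, 4) and (4, 1) as different pairs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Counts ordered pairs (i, j), i != j, such that values[i] divides values[j].
// maxValue is an assumed upper bound on the input values (2,000,000 in the exercise).
std::uint64_t countDivisorPairs(const std::vector<int>& values, int maxValue) {
    std::vector<std::uint32_t> freq(maxValue + 1, 0);  // freq[v] = occurrences of v
    for (int v : values) {
        assert(v >= 1 && v <= maxValue);
        ++freq[v];
    }

    std::uint64_t pairs = 0;
    for (int i = 1; i <= maxValue; ++i) {
        if (freq[i] == 0) continue;            // value i does not occur
        const std::uint64_t fi = freq[i];      // promote before multiplying
        pairs += fi * (fi - 1);                // ordered pairs of equal values
        for (int m = 2 * i; m <= maxValue; m += i) {
            pairs += fi * freq[m];             // i divides every multiple m
        }
    }
    return pairs;
}
```

For the sample input 2 4 5 2 6 with maxValue = 6 this returns 6. The inner loop only adds i to m, so no sorting, square root or division is involved.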

Other optimisations

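One can store the distinct values and then scan that array (the order does not matter); whether this is a gain or a loss depends on the density of values in 1..MAX.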
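
A possible shape for this variant, as a sketch under the same assumptions as before (the function name countPairsViaDistinct and the parameter maxValue are hypothetical): each value is appended to a list the first time it is seen, and the outer loop then walks that list instead of the whole range 1..MAX.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Same count as the frequency scan, but the outer loop visits only distinct values.
std::uint64_t countPairsViaDistinct(const std::vector<int>& values, int maxValue) {
    std::vector<std::uint32_t> freq(maxValue + 1, 0);
    std::vector<int> distinct;                 // each value once, in input order
    for (int v : values) {
        assert(v >= 1 && v <= maxValue);
        if (freq[v]++ == 0) distinct.push_back(v);  // first occurrence of v
    }

    std::uint64_t pairs = 0;
    for (int i : distinct) {                   // order does not matter
        const std::uint64_t fi = freq[i];
        pairs += fi * (fi - 1);                // ordered pairs of equal values
        for (int m = 2 * i; m <= maxValue; m += i) {
            pairs += fi * freq[m];
        }
    }
    return pairs;
}
```

When few distinct values occur, this skips most of the empty slots; when nearly every value occurs, the extra list mostly costs memory and cache traffic.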

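If the maximum frequency is < 2¹⁶, which sounds very likely, all products fit in 32 bits. One could exploit this by writing the function as a template on the numeric type, tracking the maximum frequency, and then selecting the appropriate template instantiation for the rest. This costs N * (compare + branch), and may gain by performing the D² multiplications in 32 bits rather than 64, where D is the number of distinct values. I see no easy way to deduce that 32 bits suffice for the total, other than N < 2¹⁶.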
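
One way this could look, as a hedged sketch: sumPairs and countPairsAdaptive are names I made up, and the running total stays 64-bit in both instantiations because only the individual products are known to fit in 32 bits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Product is the type used for each freq[i] * freq[m] multiplication.
template <typename Product>
std::uint64_t sumPairs(const std::vector<std::uint32_t>& freq) {
    const std::size_t maxValue = freq.size() - 1;
    std::uint64_t pairs = 0;                   // the total always needs 64 bits
    for (std::size_t i = 1; i <= maxValue; ++i) {
        if (freq[i] == 0) continue;
        const Product fi = freq[i];
        pairs += static_cast<Product>(fi * (fi - 1));
        for (std::size_t m = 2 * i; m <= maxValue; m += i) {
            pairs += static_cast<Product>(fi * static_cast<Product>(freq[m]));
        }
    }
    return pairs;
}

std::uint64_t countPairsAdaptive(const std::vector<int>& values, int maxValue) {
    std::vector<std::uint32_t> freq(maxValue + 1, 0);
    std::uint32_t maxFreq = 0;                 // tracked while reading the input
    for (int v : values) {
        assert(v >= 1 && v <= maxValue);
        maxFreq = std::max(maxFreq, ++freq[v]);
    }
    // Below 2^16, every product of two frequencies is below 2^32.
    if (maxFreq < (1u << 16)) return sumPairs<std::uint32_t>(freq);
    return sumPairs<std::uint64_t>(freq);
}
```

Whether the 32-bit instantiation is measurably faster depends on the compiler and target; on 64-bit hardware the difference may be small.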

If parallelising over n processors, one could let different processors handle different residues modulo n.
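
The partitioning could be sketched like this. I am assuming the residues refer to the outer loop variable i; partialCount and countPairsPartitioned are hypothetical names, and the workers are run one after another here to keep the example free of threading details. Each partialCount call only reads freq, so the calls could run on separate threads and their results be added at the end.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Handles only the values i with i % workers == worker.
std::uint64_t partialCount(const std::vector<std::uint32_t>& freq, int worker, int workers) {
    assert(workers >= 1 && worker >= 0 && worker < workers);
    const int maxValue = static_cast<int>(freq.size()) - 1;
    std::uint64_t pairs = 0;
    // Residue 0 starts at `workers`, because values start at 1.
    for (int i = (worker == 0 ? workers : worker); i <= maxValue; i += workers) {
        if (freq[i] == 0) continue;
        const std::uint64_t fi = freq[i];
        pairs += fi * (fi - 1);                // ordered pairs of equal values
        for (int m = 2 * i; m <= maxValue; m += i) {
            pairs += fi * freq[m];
        }
    }
    return pairs;
}

std::uint64_t countPairsPartitioned(const std::vector<int>& values, int maxValue, int workers) {
    std::vector<std::uint32_t> freq(maxValue + 1, 0);
    for (int v : values) {
        assert(v >= 1 && v <= maxValue);
        ++freq[v];
    }
    std::uint64_t total = 0;
    for (int w = 0; w < workers; ++w) {
        total += partialCount(freq, w, workers);  // independent of the other workers
    }
    return total;
}
```

Splitting by residue rather than by contiguous ranges spreads the small values of i, which have the most multiples, evenly across the workers.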

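I considered keeping track of the number of even values, to avoid scanning half of the frequencies, but I think that would offer little advantage for most datasets within the given parameters.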

Answer 2

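OK, I am not going to write your whole algorithm for you, but it can definitely be done faster. So I guess this is what you need to get going:

So you have your list sorted, and there are a lot of assumptions you can make from that. Take for instance the highest value. It won't have any multiples. The highest value that can have a multiple is at most the highest value divided by 2.

There is one other very useful fact here. A multiple of a multiple is also a multiple. (Still following? ;)) Take for instance the list [2 4 12]. Now you have found (4, 12) as a multiple pair. If you now also find (2, 4), then you can deduce that 12 is also a multiple of 2. And since you only have to count the pairs, you can just keep a count for each number of how many multiples it has, and add that count when you see that number as a multiple itself. This means that it is probably best to iterate over your sorted list backwards and look for divisors instead.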

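And possibly store it in some way like [(three 2's), (two 5's), ...], i.e. store how often a number occurs. Once again, you don't have to keep track of its identity, since you only need to give the total number of pairs. Storing your list this way helps you, because all the 2's are going to have the same multiples. So calculate once and then multiply.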
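
A small sketch of the "calculate once and then multiply" idea, assuming the grouped list is kept in a std::map from value to count (my choice of container; the function name countPairsGrouped is also hypothetical). It looks up multiples of each value rather than iterating backwards for divisors, so it illustrates only the grouping, not the full approach described above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

std::uint64_t countPairsGrouped(const std::vector<int>& values) {
    std::map<int, std::uint64_t> groups;       // value -> number of occurrences, sorted by value
    for (int v : values) {
        assert(v >= 1);
        ++groups[v];
    }
    if (groups.empty()) return 0;

    const int largest = groups.rbegin()->first;
    std::uint64_t pairs = 0;
    for (const auto& group : groups) {
        const int value = group.first;
        const std::uint64_t count = group.second;
        pairs += count * (count - 1);          // ordered pairs of equal values
        // All copies of `value` share the same multiples: look each one up once.
        for (long long m = 2LL * value; m <= largest; m += value) {
            const auto it = groups.find(static_cast<int>(m));
            if (it != groups.end()) pairs += count * it->second;
        }
    }
    return pairs;
}
```

Each map lookup costs a logarithmic factor, so a flat frequency array is likely faster when the value range is as small as in this exercise.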
