Efficient algorithm to identify duplicates in an array of strings in C++

Date: 2014-12-07 13:14:32

Tags: c++ algorithm

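I have a list/array of IP addresses stored as strings. I need to determine whether this array contains any duplicates and log an error if it does. The array has about 20 elements. What is an efficient way to identify the duplicates?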

3 answers:

Answer 0 (score: 2)

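  1. Sort the original array.
  2. Iterate over the sorted array and count the distinct values.
  3. Create a new array whose size is the count from step 2.
  4. Copy the values from the original array to the new array, skipping duplicates.

Pseudocode in bash: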

    [user@linux ~]$ cat 1.txt
    1
    2
    3
    66
    1
    1
    66
    3
    7
    7
    7
    7
    26
    
    [user@linux ~]$ cat 1.txt | sort | uniq
    
    1
    2
    26
    3
    66
    7
    [user@linux ~]$ cat 1.txt | sort | uniq | wc -l
           7
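
The four steps above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the function name `sorted_unique` is hypothetical, and the strings are compared lexicographically, just as `sort` does in the shell session.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): returns a sorted copy of
// `input` with duplicates removed, mirroring the four listed steps.
std::vector<std::string> sorted_unique(std::vector<std::string> input)
{
    // Step 1: sort. `input` is a by-value copy, so the caller's data is untouched.
    std::sort(input.begin(), input.end());

    // Step 2: count distinct values. In sorted order, a value is new
    // whenever it differs from its predecessor.
    std::size_t distinct = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (i == 0 || input[i] != input[i - 1]) {
            ++distinct;
        }
    }

    // Step 3: create the new array with room for the count from step 2.
    std::vector<std::string> result;
    result.reserve(distinct);

    // Step 4: copy the values across, skipping duplicates.
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (i == 0 || input[i] != input[i - 1]) {
            result.push_back(input[i]);
        }
    }
    return result;
}
```

If the returned vector is shorter than the input, the input contained at least one duplicate, which is all the question needs in order to log an error.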
    

Answer 1 (score: 2)

You can use a map<string, int> to mark the addresses already seen, along with the index at which each address first appeared:

#include <iostream>
#include <map>
#include <string>
#include <vector>

void check_dups(const std::vector<std::string>& addresses) {
    std::map<std::string, int> seen;
    for (int i=0,n=addresses.size(); i<n; i++) {
        std::map<std::string, int>::iterator it = seen.find(addresses[i]);
        if (it == seen.end()) {
            // Never used before, mark the position
            seen[addresses[i]] = i;
        } else {
            // Duplicated value, emit a warning
            std::cout << "Duplicate address at index " << i <<
                         " (present already at index " << it->second << ")\n";
        }
    }
}
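
If the caller wants to decide how to log instead of printing from inside the loop, the same idea can return the duplicates it finds. The sketch below is a variation, not the answer's code: the name `find_dups` is hypothetical, and it swaps `std::map` for `std::unordered_map` (C++11), which is an assumption about what is acceptable here. For about 20 elements either container should be more than fast enough.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: returns one (duplicate index, first index) pair
// for every element that repeats an earlier address.
std::vector<std::pair<std::size_t, std::size_t>>
find_dups(const std::vector<std::string>& addresses)
{
    std::unordered_map<std::string, std::size_t> seen;
    std::vector<std::pair<std::size_t, std::size_t>> dups;
    for (std::size_t i = 0; i < addresses.size(); ++i) {
        // emplace inserts only when the key is absent; `.second` is false
        // when the address was already present.
        auto result = seen.emplace(addresses[i], i);
        if (!result.second) {
            dups.emplace_back(i, result.first->second);
        }
    }
    return dups;
}
```

An empty return value means the array has no duplicates; otherwise each pair gives the caller enough information to write a log line like the one in `check_dups`.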

Answer 2 (score: 0)

Here are 3 reasonably efficient approaches, off the top of my head:

#include <iostream>
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>
#include <set>

// returns a sorted, de-duplicated copy
std::vector<std::string> de_duplicated(std::vector<std::string> vec)
{
    std::set<std::string> interim { vec.begin(), vec.end() };
    vec.assign(interim.begin(), interim.end());
    return vec;
}

// sorts and de-duplicates in place
void de_duplicate(std::vector<std::string>& vec)
{
    std::sort(std::begin(vec), std::end(vec));

    auto current = std::begin(vec);

    do {
        auto last = std::end(vec);
        current = std::adjacent_find(current, last);
        if (current != last) {
            auto last_same = std::find_if_not(std::next(current),
                                              last,
                                              [&current](const std::string& s) {
                                                  return s == *current;
                                              });
            current = vec.erase(std::next(current), last_same);
        }
    } while(current != std::end(vec));

}

// returns a de-duplicated copy, preserving order
std::vector<std::string> de_duplicated_stable(const std::vector<std::string>& vec)
{
    std::set<std::string> index;
    std::vector<std::string> result;
    for (const auto& s : vec) {
        if (index.insert(s).second) {
            result.push_back(s);
        }
    }

    return result;
}

using namespace std;


int main() {

    std::vector<std::string> addresses { "d", "a", "c", "d", "c", "a", "c", "d" };

    cout << "before" << endl;
    std::copy(begin(addresses), end(addresses), ostream_iterator<string>(cout, ", "));
    cout << endl;

    auto deduplicated = de_duplicated(addresses);
    cout << endl << "sorted, de-duplicated copy" << endl;
    std::copy(begin(deduplicated), end(deduplicated), ostream_iterator<string>(cout, ", "));
    cout << endl;

    deduplicated = de_duplicated_stable(addresses);
    cout << endl << "de-duplicated copy, preserving order" << endl;
    std::copy(begin(deduplicated), end(deduplicated), ostream_iterator<string>(cout, ", "));
    cout << endl;

    de_duplicate(addresses);
    cout << endl << "sorted, de-duplicated in-place" << endl;
    std::copy(begin(addresses), end(addresses), ostream_iterator<string>(cout, ", "));
    cout << endl;

    return 0;
}
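
As a possible simplification of the in-place version, the standard library's `std::unique` combined with `erase` gives the same sorted, de-duplicated result as `de_duplicate` above. This is a sketch rather than part of the answer, and the name `de_duplicate_unique` is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical alternative to de_duplicate: sorts and de-duplicates in place.
void de_duplicate_unique(std::vector<std::string>& vec)
{
    // Sorting places equal strings next to each other.
    std::sort(vec.begin(), vec.end());

    // std::unique keeps the first element of every run of equal values,
    // shifts the kept elements to the front, and returns the new logical end.
    // erase then drops the leftover tail.
    vec.erase(std::unique(vec.begin(), vec.end()), vec.end());
}
```

Comparing the vector's size before and after the call also tells you whether any duplicates were present.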