Is it necessary to convert string to double when sorting a numerical array in C++?

Asked: 2016-05-14 09:20:36

Tags: c++ sorting

I want to sort an array in C++ whose elements are of type string but represent numbers (for other reasons I cannot store them as doubles), like this:

vector<string> a;

//assign a with some values, e.g. a=["5.1" "3.5" "1.4" "0.2"]

sort(a.begin(),a.end());

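So my question is: do I need to convert each element in the vector from string to double before calling sort? How does C++ sort an array like this, and how accurate is the result? Thanks!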

4 Answers:

Answer 0 (score: 4)

  

Do I need to convert each element in the vector from string to double before calling sort

It depends on the order you want to achieve and on the numbers stored in your array. If every number has only one digit before the decimal point, you will not see a difference. If some numbers have a multi-digit integer part, the result will be incorrect, because strings are ordered alphabetically. For example, "2.0", "9.0", and "10.0" would be sorted as follows:

"10.0", "2.0", "9.0"

As for the "before" in "before calling sort": the conversion does not need to happen before sorting. You can perform it on demand inside a custom comparison function passed to sort.
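The code that originally accompanied this answer is not preserved here. As a minimal sketch of what a custom comparison function could look like, the helper below converts both operands with std::stod inside the comparator. The name sort_as_numbers is hypothetical, and the sketch assumes every element is a valid number; std::stod throws otherwise.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (the name is not from the original answer): sorts the
// strings by the numeric value they represent and leaves the stored strings
// themselves unchanged.
// Assumption: every element can be parsed by std::stod. If one cannot,
// std::stod throws std::invalid_argument or std::out_of_range.
void sort_as_numbers(std::vector<std::string>& a)
{
    std::sort(a.begin(), a.end(),
              [](const std::string& l, const std::string& r) {
                  // Convert on demand, each time two elements are compared.
                  return std::stod(l) < std::stod(r);
              });
}
```

With the vector a from the question, the call would be sort_as_numbers(a); the elements stay strings and only their order changes.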

Demo.

Answer 1 (score: 1)

Try this:

string n1 = "5.2";
string n2 = "10.1";
if (n1 < n2) {
    cout << "n1 is less" << endl;
} else {
    cout << "n2 is less" << endl;
}

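The output is n2 is less, because the character '5' is greater than the character '1'. This is what happens when you compare the strings directly.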

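Answer 2 (score: 1)

C++ compares strings character by character using each character's code (ASCII on most platforms), so the default sort orders them by character code. That does not produce numeric order; you have to convert each string to a numeric type for the sort to be correct.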
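To make the character-code ordering concrete, here is a small sketch that sorts with the default std::string comparison. The helper name sorted_lexicographically is hypothetical and used only for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of the input sorted with the default
// std::string operator<, which compares character by character
// (lexicographically) and knows nothing about numeric value.
std::vector<std::string> sorted_lexicographically(std::vector<std::string> v)
{
    std::sort(v.begin(), v.end());
    return v;
}
```

Given "2.0", "9.0", and "10.0", this puts "10.0" first, because the comparison is decided by the first differing character and '1' has a lower code than '2' and '9'.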

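Answer 3 (score: 1)

Converting a string to a double is actually quite expensive.

The naive approach is to sort with the comparison predicate stod(l) < stod(r).

As the test below shows, when the vector to be sorted is large, it is worth performing the conversion once and sorting the converted values instead.

Here is the optimized algorithm: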

template<class Vector>
void sort_numeric_single_conversion(Vector& vec)
{
    auto first = vec.begin();
    auto last = vec.end();
    auto size = vec.size();

    using element = std::tuple<double, std::size_t>;
    std::vector<element> elements;
    elements.reserve(size);
    for (auto current = first ; current != last ; ++current)
    {
        elements.emplace_back(stod(*current), current - first);
    }
    std::sort(std::begin(elements), std::end(elements), [](auto& l, auto &r) {
        return std::get<double>(l) < std::get<double>(r);
    });

    Vector buffer;
    buffer.reserve(size);
    for(auto& elem : elements)
    {
        auto isource = std::get<std::size_t>(elem);
        buffer.push_back(std::move(vec[isource]));
    }
    vec = std::move(buffer);
}

Here is the test that justifies this approach (compiled with -O3 -march=native on a MacBook Pro):

#include <string>
#include <algorithm>
#include <vector>
#include <utility>
#include <tuple>
#include <chrono>
#include <random>
#include <iterator>
#include <iostream>
#include <sstream>
#include <iomanip>


template<class Vector>
void sort_numeric_single_conversion(Vector& vec)
{
    auto first = vec.begin();
    auto last = vec.end();
    auto size = vec.size();

    using element = std::tuple<double, std::size_t>;
    std::vector<element> elements;
    elements.reserve(size);
    for (auto current = first ; current != last ; ++current)
    {
        elements.emplace_back(stod(*current), current - first);
    }
    std::sort(std::begin(elements), std::end(elements), [](auto& l, auto &r) {
        return std::get<double>(l) < std::get<double>(r);
    });

    Vector buffer;
    buffer.reserve(size);
    for(auto& elem : elements)
    {
        auto isource = std::get<std::size_t>(elem);
        buffer.push_back(std::move(vec[isource]));
    }
    vec = std::move(buffer);
}

template<class Vector>
void sort_numeric_naive(Vector& vec)
{
    std::sort(vec.begin(), vec.end(), [](auto& l, auto& r) {
        return stod(l) < stod(r);
    });
}

std::vector<std::string> build_test_array(std::size_t size)
{
    std::vector<std::string> result;
    std::random_device rd;
    std::default_random_engine eng(rd());
    auto dist = std::uniform_real_distribution<double>(-1000000.0, 1000000.0);
    std::generate_n(std::back_inserter(result), size, [&] {
        return std::to_string(dist(eng));
    });
    return result;
}

template<class F>
auto time(F f)
{
    auto now = std::chrono::high_resolution_clock::now();
    f();
    auto then = std::chrono::high_resolution_clock::now();
    return then - now;
}

template<class Duration>
std::string to_ms(Duration d)
{
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(d);
    std::ostringstream ss;
    ss << std::setw(7) << ms.count() << "ms";
    return ss.str();

}

int main()
{
    for(auto size : { 10, 100, 1000, 10000, 100000, 1000000, 10000000 })
    {
        auto v1 = build_test_array(size);
        auto v2 = v1;

        auto t1 = time([&] { sort_numeric_single_conversion(v1); });
        auto t2 = time([&] { sort_numeric_naive(v2); });

        std::cout << "size: " << std::setw(7) << size << " single: " << to_ms(t1) << ", naive: " << to_ms(t2) << std::endl;
    }

}

Typical results:

size:      10 single:       0ms, naive:       0ms
size:     100 single:       0ms, naive:       0ms
size:    1000 single:       0ms, naive:       1ms
size:   10000 single:       1ms, naive:      17ms
size:  100000 single:      18ms, naive:     235ms
size: 1000000 single:     210ms, naive:    2797ms
size: 10000000 single:    2397ms, naive:   33120ms
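One way to see where the gap in these timings comes from: the single-conversion version calls stod once per element, while the naive comparator calls it twice for every comparison, and a comparison sort performs on the order of n log n comparisons. The sketch below counts those calls for the naive comparator. The helper count_naive_conversions is hypothetical and not part of the original test program.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: sorts a copy of `v` with the naive comparator and
// returns how many std::stod calls were made along the way (two for each
// comparison that std::sort performs).
std::size_t count_naive_conversions(std::vector<std::string> v)
{
    std::size_t conversions = 0;
    std::sort(v.begin(), v.end(),
              [&conversions](const std::string& l, const std::string& r) {
                  conversions += 2;  // one std::stod call per operand
                  return std::stod(l) < std::stod(r);
              });
    return conversions;
}
```

Comparing the returned count with v.size(), which is the number of conversions the single-conversion version needs, gives a rough idea of how much repeated parsing the naive comparator does. The exact count depends on the input and on the standard library's std::sort implementation.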