How do I remove duplicate values from a vector while preserving order?

Asked: 2017-10-11 13:47:27

Tags: c++ string file vector stl

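I have a txt file with 3 columns: "dd/mm/yyyy HH:MM:SS number(000.000)". There are about 368 entries.

I want to select the rows where the value in the 3rd column is unique (its first occurrence). The order matters.

In my code, I read the file into a vector (dtp) and then fill one vector per column (data, time, pressure). Then I remove the duplicate values from the 3rd column, which leaves only the unique pressures.

My question is how to bring columns 1 and 2 along with the correct indices, so that each remaining pressure keeps its own date and time.

Sample data (first 15 values):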

26.07.2017  15:47:38    82.431 
26.07.2017  16:47:46    83.431
26.07.2017  17:47:54    85.431
26.07.2017  18:48:02    84.431
26.07.2017  19:48:09    83.431
26.07.2017  20:48:17    83.431
26.07.2017  21:48:24    84.431
26.07.2017  22:48:32    83.431
26.07.2017  23:48:40    83.431
27.07.2017  00:48:48    84.431
27.07.2017  01:48:55    84.431
27.07.2017  02:49:03    84.431
27.07.2017  03:49:10    84.431
27.07.2017  04:49:19    84.431
27.07.2017  05:49:27    86.431

Code:

#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include <iterator>
#include <sstream>
#include <vector>
#include <cstring>
#include <ctime>

using namespace std;

int main()
{
    const clock_t start = clock(); 
    system("mode con cols=50 lines=1000"); 
    setlocale(LC_ALL, "Russian"); 

    vector<string> dtp; 
    vector<string> data;
    vector<string> time;
    vector<double> pressure;
    double num(0.0); 
    string line, tmp1, tmp2; 
    int len = 368;
    int f, i, j, k;

    ifstream file("data.txt"); 

    while (!getline(file, line).eof()) 
        dtp.push_back(line); 

    for (string &it : dtp) 
    { 
        {
            istringstream isstr(it);
            isstr >> tmp1;
            data.push_back(tmp1);
        }

        {
            istringstream isstr(it);
            isstr >> tmp1 >> tmp2;
            time.push_back(tmp2);
        }

        {
            istringstream isstr(it);
            isstr >> tmp1 >> tmp2 >> num;
            pressure.push_back(num);
        }

    }

    f = 0;
    for (i = 0; i < len; i++) 
    {
        for (j = i + 1; j < len; j++) 
        {
            if (pressure[i] == pressure[j]) 
            {
                for (k = j; k < (len - 1); k++)
                    pressure[k] = pressure[k + 1];

                len--;
                j--;
                f = 1;
            }
        }
    }

    if (f == 1)
    {
        for (i = 0; i < len; i++)
            cout << pressure[i] << endl;
    }

    const double vremya = static_cast<double>(clock() - start) / CLOCKS_PER_SEC; 
    cout << "Time is: " << vremya << " seconds" << endl; 
    system("pause");
    return 0;
}

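2 Answers:

Answer 0 (score: 1)

I think you would do best to treat this as a table with two columns:

Timestamp Pressure

and work with that. Working with timestamps is easier with a date/time library that can parse, format and order time stamps.

Here is what it could look like. A detailed explanation follows the code: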

#include "date/date.h"
#include <algorithm>
#include <iostream>
#include <sstream>
#include <utility>
#include <vector>

std::istringstream file
{
    "26.07.2017  15:47:38    82.431\n"
    "26.07.2017  16:47:46    83.431\n"
    "26.07.2017  17:47:54    85.431\n"
    "26.07.2017  18:48:02    84.431\n"
    "26.07.2017  19:48:09    83.431\n"
    "26.07.2017  20:48:17    83.431\n"
    "26.07.2017  21:48:24    84.431\n"
    "26.07.2017  22:48:32    83.431\n"
    "26.07.2017  23:48:40    83.431\n"
    "27.07.2017  00:48:48    84.431\n"
    "27.07.2017  01:48:55    84.431\n"
    "27.07.2017  02:49:03    84.431\n"
    "27.07.2017  03:49:10    84.431\n"
    "27.07.2017  04:49:19    84.431\n"
    "27.07.2017  05:49:27    86.431\n"
};

int
main()
{
    using record = std::pair<date::sys_seconds, double>;
    std::vector<record> records;
    while (file)
    {
        record r;
        file >> date::parse(" %d.%m.%Y %T", r.first) >> r.second;
        if (file.fail())
            break;
        records.push_back(std::move(r));
    }
    std::sort(records.begin(), records.end(), [](const auto& x, const auto& y)
                                                  {return x.first < y.first;});
    std::stable_sort(records.begin(), records.end(),
                     [](const auto& x, const auto& y)
                         {return x.second < y.second;});
    records.erase(std::unique(records.begin(), records.end(),
                              [](const auto& x, const auto& y)
                                  {return x.second == y.second;}),
                  records.end());
    std::sort(records.begin(), records.end(), [](const auto& x, const auto& y)
                                                    {return x.first < y.first;});
    for (const auto& r : records)
        std::cout << date::format("%d.%m.%Y %T ", r.first) << r.second << '\n';
}

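For ease of demonstration I have put your data.txt into an istringstream. Don't let that detail trip you up: main works just as well with an ifstream as with an istringstream.

I'm reusing std::pair as my record, but you could write your own record struct if you prefer. Either way, you want to collect a vector<record> from your database. This is what the while loop does. This loop uses Howard Hinnant's free, open-source date/time library to parse the timestamp, but there are several other solutions you could use as well.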

Once you have populated records from your database, there are three std:: algorithms that will do the job for you (in 4 steps):

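  1. Sort records by timestamp.

  2. Stable sort records by pressure. For equal pressures, this preserves the sorted order by timestamp.

  3. Unique the list by pressure. This algorithm compacts the first record of each run of equal pressures toward the front of the list and returns an iterator to the "new end" of the list. You then need to erase everything in [new_end, old_end).

  4. Sort by timestamp again, if you want to view the list in chronological order.

You're done! Just print it out. The output matches the prefix of your desired output.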

Answer 1 (score: 0)

Your insertion order seems to correspond to date/time order. If that is the case, you can do the following:

struct Record
{
    std::chrono::system_clock::time_point time_point;
    double pressure;
};

std::vector<Record> records = /**/;

// Group equal pressures; stable_sort keeps the earliest record first in each group.
const auto lessPressure = [](const auto& lhs, const auto& rhs){
    return lhs.pressure < rhs.pressure;
};
std::stable_sort(records.begin(), records.end(), lessPressure);
// Keep only the first record of each group.
const auto equalPressure = [](const auto& lhs, const auto& rhs){
    return lhs.pressure == rhs.pressure;
};
records.erase(std::unique(records.begin(), records.end(), equalPressure), records.end());
// Restore chronological order.
const auto lessTimePoint = [](const auto& lhs, const auto& rhs){
    return lhs.time_point < rhs.time_point;
};
std::sort(records.begin(), records.end(), lessTimePoint);

Otherwise, you have to propagate the changes your code makes to the pressure vector to the other vectors as well.

So change:

for (k = j; k < (len - 1); k++)
    pressure[k] = pressure[k + 1];

to:

for (k = j; k < (len - 1); k++) {
    pressure[k] = pressure[k + 1];
    data[k] = data[k + 1];
    time[k] = time[k + 1];
}

or even:

pressure.erase(pressure.begin() + j);
data.erase(data.begin() + j);
time.erase(time.begin() + j);