Parse an address "street; city; state; country" using a delimiter and store each field in a separate variable

Time: 2016-11-14 05:42:56

Tags: c++ parsing

I'm having trouble storing information after parsing a text file. The text file contains something like this:

1234 Main St; Oakland; CA; USA
2134 1st St; San Fransico; CA; USA
etc. etc.

I currently have these variables, which I'm going to use to store the address information:

vector <string> addressInfo;
vector <string> street;
vector <string> city;
vector <string> state;
vector <string> country;

My program can also currently strip the ";" from the file by using getline and storing all of the information in a single vector:

while(read == true)
{
    getline(in, line, ';');
    if (in.fail())
    {
        read = false;
    }
    else
    {
        addressInfo.push_back(line);
    }
}

When I run a for loop to print the contents of the addressInfo vector, I get:

1234 Main St
Oakland
CA
USA
etc. etc.

I know I will probably have to use stringstream, but I don't know how to store each line in the vector into a different variable.

3 answers:

Answer 0 (score: 0)

I don't think you should store the data like this:

vector <string> addressInfo;
vector <string> street;
vector <string> city;
vector <string> state;
vector <string> country;

I think it should look like this instead:

#include <sstream>
#include <string>
#include <vector>

struct address_info {
  std::string street;
  std::string city;
  std::string state;
  std::string country;
  address_info() {}

  // from C++11, I prefer below style
  //address_info() = default;
  address_info(std::string street_, std::string city_, std::string state_, std::string country_)
    : street(street_), city(city_), state(state_), country(country_)
  {}
};
int main()
{
   std::vector<address_info> list;
   // Let's assume that you know how to get this
   std::string line = "1234 Main St; Oakland; CA; USA";
   std::string street;
   std::string city;
   std::string state;
   std::string country;
   std::istringstream iss(line);
   // remember to trim the string, I don't put it here
   getline(iss, street, ';');
   getline(iss, city, ';');
   getline(iss, state, ';');
   getline(iss, country, ';');

   // This is the C++11 code to add to vector
   //list.emplace_back(street, city, state, country);

   // Pre-C++11 style
   list.push_back(address_info(street, city, state, country));
}

In any case, you could also look for a CSV library.
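The comment in the code above says to trim the fields but leaves that step out. Here is a minimal sketch of what it could look like. The names `trim` and `parse_address` are made up for this illustration and are not part of the answer, and the struct is repeated only so the snippet compiles on its own.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not from the answer): removes leading and trailing
// spaces, tabs, and line-ending characters from a field.
std::string trim(const std::string& s)
{
    const char* whitespace = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) return "";  // empty or all whitespace
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

// Same fields as the struct in the answer, repeated to keep this self-contained.
struct address_info {
    std::string street, city, state, country;
};

// Hypothetical helper: splits one line on ';' and trims each field.
// Returns false (leaving `out` untouched) if four fields cannot be read.
bool parse_address(const std::string& line, address_info& out)
{
    std::istringstream iss(line);
    std::string street, city, state, country;
    if (!std::getline(iss, street, ';') ||
        !std::getline(iss, city, ';') ||
        !std::getline(iss, state, ';') ||
        !std::getline(iss, country, ';'))
        return false;

    out.street = trim(street);
    out.city = trim(city);
    out.state = trim(state);
    out.country = trim(country);
    return true;
}
```

Without the trimming, every field after the first would keep the space that follows each ";" in the input.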

Answer 1 (score: 0)

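Here is a C++14 version that uses a tokenizing algorithm (written much in the style of the STL). It is C++14 only because I use a generic lambda, but it can easily be made C++11-compatible.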

#include <algorithm>
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iterator>

template <typename Iter, typename Executor>
void for_each_token(Iter first, Iter last, 
                    Iter dfirst, Iter dlast,
                    Executor ex)
{
  if (first == last) return;
  auto tmp = first;
  while (first != last) {
    first = std::find_first_of(first, last, dfirst, dlast);
    ex(tmp, first);
    if (first == last) break;
    first++;
    tmp = first;
  }
  return;
}

template <typename Executor>
void for_each_token_str(const std::string& str, const std::string& delims, Executor ex)
{
  for_each_token(std::begin(str), std::end(str), std::begin(delims), std::end(delims), ex);
}

int main() {
  std::ifstream in("parse.txt");
  if (not in) return 1;

  std::string line;
  std::vector<std::string> tokens;

  std::vector <std::string> addressInfo;
  std::vector <std::string> city;
  std::vector <std::string> state;
  std::vector <std::string> country;

  while (std::getline(in, line)) {
    for_each_token_str(line, ";", [&](auto f, auto l) {
          tokens.emplace_back(f, l);
        });

    int idx = 0;
    addressInfo.emplace_back(tokens[idx++]);
    city.emplace_back(tokens[idx++]);
    state.emplace_back(tokens[idx++]);
    country.emplace_back(tokens[idx++]);

    tokens.clear();
  }

  auto print = [](std::vector<std::string>& v) {
    for (auto & e : v) std::cout << e << ' ';
    std::cout << std::endl;
  };

  print(addressInfo);
  print(city);
  print(state);
  print(country);

  return 0;
}

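I assume you are using one vector per field because you are following the SoA (structure of arrays) principle. If not, I would rather group the fields in a struct.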

Note: I have skipped some error checks, which you should not do.
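As one example of the skipped checks: the loop above indexes `tokens[0]` to `tokens[3]` without looking at `tokens.size()`, so a short line would read out of bounds. A rough sketch of a guarded version, keeping the one-vector-per-field layout, might look like this. `split_fields` and `address_columns` are hypothetical names for this illustration, and the splitter uses plain `std::find` rather than the `for_each_token` template.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` on `delim`, keeping empty fields.
std::vector<std::string> split_fields(const std::string& line, char delim)
{
    std::vector<std::string> fields;
    auto start = line.begin();
    while (true) {
        auto end = std::find(start, line.end(), delim);
        fields.emplace_back(start, end);
        if (end == line.end()) break;
        start = end + 1;  // step past the delimiter
    }
    return fields;
}

// Structure-of-arrays storage: one vector per field (hypothetical type).
struct address_columns {
    std::vector<std::string> street, city, state, country;

    // Appends one record. Returns false and stores nothing if the line
    // does not contain exactly four fields.
    bool add_line(const std::string& line)
    {
        const std::vector<std::string> f = split_fields(line, ';');
        if (f.size() != 4) return false;
        street.push_back(f[0]);
        city.push_back(f[1]);
        state.push_back(f[2]);
        country.push_back(f[3]);
        return true;
    }
};
```

Checking the count before pushing keeps the four vectors the same length, so index `i` always refers to the same record in each of them. The fields are stored as split, so the space after each ";" is still there.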

Answer 2 (score: 0)

push_back the names/strings into the corresponding vectors. A newline is the default delimiter for getline.

string street_name;
string city_name;
string state_name;
string country_name;

while(getline(cin, street_name, ';') && getline(cin, city_name, ';') && 
      getline(cin, state_name, ';') && getline(cin, country_name))
{
    street.push_back(street_name);
    city.push_back(city_name);
    state.push_back(state_name);
    country.push_back(country_name);    
}