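Using stringstream instead of `sscanf` to parse a fixed-format string

Time: 2010-12-28 19:00:44

Tags: c++

I would like to use the facilities provided by stringstream to extract values from a fixed-format string, as a type-safe alternative to sscanf. How can I do this?

Consider the following specific use case. I have a std::string in the following fixed format: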

YYYYMMDDHHMMSSmmm

Where:

YYYY = 4 digits representing the year
MM = 2 digits representing the month ('0' padded to 2 characters)
DD = 2 digits representing the day ('0' padded to 2 characters)
HH = 2 digits representing the hour ('0' padded to 2 characters)
MM = 2 digits representing the minute ('0' padded to 2 characters)
SS = 2 digits representing the second ('0' padded to 2 characters)
mmm = 3 digits representing the milliseconds ('0' padded to 3 characters)

Previously I was doing something along these lines:

string s = "20101220110651184";
unsigned year = 0, month = 0, day = 0, hour = 0, minute = 0, second = 0, milli = 0;    
sscanf(s.c_str(), "%4u%2u%2u%2u%2u%2u%3u", &year, &month, &day, &hour, &minute, &second, &milli );

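The width values are magic numbers, but that's OK. I'd like to use streams to extract these values and convert them to unsigned values in the interest of type safety. But when I try this: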

stringstream ss;
ss << "20101220110651184";
ss >> setw(4) >> year;

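year retains the value 0. It should be 2010.

How can I do what I'm trying to do? I can't use Boost or any other third-party library, nor can I use C++0x.

6 answers:

Answer 0: (score: 7)

One not particularly efficient option is to construct some temporary strings and use a lexical cast: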

std::string s("20101220110651184");
int year = lexical_cast<int>(s.substr(0, 4));
// etc.

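lexical_cast can be implemented in just a few lines of code; Herb Sutter presented the bare minimum in his article, "The String Formatters of Manor Farm."

This isn't exactly what you're looking for, but it is a type-safe way to extract fixed-width fields from a string.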
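To make the idea concrete, here is a minimal sketch of what such a hand-rolled lexical_cast might look like, applied to the timestamp format from the question. It is not Boost's implementation and not the code from Sutter's article; the `Timestamp` struct and `parse_timestamp` function are hypothetical names introduced only for illustration, and the code sticks to C++03 features.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical minimal stand-in for a lexical cast (illustration only).
template <typename Target>
Target lexical_cast(const std::string& text)
{
    std::istringstream in(text);
    Target value;
    char extra;
    // Reject input that cannot be converted or that has characters left over.
    if (!(in >> value) || in.get(extra))
        throw std::invalid_argument("lexical_cast: bad input '" + text + "'");
    return value;
}

// Hypothetical holder for the fields of a YYYYMMDDHHMMSSmmm string.
struct Timestamp {
    unsigned year, month, day, hour, minute, second, milli;
};

Timestamp parse_timestamp(const std::string& s)
{
    if (s.size() != 17)
        throw std::invalid_argument("parse_timestamp: expected 17 characters");
    Timestamp t;
    t.year   = lexical_cast<unsigned>(s.substr(0, 4));
    t.month  = lexical_cast<unsigned>(s.substr(4, 2));
    t.day    = lexical_cast<unsigned>(s.substr(6, 2));
    t.hour   = lexical_cast<unsigned>(s.substr(8, 2));
    t.minute = lexical_cast<unsigned>(s.substr(10, 2));
    t.second = lexical_cast<unsigned>(s.substr(12, 2));
    t.milli  = lexical_cast<unsigned>(s.substr(14, 3));
    return t;
}

// Small usage check with the sample string from the question.
void demo_parse_timestamp()
{
    Timestamp t = parse_timestamp("20101220110651184");
    assert(t.year == 2010 && t.milli == 184);
}
```

Each substr call creates a temporary string, which is the inefficiency mentioned above; in exchange, a malformed field raises an exception instead of silently leaving a variable unchanged.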

Answer 1: (score: 4)

I use the following; it might be useful to you:

template<typename T> T stringTo( const std::string& s )
   {
      std::istringstream iss(s);
      T x;
      iss >> x;
      return x;
   }

template<typename T> inline std::string toString( const T& x )
   {
      std::ostringstream o;
      o << x;
      return o.str();
   }

These templates require:

#include <sstream>

Usage

long date;
date = stringTo<long>( "20101220" );

YMMV

Answer 2: (score: 4)

Erm, if it's a fixed format, why not do this?

  std::string sd("20101220110651184");
  // insert spaces from the back
  sd.insert(14, 1, ' ');
  sd.insert(12, 1, ' ');
  sd.insert(10, 1, ' ');
  sd.insert(8, 1, ' ');
  sd.insert(6, 1, ' ');
  sd.insert(4, 1, ' ');
  int year, month, day, hour, min, sec, ms;
  std::istringstream str(sd);
  str >> year >> month >> day >> hour >> min >> sec >> ms;

Answer 3: (score: 1)

From here, you might find this useful:

template<typename T, typename charT, typename traits>
std::basic_istream<charT, traits>&
  fixedread(std::basic_istream<charT, traits>& in, T& x)
{
  if (in.width(  ) == 0)
    // Not fixed size, so read normally.
    in >> x;
  else {
    std::basic_string<charT, traits> field;
    in >> field;
    std::basic_istringstream<charT, traits> stream(field);
    if (! (stream >> x))
      in.setstate(std::ios_base::failbit);
  }
  return in;
}

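setw() only applies to reading strings and C strings. The function above uses this fact: it reads into a string and then converts it to the required type. You can use it in combination with setw() or ss.width(w) to read a fixed-width field of any type.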
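The difference between string and numeric extraction can be seen in a small sketch. The helper names `read_field` and `setw_ignored_for_numbers` are hypothetical and are not part of the code above; the overflow in the second function assumes a platform where unsigned is 32 bits wide.

```cpp
#include <cassert>
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: extracts exactly `width` characters as text,
// then converts that text to T. Returns false and sets failbit on error.
template <typename T>
bool read_field(std::istream& in, int width, T& out)
{
    std::string field;
    in >> std::setw(width) >> field;  // setw caps a string extraction at `width` chars
    if (!in || field.size() != static_cast<std::string::size_type>(width)) {
        in.setstate(std::ios_base::failbit);
        return false;
    }
    std::istringstream conv(field);
    T value;
    char extra;
    // The field must convert completely, with nothing left over.
    if (!(conv >> value) || conv.get(extra)) {
        in.setstate(std::ios_base::failbit);
        return false;
    }
    out = value;
    return true;
}

// Returns true when `>> setw(4) >> unsigned` fails on the question's input.
bool setw_ignored_for_numbers()
{
    std::istringstream ss("20101220110651184");
    unsigned year = 0;
    ss >> std::setw(4) >> year;  // the width is ignored: all 17 digits are consumed
    return ss.fail();            // the value does not fit, so failbit is set
}

// Small usage check with the sample string from the question.
void demo_read_field()
{
    std::istringstream ss("20101220110651184");
    unsigned year = 0, month = 0, day = 0;
    assert(read_field(ss, 4, year) && read_field(ss, 2, month) && read_field(ss, 2, day));
    assert(year == 2010 && month == 12 && day == 20);
}
```

Because the string extraction stops at whitespace, this sketch, like the function above, does not handle fields that contain blanks.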

Answer 4: (score: 0)

template<typename T>
struct FixedRead {
    T& content;
    int size;
    FixedRead(T& content, int size) :
            content(content), size(size) {
        assert(size != 0);
    }
    template<typename charT, typename traits>
    friend std::basic_istream<charT, traits>&
    operator >>(std::basic_istream<charT, traits>& in, FixedRead<T> x) {
        int orig_w = in.width();
        std::basic_string<charT, traits> o;
        in >> setw(x.size) >> o;
        std::basic_stringstream<charT, traits> os(o);
        if (!(os >> x.content))
            in.setstate(std::ios_base::failbit);
        in.width(orig_w);
        return in;
    }
};

template<typename T>
FixedRead<T> fixed_read(T& content, int size) {
    return FixedRead<T>(content, size);
}

void test4() {
    stringstream ss("20101220110651184");
    int year = 0, month = 0, day = 0, hour = 0, min = 0, sec = 0, ms = 0;
    ss >> fixed_read(year, 4) >> fixed_read(month, 2) >> fixed_read(day, 2)
            >> fixed_read(hour, 2) >> fixed_read(min, 2) >> fixed_read(sec, 2)
            >> fixed_read(ms, 3);
    cout << "year:" << year << "," << "month:" << month << "," << "day:" << day
            << "," << "hour:" << hour << "," << "min:" << min << "," << "sec:"
            << sec << "," << "ms:" << ms << endl;
}

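Answer 5: (score: 0)

The solution of ps5mh is really nice, but it does not work for fixed-size parsing of strings that contain whitespace. The following solution fixes this: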

template<typename T, typename T2>
struct FixedRead
{
    T& content;
    T2& number;
    int size;
    FixedRead(T& content, int size, T2 & number) :
        content(content), number(number), size(size)
    {
        assert (size != 0);
    }
    template<typename charT, typename traits>
    friend std::basic_istream<charT, traits>&
    operator >>(std::basic_istream<charT, traits>& in, FixedRead<T,T2> x)
    {
        if (!in.eof() && in.good())
        {
            std::vector<charT> buffer(x.size+1);
            in.read(&buffer[0], x.size);
            int num_read = in.gcount();
            buffer[num_read] = 0; // set null-termination of string
            std::basic_stringstream<charT, traits> os(&buffer[0]);
            if (!(os >> x.content))
                in.setstate(std::ios_base::failbit);
            else
                ++x.number;
        }
        return in;
    }
};
template<typename T, typename T2>
FixedRead<T,T2> fixedread(T& content, int size, T2 & number) {
    return FixedRead<T,T2>(content, size, number);
}

This can be used as:

std::string s  = "90007127       19000715790007397";
std::vector<int> ints(5);
int num_read = 0;
std::istringstream in(s);
in >> fixedread(ints[0], 8, num_read) 
   >> fixedread(ints[1], 8, num_read) 
   >> fixedread(ints[2], 8, num_read) 
   >> fixedread(ints[3], 8, num_read) 
   >> fixedread(ints[4], 8, num_read);
// output: 
//   num_read = 4 (like return value of sscanf)
//   ints = 90007127, 1, 90007157, 90007397
//   ints[4] is never assigned (it keeps the 0 from the vector's initialization)