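I am trying to implement a general parser library for purely educational purposes. The parser should operate on an arbitrary std::istream. So far I have been able to handle LL(1) languages by using istream::unget to roll back the std::istream when the lookahead does not match.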
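To make the single-character case concrete, here is a minimal sketch of the kind of LL(1) rollback described above. The helper name match_char and the demo function are made up for illustration and are not part of my library; the sketch only assumes that the stream supports ungetting the one character that was just read.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper: consume `expected` if it is the next character,
// otherwise leave the stream positioned where it was.
bool match_char(std::istream& in, char expected) {
    using traits = std::istream::traits_type;
    const traits::int_type c = in.get();
    if (traits::eq_int_type(c, traits::eof())) {
        // Nothing was extracted; clear eofbit/failbit so that the caller
        // can still try another alternative on this stream.
        in.clear();
        return false;
    }
    if (traits::to_char_type(c) == expected) {
        return true;
    }
    in.unget();  // one character of lookahead did not match: roll it back
    return false;
}

// Small usage check on an in-memory stream.
void demo_match_char() {
    std::istringstream in{"ab"};
    assert(!match_char(in, 'x'));  // mismatch: 'a' is put back
    assert(match_char(in, 'a'));
    assert(match_char(in, 'b'));
    assert(!match_char(in, 'c'));  // end of input
}
```

This works as long as a single character of rollback is enough, which is exactly the LL(1) situation.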
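But now I am failing to implement rollback for an alternative operator that backtracks with lookahead of arbitrary length. My first idea was to inherit from std::istream and override istream::get so that the characters read are stored in a temporary buffer. If parsing fails, they could then be put back with istream::putback, and I could pass the original std::istream on to the second parser. Sadly, istream::get is not a virtual method, so that does not work.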
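The record-and-put-back idea itself can be sketched without overriding anything, by wrapping the stream instead of inheriting from it. Everything below (recording_reader, parse_literal, parse_either) is a hypothetical illustration, not my actual library code. It relies on the stream accepting as many putback calls as characters were read, which holds for std::istringstream but is not something an arbitrary stream has to support.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical wrapper that remembers every character it consumed.
class recording_reader {
    std::istream& _in;
    std::string _consumed;
public:
    explicit recording_reader(std::istream& in) : _in{in} {}

    // Read one character and remember it; returns false at end of input.
    bool get(char& c) {
        if (!_in.get(c)) {
            _in.clear();  // keep the stream usable for a later rollback
            return false;
        }
        _consumed.push_back(c);
        return true;
    }

    // Put everything back, last character first.
    // Returns false if the stream refuses a putback.
    bool rollback() {
        while (!_consumed.empty()) {
            if (!_in.putback(_consumed.back())) return false;
            _consumed.pop_back();
        }
        return true;
    }

    // Forget the recorded characters after a successful parse.
    void commit() { _consumed.clear(); }
};

// Hypothetical parser: succeeds if the input starts with `word`.
bool parse_literal(recording_reader& r, const std::string& word) {
    for (char expected : word) {
        char c;
        if (!r.get(c) || c != expected) return false;
    }
    return true;
}

// Alternative operator: try `first`; on failure roll back and try `second`.
bool parse_either(std::istream& in, const std::string& first,
                  const std::string& second) {
    recording_reader r{in};
    if (parse_literal(r, first)) { r.commit(); return true; }
    if (!r.rollback()) return false;
    if (parse_literal(r, second)) { r.commit(); return true; }
    r.rollback();
    return false;
}

// Small usage check: "format" fails at the fourth character,
// the four consumed characters are put back, and "fortran" matches.
void demo_parse_either() {
    std::istringstream in{"fortran"};
    assert(parse_either(in, "format", "fortran"));
}
```

The drawback is that every parser has to go through the wrapper instead of a plain std::istream, which is what I wanted to avoid in the first place.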
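It seems I have to create a custom std::streambuf with rollback logic that decorates an arbitrary other std::streambuf, even one that does not itself support any kind of putback. However, I am not sure how to achieve this, or whether it is possible at all.
Does anyone have an idea?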
Since Sam Varshavchik wrote in the comments that this is the right approach, I tried to implement such a std::streambuf as follows:
#include <algorithm>
#include <iterator>
#include <streambuf>
#include <vector>

class streambuf_rollback_decorator : public std::streambuf {
private:
    std::streambuf* _buffer;
    std::vector<char_type> _readChars;  // to store read characters
    // Read characters are moved here when a rollback cannot be fully put
    // back into the underlying streambuf. Chars are stored in reverse
    // order so that back() can be used to retrieve the next char.
    std::vector<char_type> _rollbackBuffer;

public:
    streambuf_rollback_decorator(std::streambuf* buffer)
        : _buffer{buffer} {
    }

    // notify the streambuf that the rollback is not needed,
    // so it can clear the saved characters
    void submitReads() {
        _readChars.clear();
    }

    void rollback() {
        std::move(_readChars.rbegin(), _readChars.rend(), std::back_inserter(_rollbackBuffer));
        _readChars.clear();
    }

private:
    int_type underflow() override {
        if (!_rollbackBuffer.empty()) {
            // if there are characters in the rollback buffer take one from there
            auto c = traits_type::to_int_type(_rollbackBuffer.back());
            _readChars.push_back(c);
            _rollbackBuffer.pop_back();
            return c;
        } else {
            // otherwise return a character from the underlying streambuf
            char_type c;
            auto num_chars = _buffer->sgetn(&c, 1);
            if (num_chars == 0) {
                return traits_type::eof();
            } else {
                _readChars.emplace_back(c);
                return traits_type::to_int_type(c);
            }
        }
    }

    int_type uflow() override {
        return underflow();
    }

    int_type pbackfail(int_type ch) override {
        char_type c;
        if (ch == traits_type::eof() && !_readChars.empty()) {
            c = _readChars.back();
        } else if (ch != traits_type::eof()) {
            c = traits_type::to_char_type(ch);
        } else {
            return traits_type::eof();
        }
        if (!_readChars.empty()) {
            _readChars.pop_back();
        }
        if (_rollbackBuffer.empty()) {
            // first try to put the char back into the underlying streambuf
            auto succ = _buffer->sputbackc(c);
            if (succ == traits_type::eof()) {
                // if the char couldn't be put back into the underlying
                // streambuf, emplace it into the _rollbackBuffer
                _rollbackBuffer.emplace_back(c);
            }
        } else {
            // if there are already characters in the rollback buffer the char
            // needs to be emplaced there
            _rollbackBuffer.emplace_back(c);
        }
        return traits_type::to_int_type(c);
    }

    std::streamsize showmanyc() override {
        return _rollbackBuffer.empty() ? 0 : _rollbackBuffer.size();
    }
};
In addition, I created a std::istream decorator that uses this std::streambuf:
#include <istream>
#include <memory>

class rollback_istream_decorator : public std::istream {
private:
    std::unique_ptr<streambuf_rollback_decorator> _rollback_streambuf;

public:
    rollback_istream_decorator(std::istream& s)
        : std::istream{s.rdbuf()},
          _rollback_streambuf{std::make_unique<streambuf_rollback_decorator>(s.rdbuf())} {
        this->rdbuf(_rollback_streambuf.get());
    }

    rollback_istream_decorator(std::streambuf* buf)
        : std::istream{buf},
          _rollback_streambuf{std::make_unique<streambuf_rollback_decorator>(buf)} {
        this->rdbuf(_rollback_streambuf.get());
    }

    rollback_istream_decorator& operator=(rollback_istream_decorator const&) = delete;
    rollback_istream_decorator(rollback_istream_decorator const&) = delete;
    rollback_istream_decorator& operator=(rollback_istream_decorator&&) = default;
    rollback_istream_decorator(rollback_istream_decorator&&) = default;
    ~rollback_istream_decorator() = default;

    void submitReads() {
        _rollback_streambuf->submitReads();
    }

    void rollback() {
        _rollback_streambuf->rollback();
    }
};
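It seems to work fine for unformatted input, but sadly formatted input is broken. For example, the following simple program behaves differently from the same program using a regular std::cin: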
#include <iostream>
#include <string>

int main() {
    rollback_istream_decorator istream{std::cin};
    std::string rem;
    istream >> rem;
    std::cout << rem << '\n';
    return 0;
}
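In this example, for input of even length I have to press Enter twice before any output appears, and only every other character is printed. What am I doing wrong?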
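For comparison, here is a minimal sketch of an unbuffered stream buffer that does behave correctly with formatted input. As far as I understand the std::streambuf interface, underflow() is expected to return the current character without consuming it, while uflow() returns it and advances. My decorator above consumes a character in both functions, which may be related to the skipped characters. The class peeking_streambuf and its in-memory string source are made up purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical unbuffered source over a string. It never sets up a get
// area, so every read goes through underflow() or uflow().
class peeking_streambuf : public std::streambuf {
    std::string _data;
    std::size_t _pos = 0;

public:
    explicit peeking_streambuf(std::string data) : _data{std::move(data)} {}

protected:
    // Peek: return the current character but do not advance.
    int_type underflow() override {
        if (_pos == _data.size()) return traits_type::eof();
        return traits_type::to_int_type(_data[_pos]);
    }

    // Consume: return the current character and advance past it.
    int_type uflow() override {
        if (_pos == _data.size()) return traits_type::eof();
        return traits_type::to_int_type(_data[_pos++]);
    }
};

// Small usage check: formatted extraction sees every character.
void demo_peeking_streambuf() {
    peeking_streambuf buf{"hello world"};
    std::istream in{&buf};
    std::string first, second;
    in >> first >> second;
    assert(first == "hello");
    assert(second == "world");
}
```

With this split between peeking and consuming, operator>> reads both words completely in the check above.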