Question

Suppose I want to parse two doubles separated by a comma and return their sum. In Haskell I might do this:

import Data.Attoparsec.Text
import Data.Text (pack)
dblParse = (\a -> fst a + snd a) <$> ((,) <$> double <* char ',' <*> double)
parseOnly dblParse $ pack "1,2"

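The parseOnly expression yields (Right 3) :: Either String Double, Either being how Haskell usually handles errors.

You can see how this works: (,) <$> double <*> double produces a Parser (Double, Double), and applying (\a -> fst a + snd a) turns it into a Parser Double.

I am trying to do the same thing in Qi, but where I expect to get back 3, I actually get back 1: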

namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;

namespace phx = boost::phoenix;

struct cat
{
    double q;
};

BOOST_FUSION_ADAPT_STRUCT(cat, q)
template <typename Iterator>
struct cat_parser : qi::grammar<Iterator, cat()>
{
    cat_parser() : cat_parser::base_type(start)
    {
        using qi::int_;
        using qi::double_;
        using qi::repeat;
        using qi::eoi;
        using qi::_1;
        double a;
        start %= double_[phx::ref(a) =_1] >> ',' >> double_[a + _1];
    }
    qi::rule<Iterator, cat()> start;
};

int main()
    {

        std::string wat("1,2");
        cat_parser<std::string::const_iterator> f;
        cat example;
        std::string::const_iterator st = wat.begin();
        std::string::const_iterator en = wat.end();
        std::cout << parse(st, en, f, example) << std::endl;
        std::cout << example.q << std::endl;
        return 0;
}

My question is twofold: is this the idiomatic way to do this with Spirit, and why do I get 1 instead of 3?

Answer 1

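First, the quick answer

Why do I get 1 instead of 3?

You probably get 1 because that is the exposed attribute.³

However, because of undefined behavior, you cannot reason about your code.

Your semantic actions:

invoke UB: you assign to a, whose lifetime ends at the end of the parser constructor. This is random memory corruption.
are ineffective: the action [a+_1] is an expression that yields a temporary value, namely the sum of /whatever is at the memory location that used to hold the local variable a at the time of parser construction/ and the attribute exposed by the subject parser (double_). In this case it would be "? + 2.0", but that does not matter at all, because nothing is done with the result: it is simply discarded.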

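The normal answer

The requirement is simply:

Suppose I want to parse two doubles separated by a comma and return their sum

Here is how we would do that: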

double parseDoublesAndSum(std::istream& is) {
    double a, b; char comma;
    if (is >> a >> comma && comma == ',' && is >> b)
        return a + b;

    is.setstate(std::ios::failbit);
    return 0;
}

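See it Live On Coliru.

Yes, but using Spirit

I see :)

Well, first we note that the exposed attribute is a double, not a list.

The next step is to realize that the individual elements of the list are not of interest. We can just initialize the result to 0 and use it to accumulate the elements¹, for example:

Live On Coliru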

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <stdexcept>
#include <string>

double parseDoublesAndSum(std::string const& source) {
    double result = 0;

    {
        using namespace boost::spirit::qi;
        namespace px = boost::phoenix;

        bool ok = parse(source.begin(), source.end(), double_ [ px::ref(result) += _1 ] % ',');
        if (!ok)
            throw std::invalid_argument("source: expect comma delimited list of doubles");
    }

    return result;
}

void test(std::string input) {
    try {
        std::cout << "'" << input << "' -> " << parseDoublesAndSum(input) << "\n";
    } catch (std::exception const& e) {
        std::cout << "'" << input << "' -> " << e.what() << "\n";
    }
}
int main() {
    test("1,2");
    test("1,2,3");
    test("1,2,3");
    test("1,2,inf,4");
    test("1,2,-inf,4,5,+inf");
    test("1,2,-NaN");
    test("1,,");
    test("1");
    test("aaa,1");
}

Prints

'1,2' -> 3
'1,2,3' -> 6
'1,2,3' -> 6
'1,2,inf,4' -> inf
'1,2,-inf,4,5,+inf' -> -nan
'1,2,-NaN' -> -nan
'1,,' -> 1
'1' -> 1
'aaa,1' -> 'aaa,1' -> source: expect comma delimited list of doubles

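Advanced things:

Whoa, "1,," should not have parsed!

It didn't :) We formulated the parser so that it does not expect the full input to be consumed. The fix: append >> eoi: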
```
bool ok = parse(source.begin(), source.end(), double_ [ px::ref(result) += _1 ] % ',' >> eoi);
```
Now the relevant test case prints
```
'1,,' -> '1,,' -> source: expect comma delimited list of doubles
```
What if we want the diagnostic to mention that end of input (eoi) was expected? Make it an expectation point, > eoi:
```
bool ok = parse(source.begin(), source.end(), double_ [ px::ref(result) += _1 ] % ',' > eoi);
```
Now prints
```
'1,,' -> '1,,' -> boost::spirit::qi::expectation_failure
```
This can be improved by handling that exception type:

Live On Coliru

Prints
```
'1,,' -> Expecting <eoi> at ',,'
```

How about accepting spaces?

Just use a skipper, via phrase_parse, which allows whitespace outside lexemes:²

bool ok = phrase_parse(source.begin(), source.end(), double_ [ px::ref(result) += _1 ] % ',' > eoi, blank);

Now all blank characters between primitives are ignored:

test("   1, 2   ");

Prints

'   1, 2   ' -> 3

How do I package it up as a rule?

Like I mentioned, realize that you can use the rule's exposed attribute as the accumulator register:

namespace Parsers {
    static const qi::rule<iterator, double(), qi::blank_type> product
        = qi::eps [ qi::_val = 0 ] // initialize
        >> qi::double_ [ qi::_val += qi::_1 ] % ','
        ;
}

Live On Coliru

Prints the same results as before.

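¹ Keep in mind that summation is an interesting topic in its own right: http://www.partow.net/programming/sumtk/index.html

² Primitive parsers are implicitly lexemes, the lexeme[] directive inhibits skipping, and rules declared without a skipper are implicitly lexemes: Boost spirit skipper issues

³ PS. There is a subtlety at play here. If you had written just = instead of %=, the value would have been indeterminate: http://www.boost.org/doc/libs/1_65_1/libs/spirit/doc/html/spirit/qi/reference/nonterminal/rule.html#spirit.qi.reference.nonterminal.rule.expression_semantics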

Applying operations to the exposed attributes of parsers in Qi
