Question

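I'm using PegKit to build a simple domain-specific interpreted language.

I've got basically everything working apart from interpolated strings.

The idea is to implement some sort of rule like this: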

atom = Number | stringLiteral | referenceType;
stringLiteral = "'"! (~"'" | "{"! expression "}"!)*  "'"!;
referenceType = Word ('.' Word)*;

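where the 'expression' production is already defined.

I've inserted some logic here to build up a string from the tokens I need. If we encounter an expression, I evaluate it and add it to the string being built.

The atom and referenceType productions parse perfectly.

However, if I try to parse something like 'hello', the resulting token, while it's running through the atom rule, is always of the built-in Word type.

I've tried replacing the single quotes with dollar signs and other character combinations to denote the start and end of the string, but it never matches.

Any ideas?

Cheers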

Answer 1

Creator of PEGKit here.

Are you sure that the erroneous tokens produced are of type Word? I suspect they may actually be of type QuotedString. The default behavior of PKTokenizer is to produce a QuotedString token for any single- or double-quoted string.

To achieve the result you're looking for, you must alter the tokenizerState of PKTokenizer for the apostrophe (single-quote). By default, this is PKQuoteState (the tokenizer's -quoteState property), but you will need to change that to PKSymbolState (the tokenizer's -symbolState property) so that apostrophes are recognized as single-character tokens of type Symbol instead of the beginning of a multi-character token of type QuotedString.

You can do this in an Action at the top of your grammar (or wherever you are configuring your tokenizer):

@before {
    PKTokenizer *t = self.tokenizer;
    [t setTokenizerState:t.symbolState from:'\'' to:'\''];
}

Now apostrophes will be tokenized as single-character Symbol tokens.
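
To illustrate why the state assigned to the apostrophe matters, here is a toy model in Python. It is not PEGKit and does not use its API: the `tokenize` function and its `quote_chars` parameter are hypothetical names invented for this sketch. It only mimics the general idea that the first character of a token selects the state that consumes it.

```python
def tokenize(text, quote_chars=("'", '"')):
    """Toy tokenizer: the first character of each token picks a 'state'.

    quote_chars is a hypothetical knob standing in for "which characters
    are routed to the quote state". Number handling is omitted for brevity,
    so digits fall through to the symbol branch.
    """
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            # Whitespace is skipped silently in this sketch.
            i += 1
        elif ch in quote_chars:
            # Quote state: swallow everything up to the matching quote
            # (or to the end of input if the quote is never closed).
            end = text.find(ch, i + 1)
            if end == -1:
                end = len(text) - 1
            tokens.append(("QuotedString", text[i:end + 1]))
            i = end + 1
        elif ch.isalpha():
            # Word state: consume a run of letters and digits.
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            tokens.append(("Word", text[i:j]))
            i = j
        else:
            # Symbol state: emit a single-character token.
            tokens.append(("Symbol", ch))
            i += 1
    return tokens


# Apostrophe routed to the quote state: the grammar never sees a bare "'".
print(tokenize("'hello'"))
# Apostrophe routed to the symbol state: "'" arrives as its own token.
print(tokenize("'hi {name}'", quote_chars=('"',)))
```

With the apostrophe in the quote state, the whole literal is swallowed as one token, so a rule that expects a literal "'" followed by other tokens can never match. With the apostrophe in the symbol state, the quote, the words, and the braces arrive as separate tokens that a rule like stringLiteral can consume one at a time.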
