I am trying to parse multiple-choice quizzes with Lark, using the following grammar.
import lark

GRAMMAR = """
start: question choice~3..5
question: QUESTION_NUMBER _QUESTION_NUMBER_SEPARATOR question_body
question_body: LINE+
QUESTION_NUMBER: DIGIT+
_QUESTION_NUMBER_SEPARATOR: WS_INLINE* "." WS_INLINE*
choice.3: CHOICE_NAME ")" choice_body
choice_body: LINE+
CHOICE_NAME.3: ("A" | "B" | "C" | "D" | "E")
LINE: (WORD | PUNCTUATION | WS_INLINE )* NEWLINE
WORD: (LETTER | DIGIT | /[şŞöÖüÜçÇğĞıİâî]/)+
PUNCTUATION: (SEPARATOR | GROUPER | MATHS | OTHER)
SEPARATOR: ("," | "." | ":" | ";" | "?" | "!"| "-")
GROUPER: ("<" | ">" | "[" | "]" | "(" | ")" )
MATHS: ("–" | "+" | "/" | "=" | "÷")
OTHER: /["'_\\\]/
_EOL : WS_INLINE* _NL
_NL : (NEWLINE | /\f/)
%import common.NEWLINE
%import common.LETTER
%import common.DIGIT
%import common.WS_INLINE
"""
parser = lark.Lark(
    GRAMMAR,
    parser="lalr",
    lexer="contextual",
    keep_all_tokens=False,
    debug=True,
)
The questions look like the following example:
1. This is a section of question body.
Another part of the question body.
A) Option A
B) Option B
C) Option C
D) Option D
E) Option E
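Both the question body and the choice bodies may span multiple lines and may contain blank lines.
When I run the code, I get the following error: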
lark.exceptions.UnexpectedCharacters: No terminal defined for 'n' at line 3 col 2
Another part of the question body.
^
Expecting: {'RPAR'}
Apparently the parser tries to treat that line as a choice, and fails because the "A" is not followed by a ")".
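To make that guess concrete without depending on Lark itself, here is a toy sketch of priority-ordered terminal matching using only Python's `re` module. It is not Lark's actual lexer: the helper `first_token`, the simplified `LINE` pattern, and the two terminal tables are hypothetical stand-ins. The second table adds a lookahead to `CHOICE_NAME` purely for contrast; whether Lark behaves the same way with such a terminal is an assumption this sketch does not verify.

```python
import re


def first_token(text, terminals):
    """Return (name, lexeme) for the highest-priority terminal matching at the start of text."""
    # Higher priority is tried first; sorted() is stable, so ties keep table order.
    for name, priority, pattern in sorted(terminals, key=lambda t: -t[1]):
        match = re.match(pattern, text)
        if match:
            return name, match.group(0)
    return None


# Simplified stand-in for the LINE terminal: anything up to and including a newline.
LINE = r"[^\n]*\n"

# CHOICE_NAME as in the grammar: a bare letter with priority 3.
greedy = [("CHOICE_NAME", 3, r"[A-E]"), ("LINE", 0, LINE)]
# Hypothetical variant: the letter only counts when ")" follows it.
guarded = [("CHOICE_NAME", 3, r"[A-E](?=\))"), ("LINE", 0, LINE)]

body_line = "Another part of the question body.\n"
choice_line = "A) Option A\n"

print(first_token(body_line, greedy))     # the bare letter wins over LINE
print(first_token(body_line, guarded))    # the lookahead fails, so LINE matches
print(first_token(choice_line, guarded))  # a real choice still yields CHOICE_NAME
```

In this toy model the bare `[A-E]` pattern claims the leading "A" of "Another", which would leave a parser expecting ")" and finding "n", matching the error above.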
Which choice letter is involved does not matter: for example, the next input fails for the same reason.
1. This is a section of question body.
Be a part of the question body.
A) Option A
B) Option B
C) Option C
D) Option D
E) Option E
It gives the same error:
lark.exceptions.UnexpectedCharacters: No terminal defined for 'e' at line 3 col 2
Be a part of the question body.
^
Expecting: {'RPAR'}
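However, any line that does not start with one of the letters A, B, C, D, or E is parsed successfully as part of the question body. For example, this works: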
1. This is a section of question body.
Second part of the question body.
A) Option A
B) Option B
C) Option C
D) Option D
E) Option E
# The program's output is:
{
    "question": {
        "body": "This is a section of question body.\n\nSecond part of the question body. \n\n",
        "number": "1",
    },
    "choices": (
        {"name": "A", "body": " Option A\n"},
        {"name": "B", "body": " Option B\n"},
        {"name": "C", "body": "Option C\n"},
        {"name": "D", "body": " Option D\n"},
        {"name": "E", "body": "Option E \n\n"},
    ),
}
What am I doing wrong in my grammar?