I am trying to write a custom Pygments lexer for LaTeX, starting from the default lexer. So far I have:
from pygments.lexer import RegexLexer, include
from pygments.token import *

# We rip off the default styles.
Normaltext = Generic.Inserted
Mathtext = Generic.Deleted
Controlsequence = Keyword
Controlword = Controlsequence
Controlsymbol = Keyword.Pseudo
Specialchar = Name.Builtin


class MyTexLexer(RegexLexer):
    """
    Custom lexer for the TeX and LaTeX typesetting languages.
    """
    tokens = {
        'general': [
            (r'%.*?\n', Comment),
            (r'\\[a-zA-Z]+', Controlword),
            (r'\\.', Controlsymbol),
            (r'\\$', Controlsymbol),
            (r'[&_^{}]', Specialchar),
        ],
        'root': [
            (r'\$\$', Specialchar, 'displaymath'),
            (r'\\\(', Controlsymbol, 'inlinemath'),
            (r'\$', Specialchar, 'inlinemath'),
            (r'\\\[', Controlsymbol, 'displaymath'),
            include('general'),
            (r'[^\\$%&_^{}]+', Normaltext),
        ],
        'math': [
            include('general'),
            (r'[^\\$%&_^{}]+', Mathtext),
        ],
        'inlinemath': [
            (r'\\\)', Controlsymbol, '#pop'),
            (r'\$', Specialchar, '#pop'),
            include('math'),
        ],
        'displaymath': [
            (r'\\\]', Controlsymbol, '#pop'),
            (r'\$\$', Specialchar, '#pop'),
            include('math'),
        ],
    }
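Now, TeX is a bit special because of category codes, which allow the language itself to change during compilation. I don't need support for all of that.

One thing I do need regularly is support for \makeatletter and \makeatother, which essentially allow or disallow @ in control word names. I would like the control word regex to effectively become r'\\[a-zA-Z@]+' as soon as \makeatletter is found, and to go back to r'\\[a-zA-Z]+' as soon as \makeatother is found.

Is there a way to change the behavior of the class dynamically like this?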
I'm not sure a TeX example is needed, but just to be safe, this should clarify the desired behavior:
% Here, \relax is a macro name and @or@do@something is just text:
\relax@or@do@something
\makeatletter
% Here, \relax@or@do@something is the macro name:
\relax@or@do@something
\makeatother
% Here, it's like in the first case:
\relax@or@do@something
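
One direction that might produce this behavior without modifying the class at runtime is an extra lexer state, entered on \makeatletter and popped on \makeatother, in which the control word rule also accepts @. The following is only a minimal sketch under assumptions: the class name AtLetterTexLexer, the state names 'shared' and 'atletter', and the helper control_words are hypothetical, the math states from the lexer above are left out for brevity (they would need an @-aware counterpart as well), and nested or unbalanced \makeatletter / \makeatother pairs are not handled.

```python
from pygments.lexer import RegexLexer, include
from pygments.token import Comment, Generic, Keyword, Name

# Same token aliases as in the lexer above.
Normaltext = Generic.Inserted
Controlword = Keyword
Controlsymbol = Keyword.Pseudo
Specialchar = Name.Builtin


class AtLetterTexLexer(RegexLexer):
    # Hypothetical sketch: '@' is treated as a letter only while the
    # 'atletter' state is on top of the state stack.
    name = 'AtLetterTeX'

    tokens = {
        # Rules that do not depend on whether '@' is a letter.
        'shared': [
            (r'%.*?\n', Comment),
            (r'\\.', Controlsymbol),
            (r'\\$', Controlsymbol),
            (r'[&_^{}$]', Specialchar),
            (r'[^\\$%&_^{}]+', Normaltext),
        ],
        'root': [
            # Entering the '@ is a letter' mode.
            (r'\\makeatletter(?![a-zA-Z])', Controlword, 'atletter'),
            (r'\\[a-zA-Z]+', Controlword),
            include('shared'),
        ],
        'atletter': [
            # Leaving the mode again; must be tried before the generic rule.
            (r'\\makeatother(?![a-zA-Z@])', Controlword, '#pop'),
            (r'\\[a-zA-Z@]+', Controlword),
            include('shared'),
        ],
    }


def control_words(source):
    """Return the text of every token lexed as a control word."""
    lexer = AtLetterTexLexer()
    return [value
            for _, ttype, value in lexer.get_tokens_unprocessed(source)
            if ttype is Controlword]


sample = r"""\relax@or@do@something
\makeatletter
\relax@or@do@something
\makeatother
\relax@or@do@something
"""

print(control_words(sample))
```

With this sample, only the occurrence between \makeatletter and \makeatother is reported as the single control word \relax@or@do@something. In the other two places the control word is just \relax and the @or@do@something part is lexed as normal text. The trade-off of this design is that every state reachable in "@ is a letter" mode has to be duplicated, which stays manageable for a single switch like this one but would not scale to arbitrary category code changes.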