Question

I'm writing a syntax highlighting file for all .bed files. The exact contents of each column can vary, but they typically look like this:

chr1    11873   14409   uc001aaa.3  0   + 11873 11873   0   3   354,109,1189,   0,739,1347,
chr21   1000000 1230000 peakValue   200 -
chrX    11873   14409   selection
....
<string>  <numeric> <numeric> <string> <numeric 1-1000> <+ or - or .> <numeric> <numeric> <numeric> <numeric> <comma separated list> <comma separated list>

So far I have the first column selection and the strand working:

bed.lang

<?xml version="1.0" encoding="UTF-8"?>

<language id="bed" _name="Bed" version="2.0" _section="Scientific">
  <metadata>
    <property name="mimetypes">text/bed</property>
    <property name="globs">*.bed</property>
  </metadata>

  <styles>

    <style id="chrom"        _name="Chrom"    map-to="bed:chr" />
    <style id="strand"       _name="Coords"   map-to="bed:strand" />

  </styles>

  <definitions>
    <context id="bed">
      <include>

        <context id="1_chr" style-ref="chrom">
          <match extended="true">
            ^\w+
          </match>
        </context>

        <context id="6_strand" style-ref="strand">
          <match extended="true">
            \t[+\-\.]\t
          </match>
        </context>

      </include>
    </context>
  </definitions>
</language>

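I'd like to extend this so that each column is formatted differently according to a scheme I can define, i.e. coordinates are one colour, names another, scores another. The problem is that things like coordinates and scores are both numeric strings.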

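The 'simplest' solution I can see would be a regex that selects column N and, if N is greater than the number of columns, returns nothing (without wrapping onto the next line).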

Lookbehinds don't seem to work (because of the '' characters in the regex). Some regexes I've tried, but which have not behaved well, are:

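Building up the matches iteratively, each with its own formatting, doesn't work. Multiple selections of the same string cause all syntax highlighting to fail.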
```
^.+?\t
^.+?\t.+?\t
^.+?\t.+?\t.+?\t ...
```
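
One possible direction, sketched here rather than tested: instead of several contexts that each start matching at `^`, use a single context whose match covers the leading columns of the line, with one capture group per column, and style each group through a `sub-pattern` context. This assumes GtkSourceView's `sub-pattern` contexts behave as its language-definition reference describes. The ids and the `coord`, `name` and `score` styles are hypothetical and would need matching `<style>` entries. The context would replace `1_chr` and `6_strand` rather than sit alongside them.

```xml
<!-- Sketch only: ids, extra style names and the regex are illustrative assumptions. -->
<!-- Groups: 1 chrom, 2 start, 3 end, 4 name, 5 score, 6 strand. -->
<!-- Columns 4 to 6 are nested as optional so that shorter lines still match. -->
<context id="bed-line">
  <match extended="true">
    ^([^\t]+)\t([0-9]+)\t([0-9]+)
    (?:\t([^\t]+)
      (?:\t([0-9]+)
        (?:\t([+\-.]))?
      )?
    )?
  </match>
  <include>
    <context id="bed-chrom"  sub-pattern="1" style-ref="chrom"/>
    <context id="bed-start"  sub-pattern="2" style-ref="coord"/>
    <context id="bed-end"    sub-pattern="3" style-ref="coord"/>
    <context id="bed-name"   sub-pattern="4" style-ref="name"/>
    <context id="bed-score"  sub-pattern="5" style-ref="score"/>
    <context id="bed-strand" sub-pattern="6" style-ref="strand"/>
  </include>
</context>
```

Because the column position is fixed by the group number, coordinates and scores can get different styles even though both are numeric strings. Columns 7 to 12 are left unstyled in this sketch and could be added as further nested optional groups.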

Selecting 'numeric strings'

Single numeric string
(?<=^\w\t)[0-9]+(?=\t)

Numeric string doublets
(?<=\t)[0-9]+\t[0-9]+(?=\t){1}

I'll keep hacking away at an ugly solution, but I'm wondering if there is something elegant I haven't thought of.

Syntax highlighting by tab-separated columns in gtksourceview

0 answers: