Question

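I have a LaTeX-generated .toc file that contains the table of contents of a large document. I would like to extract the TOC into a (GitHub-)markdown list, e.g. with pandoc.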

E.g. I have

\contentsline {chapter}{\numberline {1}Introduction}{1}{chapter.1}
\contentsline {section}{\numberline {1.1}Aim and Motivation}{1}{section.1.1}
\contentsline {section}{\numberline {1.2}State of the art}{1}{section.1.2}
\contentsline {section}{\numberline {1.3}Outline}{1}{section.1.3}
\contentsline {chapter}{\numberline {2}Fundamentals}{2}{chapter.2}
...

in my .toc file

and would like to get something like this:

1. Introduction
  1.1. Aim and Motivation
  1.2. State-of-the-art
  1.3. Outline
2. Fundamentals

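Another alternative would be to extract this information directly from the tex file (without the content). However, I could not get that to work, and I also think it would be more error-prone.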

Any suggestions?

Answer 1

"Another alternative would be to extract this information directly from the tex file."

Pandoc can do this:

$ pandoc -s --toc input.tex -o output.md

To exclude the document body, you have to use a custom pandoc markdown template:

$ pandoc -D markdown > mytemplate.md

Edit mytemplate.md so that it keeps $toc$ and drops $body$, then run pandoc --template mytemplate.md ...

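If you want to customize it further, I would suggest outputting to HTML (pandoc -t html) instead of markdown, and then writing a small script that walks the HTML DOM and does the numbering etc.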
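A minimal sketch of such a script, in Python with only the standard library. It assumes the TOC arrives as nested <ul>/<li>/<a> lists and that pandoc was run without --number-sections, so the link text holds only the title; the class name TocNumberer and the sample HTML are made up for illustration and are not taken from real pandoc output.

```python
from html.parser import HTMLParser


class TocNumberer(HTMLParser):
    """Turn nested <ul>/<li>/<a> lists into a numbered, indented outline."""

    def __init__(self):
        super().__init__()
        self.counters = []    # one running counter per open <ul> level
        self.in_link = False  # True while inside an <a> element
        self.title = []       # text fragments of the current link
        self.lines = []       # finished outline lines

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.counters.append(0)      # a new, deeper level starts at 0
        elif tag == "li" and self.counters:
            self.counters[-1] += 1       # next entry on the current level
        elif tag == "a" and self.counters:
            self.in_link = True
            self.title = []

    def handle_data(self, data):
        if self.in_link:
            self.title.append(data)

    def handle_endtag(self, tag):
        if tag == "ul" and self.counters:
            self.counters.pop()
        elif tag == "a" and self.in_link:
            self.in_link = False
            number = ".".join(str(n) for n in self.counters)
            indent = "  " * (len(self.counters) - 1)
            title = "".join(self.title).strip()
            self.lines.append(f"{indent}{number}. {title}")


# Hypothetical input, shaped like a nested HTML table of contents.
SAMPLE_HTML = """
<nav id="TOC"><ul>
<li><a href="#introduction">Introduction</a><ul>
<li><a href="#aim-and-motivation">Aim and Motivation</a></li>
<li><a href="#state-of-the-art">State of the art</a></li>
</ul></li>
<li><a href="#fundamentals">Fundamentals</a></li>
</ul></nav>
"""

numberer = TocNumberer()
numberer.feed(SAMPLE_HTML)
print("\n".join(numberer.lines))
```

The numbers come from the position in the list rather than from LaTeX, so unnumbered sections would still be counted; adjust the counters if that matters for your document.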

Answer 2

Unfortunately, in my case Pandoc created an empty Markdown file. I wrote an open-source CLI tool that performs this conversion: https://github.com/MaaxGr/latex-toc-markdown

Download the binary (see the README on the GitHub page) and run the following command:

./latex-toc-markdown Input.toc Output.toc

The output file will look like this:

* 1 Introduction
  * 1.1 Aim and Motivation
  * 1.2 State of the art
  * 1.3 Outline
* 2 Fundamentals
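
If you would rather not download a binary, the same kind of conversion can be sketched in a few lines of Python. This is not how the tool above is implemented, just a rough illustration: the regular expression, the DEPTH mapping and the function name toc_to_markdown are my own assumptions, and only numbered entries in the \contentsline format shown in the question are handled.

```python
import re

# Matches entries such as:
# \contentsline {section}{\numberline {1.1}Aim and Motivation}{1}{section.1.1}
CONTENTSLINE = re.compile(
    r"\\contentsline\s*\{(?P<level>\w+)\}"
    r"\{\\numberline\s*\{(?P<number>[^}]*)\}(?P<title>.*?)\}"
    r"\{(?P<page>[^}]*)\}"
)

# Assumed mapping from sectioning level to list nesting depth.
DEPTH = {"chapter": 0, "section": 1, "subsection": 2, "subsubsection": 3}


def toc_to_markdown(toc_text):
    """Convert the text of a .toc file into a nested markdown bullet list."""
    lines = []
    for raw in toc_text.splitlines():
        match = CONTENTSLINE.match(raw.strip())
        if match is None or match["level"] not in DEPTH:
            continue  # skip unnumbered or unrecognised entries
        indent = "  " * DEPTH[match["level"]]
        lines.append(f"{indent}* {match['number']} {match['title']}")
    return "\n".join(lines)


# The sample entries from the question.
SAMPLE = r"""
\contentsline {chapter}{\numberline {1}Introduction}{1}{chapter.1}
\contentsline {section}{\numberline {1.1}Aim and Motivation}{1}{section.1.1}
\contentsline {section}{\numberline {1.2}State of the art}{1}{section.1.2}
\contentsline {section}{\numberline {1.3}Outline}{1}{section.1.3}
\contentsline {chapter}{\numberline {2}Fundamentals}{2}{chapter.2}
"""

print(toc_to_markdown(SAMPLE))
```

For the sample entries this prints the same bullet list as the output file above. Titles containing LaTeX macros are copied through unchanged, so they may need extra cleanup.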

Generate a markdown table of contents from a LaTeX .toc file
