Question

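I have a bunch of LaTeX files that use the \input{filename.tex} macro (it works like #include in C), and I want to resolve them so that I can output everything as a single .tex file (each file must be pasted in place of its \input{} macro; it is safe to assume that each file is referenced only once).

Example:

tesis.tex: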

My thesis.
\input{chapter1.tex}
More things
\input{chapter2.tex}

chapter1.tex:

Chapter 1 content.

chapter2.tex:

Chapter 2 content.
\include{section2-2.tex}

section2-2.tex:

Section 1.

The expected result should be:

My thesis.
Chapter 1 content.
More things
Chapter 2 content.
Section 1.

If there were only a single level of \input{foo.tex}, I could solve this with the following AWK program:

/\\input\{.*\}/{
    sub(/^[^{]*{/,"",$0)
    sub(/}[^}]*$/,"",$0)
    system("cat " $0)
    next
}

{
    print $0
}

Is there a way to read files recursively in AWK?

(I am open to any other language, but POSIX is preferred.)

Thanks!

Answer 1

Here is a solution in awk that uses getline inside a recursive function to do the job. I am assuming chapter2.tex is:

Chapter 2 content.
\input{section2-2.tex}

Code:

$ cat program.awk
function recurse(file) {              # the recursive function definition
    while((getline line<file) >0) {   # read parameter given file line by line
        if(line~/^\\input/) {         # if line starts with \input 
            gsub(/^.*{|}.*$/,"",line) # read the filename from inside {}
#           print "FILE: " line       # debug
            recurse(line)             # make the recursive function call
        }
        else print line               # print records without \input
    }
    close(file)                       # after file processed close it
}
{                                     # main program used to just call recurse()
    recurse(FILENAME)                 # called
    exit                              # once called, exit
}

Run it:

$ awk -f program.awk tesis.tex
My thesis.
Chapter 1 content.
More things
Chapter 2 content.
Section 1.

The solution expects \input to be at the beginning of the record, with no other data on that line.
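The restriction just mentioned can be relaxed a little. What follows is only a minimal sketch, not the answer's original program: it assumes there is at most one \input{...} or \include{...} per line, allows whitespace around it, and assumes file names are written with their extension, as in the question. The names flatten and flatten.awk, the BEGIN entry point and the temporary directory are made up for the illustration. Declaring line and name as extra parameters makes them local to each call, which keeps the recursion from clobbering its own state.

```shell
#!/bin/sh
# Sketch: recursive flattening of \input{...} and \include{...} with awk.
# Assumption: each such macro sits alone on its line (surrounding blanks allowed).

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample files mirroring the example in the question.
printf 'My thesis.\n\\input{chapter1.tex}\nMore things\n\\input{chapter2.tex}\n' > tesis.tex
printf 'Chapter 1 content.\n' > chapter1.tex
printf 'Chapter 2 content.\n  \\include{section2-2.tex}\n' > chapter2.tex
printf 'Section 1.\n' > section2-2.tex

cat > flatten.awk <<'EOF'
function flatten(file,    line, name) {     # line and name are local to each call
    while ((getline line < file) > 0) {
        if (line ~ /^[ \t]*\\(input|include)[{][^}]*[}][ \t]*$/) {
            name = line
            sub(/^[^{]*[{]/, "", name)      # drop everything up to and including "{"
            sub(/[}].*$/, "", name)         # drop "}" and anything after it
            flatten(name)                   # recurse into the referenced file
        } else {
            print line                      # ordinary line: copy it through
        }
    }
    close(file)
}
BEGIN { flatten(ARGV[1]); exit }            # hypothetical entry point: first argument is the root file
EOF

result=$(awk -f flatten.awk tesis.tex)
printf '%s\n' "$result"

cd / && rm -rf "$workdir"
```

Like the program above, this does not guard against a file that includes itself, which would recurse without end; the question's guarantee that each file is referenced only once is what makes that acceptable here.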

Answer 2

Since you have also tagged this as bash, something like this could work in bash, but it has not been tested:

#!/bin/bash
function texextract {
while read -r line;do    
    if [[ "$line" =~ "input" || "$line" =~ "include" ]];then  #regex may need fine-tuning
      filename="${line: 0:-1}"  #removes the last } from \include{section2-2.tex}
      filename="${filename##*{}" #removes from start up to { ---> filename=section2-2.tex
      texextract "$filename"  #call itself with new args
    else
      echo "$line" >>commonbigfile
    fi
done <"$1" #$1 holds the filename sent by the caller
return
}

texextract tesis.tex #masterfile

In bash 4.4 (and probably in other versions as well), a function can call itself. That is what I am using here.
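As the comment in the script says, the match may need tuning: a plain substring test also fires on ordinary text that happens to contain the word "input". One possible tightening is sketched below, under the assumptions that bash is installed and that each \input{...} or \include{...} sits alone on its line. The regular expression, the use of BASH_REMATCH, the file name texextract.sh and the choice to write to standard output instead of appending to commonbigfile are all choices made for this illustration, not part of the answer.

```shell
#!/bin/sh
# Sketch: a stricter variant of the recursive bash function.
# Assumptions: bash is available; one \input{...}/\include{...} per line.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample files; note the ordinary line that contains the word "input".
printf 'My thesis.\n\\input{chapter1.tex}\nMore input things\n\\input{chapter2.tex}\n' > tesis.tex
printf 'Chapter 1 content.\n' > chapter1.tex
printf 'Chapter 2 content.\n\\include{section2-2.tex}\n' > chapter2.tex
printf 'Section 1.\n' > section2-2.tex

cat > texextract.sh <<'EOF'
#!/bin/bash
# Matches \input{name} or \include{name}; the name ends up in BASH_REMATCH[2].
re='^[[:space:]]*\\(input|include)\{([^}]*)\}[[:space:]]*$'

texextract() {
    local line                                # each recursive call gets its own copy
    # IFS= keeps leading whitespace; the || clause handles a last line without a newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $re ]]; then
            texextract "${BASH_REMATCH[2]}"   # recurse into the referenced file
        else
            printf '%s\n' "$line"             # ordinary line: copy it through
        fi
    done < "$1"
}

texextract "$1"
EOF

result=$(bash texextract.sh tesis.tex)
printf '%s\n' "$result"

cd / && rm -rf "$workdir"
```

Writing to standard output lets the caller decide where the merged file goes, for example bash texextract.sh tesis.tex > merged.tex, and avoids accumulating old output across runs.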
