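How to read files recursively in awk

Date: 2017-03-28 23:39:36

Tags: bash recursion awk posix

I have a bunch of LaTeX files that use the \input{filename.tex} macro (it works like #include in C), and I want to resolve these macros so that I can output everything as a single .tex file. Each file's contents must be pasted in place of the \input{} macro that names it, and it is safe to assume that each file is referenced only once.

Example: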

tesis.tex:

My thesis.
\input{chapter1.tex}
More things
\input{chapter2.tex}

chapter1.tex:

Chapter 1 content.

chapter2.tex:

Chapter 2 content.
\include{section2-2.tex}

section2-2.tex:

Section 1.

The desired result is:

My thesis.
Chapter 1 content.
More things
Chapter 2 content.
Section 1.

If there were only a single level of \input{foo.tex}, I could solve this with the following AWK program:

/\\input\{.*\}/{
    sub(/^[^{]*{/,"",$0)
    sub(/}[^}]*$/,"",$0)
    system("cat " $0)
    next
}

{
    print $0
}

Is there a way to read files recursively in AWK?

(I am open to any other language, but POSIX is preferred.)

Thanks!

2 answers:

Answer 0: (score: 2)

Here is a solution in awk that uses getline inside a recursive function to do the job. I assumed that chapter2.tex contains:

Chapter 2 content.
\input{section2-2.tex}

Code:

$ cat program.awk
function recurse(file) {              # the recursive function definition
    while((getline line<file) >0) {   # read parameter given file line by line
        if(line~/^\\input/) {         # if line starts with \input 
            gsub(/^.*{|}.*$/,"",line) # read the filename from inside {}
#           print "FILE: " line       # debug
            recurse(line)             # make the recursive function call
        }
        else print line               # print records without \input
    }
    close(file)                       # after file processed close it
}
{                                     # main program used to just call recurse()
    recurse(FILENAME)                 # called
    exit                              # once called, exit
}

Run it:

$ awk -f program.awk tesis.tex
My thesis.
Chapter 1 content.
More things
Chapter 2 content.
Section 1.

The solution expects \input to be at the very start of the record, with no other data on that line.
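If that restriction matters, a more tolerant variant could be sketched as follows. This is only a sketch, not part of the program shown above: the names flatten, flatten.awk and merged.tex are made up for illustration. It assumes that the text between the braces is the exact file name (extension included, as in the question), treats \include the same as \input, and allows leading whitespace before the macro. Anything after the closing brace on the same line is dropped.

```shell
#!/bin/sh
# Sketch only: build the sample files from the question in a scratch directory.
dir=${TMPDIR:-/tmp}/flatten-awk.$$
mkdir -p "$dir" && cd "$dir" || exit 1

printf '%s\n' 'My thesis.' '\input{chapter1.tex}' 'More things' '  \input{chapter2.tex}' > tesis.tex
printf '%s\n' 'Chapter 1 content.' > chapter1.tex
printf '%s\n' 'Chapter 2 content.' '\include{section2-2.tex}' > chapter2.tex
printf '%s\n' 'Section 1.' > section2-2.tex

cat > flatten.awk <<'EOF'
# "line" is listed as an extra parameter so that it is local to each call.
function flatten(file,    line) {
    while ((getline line < file) > 0) {
        # optional leading blanks, then \input{...} or \include{...}
        if (line ~ /^[ \t]*\\(input|include)[{][^}]*[}]/) {
            sub(/^[^{]*[{]/, "", line)   # drop everything up to and including {
            sub(/[}].*$/, "", line)      # drop } and everything after it
            flatten(line)                # recurse into the named file
        } else
            print line
    }
    close(file)
}
BEGIN { flatten(ARGV[1]); exit }         # exit so awk does not read the file again
EOF

awk -f flatten.awk tesis.tex > merged.tex
cat merged.tex
```

Starting from BEGIN with ARGV[1] avoids reading the first record of the master file twice, and keeping line local means a nested call cannot disturb the caller's current line.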

Answer 1: (score: 0)

Since you also tagged the question bash, something like this could work in bash, but it is untested:

#!/bin/bash
function texextract {
local line filename
local re='^\\(input|include)\{'  # line starts with \input{ or \include{
while IFS= read -r line; do
    if [[ "$line" =~ $re ]]; then
      filename="${line%\}*}"      # removes the last } and anything after it
      filename="${filename##*\{}" # removes from start up to { ---> filename=section2-2.tex
      texextract "$filename"      # call itself with the new filename
    else
      printf '%s\n' "$line" >>commonbigfile  # appends, so delete commonbigfile before a re-run
    fi
done <"$1" # $1 holds the filename sent by the caller
return
}

texextract tesis.tex # master file

In bash 4.4 (and probably in other versions as well), a function can call itself. That is what I rely on here.
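Since the question prefers POSIX, the same recursive idea can probably be expressed in plain sh as well. The following is only a sketch under a few assumptions: the function name flatten and the output name merged.tex are made up, the macro must start at the beginning of the line, and the braces contain the exact file name. POSIX sh has no local variables, so line and name are global. That is harmless here because name is used immediately and line is read again after each recursive call returns.

```shell
#!/bin/sh
# Sketch only: plain POSIX sh, no bash extensions.
dir=${TMPDIR:-/tmp}/flatten-sh.$$
mkdir -p "$dir" && cd "$dir" || exit 1

# Sample files from the question.
printf '%s\n' 'My thesis.' '\input{chapter1.tex}' 'More things' '\input{chapter2.tex}' > tesis.tex
printf '%s\n' 'Chapter 1 content.' > chapter1.tex
printf '%s\n' 'Chapter 2 content.' '\include{section2-2.tex}' > chapter2.tex
printf '%s\n' 'Section 1.' > section2-2.tex

ob='{'   # literal braces kept in variables so the expansions stay unambiguous
cb='}'

flatten() {
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '\input{'*'}'* | '\include{'*'}'*)
                name=${line#*"$ob"}     # strip everything through the first {
                name=${name%%"$cb"*}    # strip from the first } to the end
                flatten "$name"         # recurse into the referenced file
                ;;
            *)
                printf '%s\n' "$line"   # ordinary line: copy to stdout
                ;;
        esac
    done < "$1"
}

flatten tesis.tex > merged.tex
cat merged.tex
```

Writing to stdout and redirecting once at the call site means a second run overwrites the result instead of appending to it.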