Question

I have a bunch of text files in a directory. I need to read them, extract some information, and save it in an Excel or text file.

name1_1.txt

count: 10
totalcount: 30
percentage:33
total no of a's: 20
total no of b's: 20
etc...

name2_2.txt

count: 20
totalcount: 40
percentage:50
total no of a's: 10
total no of b's: 30
etc...

etc...

Output

             name1        name2
 count        10           20
 totalcount   30           40
 percentage   33           50

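I would like the output saved in the same directory, in a file named example.txt (or .csv). Can anyone help?

Here is my attempt at a shell script, but I cannot get the delimiters right or write the content I need to a file.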

 #$ -S /bin/bash


 for sample in *.txt; do
    header=$(echo ${sample} | awk '{sub(/_/," ")}1'| awk '{print $1}')
    echo -en $header"\t"
 done
 echo -e ' \t '
 echo "count"
 for sample in *.txt; do
    grep "count:" $sample | awk -F: $'\t''{print $2}'
 done
 echo "totalcount"
 for sample in *.txt; do
    grep "totalcount:" $sample | awk -F: $'\t''{print $2}'
 done
 echo "percentage"
 for sample in *.txt; do
    grep "percentage:" $sample | awk -F: $'\t''{print $2}'
 done

Answer 1

You can see whether this meets your needs:

awk -F":" 'BEGIN { DELIM="\t" } \
    last_filename != FILENAME { \
        split( FILENAME, farr, "_" ); header = header DELIM farr[1]; \
        last_filename = FILENAME; i=0 } \
    $1 ~ /^count$/ || $1 ~ /^totalcount$/ || $1 ~ /^percentage$/ \
        { a[i] = NR==FNR ? $1 DELIM $2 : a[i] DELIM $2; i++ } \
    END { print header; for( j=0; j in a; j++ ) { print a[j] } }' name*.txt

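I tried to split it across multiple lines to make it "easier" to read. You can remove the trailing "\" from each line and join the lines to turn it back into a one-liner. If I edit this answer again, I will just make it an executable awk file.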

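awk sets DELIM for the output in the BEGIN block.
For each new file, it cleans up FILENAME (keeping the part before the "_") and appends it to the header.
It takes the field names from the first file, along with the data, and stores them in the array a at index i. For each subsequent file, it only appends the data.
At END, it prints the header and then the contents of the array.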
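
The steps above can also be sketched as a small shell function, with one awk rule per step. This is only an illustrative sketch, not the original command: the function name `summarize`, the `delim` parameter, and the trimming of spaces around each value are assumptions added here.

```shell
#!/bin/bash
# summarize: hypothetical helper, prints one column per input file.
# Usage: summarize DELIM file...
summarize() {
    local delim=$1
    shift
    awk -F':' -v delim="$delim" '
        # New file: keep the part of the name before "_" and add it to the header.
        FNR == 1 {
            split(FILENAME, parts, "_")
            header = header delim parts[1]
            row = 0
        }
        # Keep only the three wanted keys (exact comparison on the key).
        $1 == "count" || $1 == "totalcount" || $1 == "percentage" {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim spaces around the value
            # First file stores "key<delim>value"; later files append a value.
            lines[row] = (NR == FNR ? $1 : lines[row]) delim value
            row++
        }
        # Print the header, then the rows in the order they were read.
        END {
            print header
            for (r = 0; r in lines; r++) print lines[r]
        }
    ' "$@"
}
```

For example, `summarize "," name*.txt > example.csv` would write a comma-separated file into the current directory, which most spreadsheet programs can open. The sketch assumes every input file lists the three keys in the same order.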

I then get the following output:

        name1   name2
count    10      20
totalcount       20      40
percentage      33      50

This now picks up only the specified fields from the data, provided $1 exactly matches count, totalcount, or percentage.
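
As a small illustration of why exact matching matters, the sketch below compares an unanchored regex test with an anchored one in awk. The key `account` is a made-up example, not something from the question's files.

```shell
# Hypothetical keys, one per line, to compare the two kinds of test.
keys=$(printf 'count\ntotalcount\naccount\n')

# Unanchored regex: matches any key that contains "count".
loose=$(printf '%s\n' "$keys" | awk '$1 ~ /count/ { n++ } END { print n + 0 }')

# Anchored regex: matches only the keys spelled out in full.
strict=$(printf '%s\n' "$keys" | awk '$1 ~ /^(count|totalcount)$/ { n++ } END { print n + 0 }')

echo "loose=$loose strict=$strict"   # prints: loose=3 strict=2
```

The unanchored test would also pick up the unrelated `account` line, while the anchored test keeps only the two intended keys.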
