My output looks similar to this:
No Type Pid Status Cause Start Rstr Err Sem Time Program Cl User Action Table
-------------------------------------------------------------------------------------------------------------------------------
0 DIA 10897 Wait yes no 0 0 0 NO_ACTION
1 DIA 10903 Wait yes no 0 0 0 NO_ACTION
2 DIA 10909 Wait yes no 0 0 0 NO_ACTION
3 DIA 10916 Wait yes no 0 0 0 NO_ACTION
4 DIA 10917 Wait yes no 0 0 0 NO_ACTION
5 DIA 9061 Wait yes no 1 0 0 NO_ACTION
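However, I want this table to be comma-separated, and a field with no value should be printed as null rather than picking up the next column's value! Currently, I get the following output: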
NO=0,Type=DIA,Pid=10897,Status=Wait,Cause=yes,Start=no,Rstr=0,Err=0,Sem=0,Time=NO_ACTION,Program=,Cl=,User=,Action=,Table=
NO=1,Type=DIA,Pid=10903,Status=Wait,Cause=yes,Start=no,Rstr=0,Err=0,Sem=0,Time=NO_ACTION,Program=,Cl=,User=,Action=,Table=
NO=2,Type=DIA,Pid=10909,Status=Wait,Cause=yes,Start=no,Rstr=0,Err=0,Sem=0,Time=NO_ACTION,Program=,Cl=,User=,Action=,Table=
NO=3,Type=DIA,Pid=10916,Status=Wait,Cause=yes,Start=no,Rstr=0,Err=0,Sem=0,Time=NO_ACTION,Program=,Cl=,User=,Action=,Table=
NO=4,Type=DIA,Pid=10917,Status=Wait,Cause=yes,Start=no,Rstr=0,Err=0,Sem=0,Time=NO_ACTION,Program=,Cl=,User=,Action=,Table=
NO=5,Type=DIA,Pid=9061,Status=Wait,Cause=yes,Start=no,Rstr=1,Err=0,Sem=0,Time=NO_ACTION,Program=,Cl=,User=,Action=,Table=
I have written a script to do this, but it does not handle the columns that have empty values:
#!/bin/bash
sed 1,5d test.txt > temp.txt
input="temp.txt"
while IFS= read -r line
do
echo $line | awk 'BEGIN{FS=" ";OFS=","}{print "NO="$1,"Type="$2,"Pid="$3,"Status="$4,"Cause="$5,"Start="$6,"Rstr="$7,"Err="$8,"Sem="$9,"Time="$10,"Program="$11,"Cl="$12,"User="$13,"Action="$14,"Table="$15;}'
#echo "$line"
done < "$input"
Answer 0 (score: 1)
I am not experienced with awk, which could obviously make this task faster and shorter. It can, however, be done with a bash script, as follows:
if [ "$#" -ne "2" ]
then
echo "usage: <$0> input_file output_file"
exit 1
fi
#input table file
input_file=$1
output_file=$2
#Get name for a temporary file by mktemp
temp_file=`mktemp headings_XXXXXX`
#Store all headings separated by '\n' in a temporary file
sed -n '1p' $input_file | tr -s ' ' '\n' > $temp_file
headings=$(sed -n '1p' $input_file)
counter=0
#This loop would extract width of each column so that they can be given to cut as parameters
# like `cat filename | cut -b 3-8` would extract the entries in that column
while [ 1 ]
do
upper_limit=${#headings}
headings=${headings% [! ]*}
lower_limit=${#headings}
if [ "$upper_limit" = "$lower_limit" ]
then
limits_for_cut[$counter]=$(echo "1-${upper_limit}")
counter=$( expr $counter + 1 )
break
fi
lower_limit=$( expr $lower_limit + 1 )
limits_for_cut[$counter]=$(echo "${lower_limit}-${upper_limit}")
counter=$( expr $counter + 1 )
done
end_index=$( expr $counter - 1 )
no_of_lines=$( cat $input_file | wc -l )
no_of_lines=$( expr $no_of_lines - 2 ) #first 2 lines in file are for headings and dashes
on_line=$no_of_lines
#This loop will output all data to the specified file as comma separated
while [ $on_line -ne 0 ]
do
counter=$end_index
cat $temp_file |
while read heading
do
tmp=$( expr $no_of_lines - $on_line + 1 + 2 )
echo "${heading}=`sed -n "${tmp}p" $input_file | cut -b ${limits_for_cut[$counter]} | sed 's/ //g'`," >> $output_file
if [ $counter -eq 0 ]
then
break
fi
counter=$( expr $counter - 1 )
done
on_line=$( expr $on_line - 1 )
done
echo `cat $output_file | tr -d '\n'` > $output_file
rm $temp_file
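Basically, we use the cut command. Each header occupies a fixed byte range on the line, so the values under it can be extracted by cutting that same range. For example, the "Type" header lies between bytes 3 and 8, so that column can be extracted simply with:
cut -b 3-8 filename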
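As a minimal sketch of that idea, the snippet below runs cut -b on a single fixed-width line. The sample row, the helper name extract_col, and the column positions (bytes 1-3 for No, 4-8 for Type, 9-14 for Pid) are assumptions made up for illustration; they are not the layout of the real table.

```shell
#!/bin/bash
# Hypothetical fixed-width sample; the header it would sit under is:
#   No Type Pid   Status
row='0  DIA  10897 Wait'

# extract_col RANGE LINE: cut the given byte range out of LINE,
# then strip the padding spaces (same idea as the sed 's/ //g' above).
extract_col() {
    printf '%s\n' "$2" | cut -b "$1" | tr -d ' '
}

extract_col 4-8 "$row"    # Type column, prints DIA
extract_col 9-14 "$row"   # Pid column, prints 10897
```

Because the range is taken from the header rather than from the data, an empty cell simply yields an empty string instead of shifting the following values to the left.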
I ran this on OS X. You may need to adjust the cut and sed syntax to suit your machine.
If this solution works for you, you should try the same approach with awk, since that would make it faster and shorter.
Answer 1 (score: 1)
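With awk, you can achieve this easily by computing the width of each field (using the first two lines) and then extracting the corresponding substrings from each data line with substr.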
Here is a proposal that should meet your needs. It parses the header from the input, so it can only process one file at a time:
# FIELD array to store start/len for each field
# --- Functions from https://stackoverflow.com/a/27158086/5868851
function ltrim(s) { sub(/^[ \t\r\n]+/, "", s); return s }
function rtrim(s) { sub(/[ \t\r\n]+$/, "", s); return s }
function trim(s) { return rtrim(ltrim(s)); }
# --- Header parsing BEGIN
NR == 1 {
for (i = 1; i < NF; ++i) {
field_len = index($0,$(i+1)) - 1 - total
FIELD[i, "start"] = total
FIELD[i, "len"] = field_len
FIELD[i, "name"] = $i
total += field_len
}
last_field = $NF
}
NR == 2 {
# Last field is of len length($0) - total
FIELD[i, "start"] = total
FIELD[i, "len"] = length($0) - total
FIELD[i, "name"] = last_field
FIELD_N = i
}
# --- Header parsing END
# --- Data parsing BEGIN
NR > 2 {
sep=","
for(i = 1; i <= FIELD_N; ++i) {
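# substr() is 1-indexed, so shift the 0-based start offset by one
value = trim(substr($0, FIELD[i, "start"] + 1, FIELD[i, "len"]))
# Compare against "" explicitly so the intent (empty cell) is unambiguous
if (value == "")
value="null"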
if (i == FIELD_N)
sep="\n"
printf("%s=%s%s", FIELD[i, "name"], value, sep);
}
}
# --- Data parsing END
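To try the approach end to end, here is a self-contained sketch wrapped in a shell function. It is a condensed variant rather than the script above: each column is assumed to start where its header name starts and to run until the next header begins. The function name to_kv and the sample table (including its column alignment) are hypothetical and exist only for illustration.

```shell
#!/bin/bash
# to_kv: read a fixed-width table on stdin (line 1 = header, line 2 = dashes)
# and print each data row as comma-separated name=value pairs, with "null"
# for empty cells. Hypothetical helper name, for illustration only.
to_kv() {
    awk '
    NR == 1 {
        n = NF
        pos = 1
        for (i = 1; i <= n; i++) {
            name[i] = $i
            # Search for the name from the previous column onward so that a
            # repeated substring earlier in the header cannot match.
            start[i] = pos + index(substr($0, pos), $i) - 1
            pos = start[i] + length($i)
        }
        next
    }
    NR == 2 { next }                      # skip the dashed separator line
    {
        out = ""
        for (i = 1; i <= n; i++) {
            # A column runs up to the start of the next one; the last column
            # runs to the end of the line.
            if (i < n)
                v = substr($0, start[i], start[i + 1] - start[i])
            else
                v = substr($0, start[i])
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim padding
            if (v == "") v = "null"
            out = out (i > 1 ? "," : "") name[i] "=" v
        }
        print out
    }'
}

# Made-up sample: the Cause column is empty in both rows.
printf '%s\n' \
    'No Type Pid   Status Cause Start' \
    '--------------------------------' \
    '0  DIA  10897 Wait         yes' \
    '1  DIA  10903 Wait         yes' | to_kv
```

This relies on the data being left-aligned under the header names; if a value can start to the left of its header, the column boundaries would need to be derived differently.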