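I wrote a bash script that computes the mean and median of each column of an input file. The input file looks like the sample below; the numbers are separated by tabs.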
1 2 3
3 2 8
3 4 2
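My approach is to first transpose the matrix so that rows become columns and vice versa. The transposed matrix is stored in a temporary text file. I then compute the mean and median of each row of that file. However, the script gives me wrong output. First, the arrays holding the per-column means and medians each produce only a single output. Second, the computed median is incorrect.
After some code inspection and testing, I found that although the transposed matrix is indeed written to the text file, the script does not read it back correctly. Specifically, only one number is read per line. Here is my script.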
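To make the transposition step concrete, here is a minimal sketch of it in isolation. It is not the code from my script: `transpose` is a hypothetical helper name, and it assumes the input is whitespace-separated (tabs or spaces) and small enough to hold in memory inside a single `awk` call.

```shell
#!/usr/bin/env bash
# transpose: hypothetical helper (not part of the original script).
# Reads a whitespace-separated matrix from the given file, or from stdin
# when no file is given, and prints one original column per output line.
transpose() {
    awk '
    {
        for (i = 1; i <= NF; i++) cell[NR, i] = $i   # remember every field
        if (NF > ncols) ncols = NF                    # track the widest row
    }
    END {
        for (i = 1; i <= ncols; i++) {
            line = cell[1, i]
            for (r = 2; r <= NR; r++) line = line " " cell[r, i]
            print line                                # column i becomes a row
        }
    }' "$@"
}
```

Something like `transpose "$filename" > tmp.txt` would then produce the temporary file in one pass, with the values on each line separated by single spaces.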
#if column is chosen instead
if [[ $initial == "-c" ]]
then
    echo "Calculating column stats"

    #transpose columns to row to make life easier
    WORD=$(head -n 1 $filename | wc -w); #counts the number of columns
    for((index=1; index<=$WORD; index++)) #loop it over the number of columns
    do
        awk '{print $'$index'}' $filename | tr '\n' ' ';echo; #compact way of performing a row-col transposition
        #prints the column as determined by $index, and then translates new-line with a tab
    done > tmp.txt

    array=()
    averageArray=()
    medianArray=()
    sortedArray=()

    #calculate average and median, just like the one used for rows
    while read -a cols
    do
        total=0
        sum=0

        for number in "${cols[@]}" #for every item in the transposed column
        do
            (( sum += $number )) #the total sum of the numbers in the column
            (( total++ )) #the number of items in the column
            array+=( $number )
        done

        sortedArray=( $( printf "%s\n" "${array[@]}" | sort -n) )
        arrayLength=${#sortedArray[@]}
        #echo sorted array is $sortedArray
        #based on array length, construct the median array
        if [[ $(( arrayLength % 2 )) -eq 0 ]]
        then #even
            upper=$(( arrayLength / 2 ))
            lower=$(( (arrayLength/2) - 1 ))
            median=$(( (${sortedArray[lower]} + ${sortedArray[upper]}) / 2 ))
            #echo median is $median
            medianArray+=$index
        else #odd
            middle=$(( (arrayLength) / 2 ))
            median=${sortedArray[middle]}
            #echo median is $median
            medianArray+=$index
        fi
        averageArray+=( $((sum/total)) ) #the final row array of averages that is displayed

    done < tmp.txt
fi
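For comparison, the per-row computation I am aiming for could be sketched as below. This is an untested-in-context sketch under stated assumptions, not a confirmed fix: `row_stats` is a hypothetical name, the values are assumed to be integers, and the mean is truncated by bash integer division. It differs from the loop above in three ways: the values are sorted from the current row only (nothing carries over between rows), the median is appended with the parenthesised `+=( ... )` form so each row adds a new array element, and `IFS` is pinned locally in case it was changed earlier in the script.

```shell
#!/usr/bin/env bash
# row_stats: hypothetical helper (not part of the original script).
# Reads whitespace-separated integers from stdin, one row per line, and
# fills two global arrays with one entry per row:
#   averageArray - truncated integer mean
#   medianArray  - median (integer mean of the two middle values if even)
row_stats() {
    averageArray=()
    medianArray=()
    local IFS=$' \t\n'            # assumption: guard against a modified IFS
    local -a cols sorted
    local number sum count mid median
    # the "|| (( ... ))" part keeps a final line that lacks a trailing newline
    while read -r -a cols || (( ${#cols[@]} )); do
        (( ${#cols[@]} )) || continue          # skip blank lines
        sum=0
        for number in "${cols[@]}"; do
            sum=$(( sum + number ))
        done
        count=${#cols[@]}
        # sort only this row's values; sorted is rebuilt on every iteration
        sorted=( $(printf '%s\n' "${cols[@]}" | sort -n) )
        mid=$(( count / 2 ))
        if (( count % 2 == 0 )); then
            median=$(( (sorted[mid - 1] + sorted[mid]) / 2 ))
        else
            median=${sorted[mid]}
        fi
        averageArray+=( $(( sum / count )) )   # parentheses append an element
        medianArray+=( "$median" )
    done
}
```

It would be called as `row_stats < tmp.txt`, after which `"${averageArray[@]}"` and `"${medianArray[@]}"` hold the results. For the sample input at the top, the transposed rows are `1 3 3`, `2 2 4` and `3 8 2`, so the averages should come out as 2, 2, 4 and the medians as 3, 2, 3.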
Thanks for your help.