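Basically, the file I get has the first three columns pasted, followed by a blank column, because apparently nothing was ever appended to column4.txt.
I suspect I shouldn't be using the variables I created inside the command substitutions, but I'm not sure how else to get at the numbers I need.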
#!/bin/sh
# the first file in the expression of a bunch of patients to be made into data files that can be put into the graph
awk '{print "hs"$1,"\t",$2,"\t",$3}' $1 > temp1.txt #important columns saved
numLines=`wc -l $1`
touch column4.txt #creates a column for the average of column 6-
for ((s=0;s<$numlines;s++)); do
currentRow=0 #Will eventually be the average of column 6- for the row of focus
for ((i=6;i<=106;i++)); do
addition=`cut -f $i $1 | head -n $s | tail -n 1` # cuts out the number at the row and column of focus for this loop
currentRow=`expr $currentRow + $addition` # adding the newly extracted number to the total
done
currentRow=`expr $currentRow / 101` #divides so the number is an average instead of a really big number
echo $currentRow >> column4.txt #appends this current row into a text file that can be pasted onto the first three columns
done
paste temp1.txt column4.txt
rm temp1.txt column4.txt
In case it helps: the input file is very large (about 106 columns and tens of thousands of rows), but here is a sample.
Important identifier line grant regis 76 83 02 38 0 38 29 38 48 (..up to 106 columns)
another important identifier bill susan 98 389 20 29 38 20 94 29 0 (.. same point)
The output would then look like this (assuming we exclude the columns after the ...):
Important identifier line 34.88
another important identifier 79.67
Sorry if anything is unclear. I'll do my best to clarify; just ask if there is anything you want to know and I'll edit or comment.
Thank you
Answer 0 (score: 0)
awk to the rescue!
You can replace your whole script with this; with the values from the sample input:
$ awk '{for(i=6;i<=NF;i++) sum+=$i;
printf "%s %s %s %.2f\n", $1,$2,$3, sum/(NF-5);
sum=0}' file
Important identifier line 39.11
another important identifier 79.67
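As for why the original script left the fourth column empty, two details look like likely culprits (this is a reading of the posted script, not something checked against your data): the loop tests `$numlines` while the assignment sets `numLines`, and `wc -l $1` prints the file name after the count. On top of that, `for ((...))` is a bash construct rather than POSIX sh. A minimal sketch, using a made-up two-line file in place of the real input:

```shell
# Hypothetical two-line sample standing in for the real data file.
sample=$(mktemp)
printf 'x 1\ny 2\n' > "$sample"

with_name=$(wc -l "$sample")   # count AND file name, e.g. "2 /tmp/tmp.abc"
count=$(wc -l < "$sample")     # reading from stdin yields only the count
count=$((count + 0))           # drop the padding some wc implementations add

numLines=$count
# Shell variable names are case-sensitive: numlines was never assigned.
echo "numLines=[$numLines] numlines=[${numlines:-}]"

rm -f "$sample"
```

With an empty loop bound the loop body never runs, so nothing is ever appended to column4.txt, which would match the blank column you are seeing.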
For the median (with an odd number of fields) you can do this; note that asort is a GNU awk extension:
$ awk '{delete a; for(i=6;i<=NF;i++) a[i-5]=$i;
asort(a);
mid=(NF-4)/2; print mid, a[mid]}' file
5 38
5 29
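For an even number of fields, the usual approach is to take the average of the two middle values (a distance-weighted average is also possible).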
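Here is a rough sketch that covers both the odd and the even case. It is not the snippet above: it swaps asort for a small insertion sort so that it should also run on awks other than gawk, and it resets its counter on every line so rows of different lengths do not interfere. The sample file and its contents are made up for illustration.

```shell
# Hypothetical sample file: five text columns followed by numeric columns.
sample=$(mktemp)
printf '%s\n' \
  'a b c d e 76 83 02 38 0 38 29 38 48' \
  'a b c d e 98 389 20 29' > "$sample"

medians=$(awk '{
    n = 0
    for (i = 6; i <= NF; i++) v[++n] = $i + 0   # force numeric values
    # insertion sort of v[1..n]; entries beyond n are never read
    for (i = 2; i <= n; i++) {
        x = v[i]
        for (j = i - 1; j >= 1 && v[j] > x; j--) v[j + 1] = v[j]
        v[j + 1] = x
    }
    if (n == 0)      med = "NA"                          # no numeric columns
    else if (n % 2)  med = v[(n + 1) / 2]                # odd: middle value
    else             med = (v[n / 2] + v[n / 2 + 1]) / 2 # even: mean of the two middle values
    print $1, $2, $3, med
}' "$sample")

echo "$medians"
rm -f "$sample"
```

The first row has nine values, so its median is the fifth sorted value (38); the second has four, so its median is the mean of 29 and 98 (63.5). Insertion sort is quadratic, but with about a hundred columns per row that should not matter much.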
Answer 1 (score: 0)
You can try something like the following:
perl -MList::Util=sum -lanE '@n=grep{/^\d+$/}@F; say "@F[0..4] ",sum(@n)/@n'
Prints:
Important identifier line grant regis 39.1111111111111
another important identifier bill susan 79.6666666666667
Or, with fixed precision:
perl -MList::Util=sum -lanE '@n=grep{/^\d+$/}@F; printf "@F[0..4] %.2f\n",sum(@n)/@n'
Important identifier line grant regis 39.11
another important identifier bill susan 79.67
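The above computes the average of all purely numeric fields on the line. To use exactly column 6 through the last column, you can use, e.g.: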
perl -MList::Util=sum -lanE 'say "@F[0..4] ",sum(@F[5..$#F])/(@F-5)'
which also prints
Important identifier line grant regis 39.1111111111111
another important identifier bill susan 79.6666666666667
To print both the average and the median (for an odd or an even number of elements):
perl -MList::Util=sum -lanE '
  @s = sort { $a <=> $b } @F[5..$#F];   # columns 6..last, sorted numerically
  $m = int(@s/2);
  printf "@F[0..4] %.2f %g\n",
    sum(@s)/@s,
    (@s % 2) ? $s[$m] : sum(@s[$m-1,$m])/2
' filename
Prints:
Important identifier line grant regis 39.11 38
another important identifier bill susan 79.67 29
Finally, the same as above, as a perl script with nicer variable names.
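use strict;
use warnings;
use List::Util qw(sum);
while(<>) {
    chomp;
    my(@text) = split;
    # everything from the sixth field on, keeping only plain integers, sorted numerically
    my(@sorted_numbers) = sort { $a <=> $b } grep { /^\d+$/ } splice @text, 5;
    next unless @sorted_numbers;    # skip lines without numeric fields
    my $average = sum(@sorted_numbers)/@sorted_numbers;
    my $median;
    my $mid = int(@sorted_numbers / 2);
    if( @sorted_numbers % 2) {
        $median = $sorted_numbers[$mid];
    } else {
        $median = sum(@sorted_numbers[$mid-1,$mid])/2;
    }
    # %g keeps a fractional median (e.g. 28.5) instead of truncating it
    printf "@text %.2f %g\n", $average, $median;
}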