I have a time series of temperature data:
ifile.txt
1921 25
1922 25.1
1923 24.2
1924 23.4
1925 24.4
1926 25.1
1927 23.6
1928 25.2
1929 23.9
1930 25.6
I want to compute the anomalies for the period 1923 to 1929.
My algorithm is:
1923 24.2 - (average of the temperatures during 1923-1929)
1924 23.4 - (average of the temperatures during 1923-1929)
1925 24.4 - (average of the temperatures during 1923-1929)
1926 25.1 - (average of the temperatures during 1923-1929)
1927 23.6 - (average of the temperatures during 1923-1929)
1928 25.2 - (average of the temperatures during 1923-1929)
1929 23.9 - (average of the temperatures during 1923-1929)
My script is:
mean=$(awk '{if ($1 >= 1923 && $1 <= 1929) sum += $2; count++} END {print count ? (sum/count) : count;sum=count=0}' ifile.txt)
awk '{if ($1 >= 1923 && $1 <= 1929) printf "%4i %5.2f\n", $1, $2-'$mean'}' ifile.txt > ofile.txt
It does not print the correct values. Could you check my script?
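For reference, the likely culprit in the script above is that `count++` sits outside the `if`, so it runs on every line: the sum of the seven in-range values (169.8) is divided by 10 rather than 7, giving 16.98 instead of roughly 24.257. A minimal corrected sketch follows; the scratch files created with `mktemp` and the use of `-v mean=...` are choices made for this illustration, not part of the original script.

```shell
ifile=$(mktemp)   # scratch copy of the data from the question (illustrative path)
ofile=$(mktemp)
printf '%s\n' '1921 25' '1922 25.1' '1923 24.2' '1924 23.4' '1925 24.4' \
    '1926 25.1' '1927 23.6' '1928 25.2' '1929 23.9' '1930 25.6' > "$ifile"

# The condition now guards both sum and count, so count ends at 7, not 10.
mean=$(awk '$1 >= 1923 && $1 <= 1929 { sum += $2; count++ }
            END { if (count) printf "%.10f\n", sum / count; else print 0 }' "$ifile")

# Pass the shell value in with -v instead of splicing it into the awk program text.
awk -v mean="$mean" '$1 >= 1923 && $1 <= 1929 { printf "%4d %5.2f\n", $1, $2 - mean }' \
    "$ifile" > "$ofile"
cat "$ofile"
```

Printing the mean with `%.10f` avoids the six-significant-digit rounding that a plain `print` would apply before the value is reused.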
Answer 0: (score: 1)
Another approach, assuming the years are sorted:
awk '/1923/,/1929/ {y[++c]=$1; t[c]=$2; sum+=$2}
END {avg=sum/c;
for(k=1;k<=c;k++) print y[k],t[k]-avg}' file
1923 -0.0571429
1924 -0.857143
1925 0.142857
1926 0.842857
1927 -0.657143
1928 0.942857
1929 -0.357143
You can adjust the print format as needed.
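One way to adjust the format, as a sketch: keep the same range pattern and arrays, and swap `print` for `printf` with the `%5.2f` width the question asked for. The wrapper name `format_anomalies` and the `mktemp` scratch file are assumptions made for this example.

```shell
# format_anomalies FILE: hypothetical wrapper around the array-based approach.
format_anomalies() {
    awk '/1923/,/1929/ { y[++c] = $1; t[c] = $2; sum += $2 }
         END { avg = sum / c
               for (k = 1; k <= c; k++) printf "%4d %5.2f\n", y[k], t[k] - avg }' "$1"
}

data=$(mktemp)   # scratch copy of the sample data
printf '%s\n' '1921 25' '1922 25.1' '1923 24.2' '1924 23.4' '1925 24.4' \
    '1926 25.1' '1927 23.6' '1928 25.2' '1929 23.9' '1930 25.6' > "$data"

format_anomalies "$data"
```

With the sample data the first line comes out as `1923 -0.06`.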
However, this can be simplified further by scanning the file twice:
$ awk '/1923/,/1929/{if (NR==FNR) {sum+=$2; c++; avg=sum/c}
else print $1,$2-avg}' file{,}
Answer 1: (score: 1)
@Kay: @try: Karakfa's solution is good. This one can serve as an alternative, and it does not use any arrays.
awk 'FNR==NR{f=1;if($1 >= 1923 && $1 <= 1929){count++;SUM+=$2;};next} FNR==1 && f==1{AVG=SUM/count;next} ($1 >= 1923 && $1 <= 1929){print $1, $2-AVG}' Input_file Input_file
EDIT1: Adding a non-one-liner form of the solution.
awk 'FNR==NR{
f=1;
if($1 >= 1923 && $1 <= 1929){
count++;
SUM+=$2;
};
next
}
FNR==1 && f==1{
AVG=SUM/count;
next
}
($1 >= 1923 && $1 <= 1929){
print $1, $2-AVG
}
' Input_file Input_file
EDIT2: Adding an explanation of the solution as well. The following is for explanation purposes only; run the code above.
awk 'FNR==NR{ ## TRUE only while the first copy of Input_file is being read: FNR resets to 1 for each new input file, while NR keeps counting across all of them.
f=1; ## set the flag f to 1 to record that the first pass has started.
if($1 >= 1923 && $1 <= 1929){ ## if the year in $1 is between 1923 and 1929 inclusive, do the following.
count++; ## count the matching lines.
SUM+=$2; ## add $2 to the running total SUM over all matching lines.
};
next ## next skips the remaining rules for the current line and moves on to the next line.
}
FNR==1 && f==1{ ## TRUE on the first line of the second read of Input_file (the first pass has finished, so f is 1).
AVG=SUM/count; ## compute the average as SUM divided by count.
next ## skip the remaining rule for this line.
}
($1 >= 1923 && $1 <= 1929){ ## second pass: if the year in $1 is between 1923 and 1929 inclusive, do the following.
print $1, $2-AVG ## print the year and its anomaly, $2 minus AVG.
}
' Input_file Input_file ## Input_file is given twice so that awk reads it twice.
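One caveat worth noting: the `FNR==1 && f==1{...; next}` rule never prints line 1 of the file, which is harmless here only because 1921 lies outside the range, and `SUM/count` would fail with a division by zero if no year matched. A minimal sketch that avoids both, still without arrays, is shown below; the helper name `anomalies` and its argument order are assumptions for this example.

```shell
# anomalies START END FILE: hypothetical helper, not part of the answer above.
anomalies() {
    awk -v s="$1" -v e="$2" '
        $1 < s || $1 > e { next }                 # skip out-of-range rows in both passes
        NR == FNR { sum += $2; n++; next }        # first pass: accumulate sum and count
        { printf "%d %.2f\n", $1, $2 - sum / n }  # second pass: n is at least 1 here
    ' "$3" "$3"
}

data=$(mktemp)   # scratch copy of the sample data
printf '%s\n' '1921 25' '1922 25.1' '1923 24.2' '1924 23.4' '1925 24.4' \
    '1926 25.1' '1927 23.6' '1928 25.2' '1929 23.9' '1930 25.6' > "$data"

anomalies 1921 1923 "$data"   # the range starts on line 1 of the file
anomalies 1800 1810 "$data"   # no matching rows: prints nothing
```

Because the division only happens on an in-range row of the second pass, `n` cannot be zero at that point.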
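Answer 2: (score: 1)
You can do this by reading the same file twice: the first read computes the mean, and the second computes the anomalies. Reading the file twice may be slower, but the memory overhead is practically zero, so you will not run into errors such as out of memory, because no arrays are used here.
One-liner: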
awk -v s="1923" -v e="1929" '{f=$1>=s && $1<=e}f && NR==FNR{sum+=$2; c++; next}f{ print $0, $2-(sum/c) }' file file
Explanation:
awk -v s="1923" -v e="1929" ' # call awk set var s and e
# where s is starting year
# e is ending year
{
f=$1>=s && $1<=e # f holds boolean status whether data is within a range
}
f && NR==FNR{ # if data is within a range
# and we are reading file first time (FNR==NR is true only when awk reads first file), then
sum+=$2; # sum column2 value
c++; # increment counter
next # stop processing go to next line (skipping any code below this line)
}
# Here we read same file second time
f{ # again are we within a range ( f holds boolean status true or false, if true then )
print $0, $2-(sum/c) # print current record/line/row, 2nd field minus average
}' file file
Input:
$ cat file
1921 25
1922 25.1
1923 24.2
1924 23.4
1925 24.4
1926 25.1
1927 23.6
1928 25.2
1929 23.9
1930 25.6
Output:
$ awk -v s="1923" -v e="1929" '{f=$1>=s && $1<=e}f && NR==FNR{sum+=$2; c++; next}f{ print $0, $2-(sum/c) }' file file
1923 24.2 -0.0571429
1924 23.4 -0.857143
1925 24.4 0.142857
1926 25.1 0.842857
1927 23.6 -0.657143
1928 25.2 0.942857
1929 23.9 -0.357143
Answer 3: (score: 1)
Yet another option:
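public function store($location){
    // ZipArchive::open() returns true on success and an error code otherwise, so compare strictly.
    if($this->zip->open($location, file_exists($location) ? ZipArchive::OVERWRITE : ZipArchive::CREATE) === true){
        foreach($this->files as $file){
            $this->count++;
            $this->image_name = "OrderImg".$this->count.".png";
            // Strip the data-URI prefix, then restore "+" characters that were turned into spaces.
            $this->set = str_replace('data:image/png;base64,', '', $file);
            $this->set = str_replace(' ', '+', $this->set);
            // addFromString() stores raw bytes under the given name; addFile() expects a path on disk.
            $this->zip->addFromString($this->image_name, base64_decode($this->set));
        }
        $this->zip->close();
    }
}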
The for loop does not guarantee the order of the calls, but if that matters you can simply expand it into a traditional for loop.