Average of multiple files without considering different types of missing values

Time: 2015-11-09 02:43:56

Tags: shell awk

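I want to calculate the average of 15 files: ifile1.txt, ifile2.txt, ..., ifile15.txt. Every file has the same number of columns and rows, but the files contain different types of missing values (e.g. ?, -9999 and 8888). Part of the data looks like this: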

 ifile1.txt             ifile2.txt               ifile3.txt
 2  8888   ?     ? .    1  2     1     3    .    5  ?  ?  ? .
 1  -9999  8888  ? .    1  8888  8888  8888 .    5  ?  ?  ? .
 4  6      5     2 .    2  5     5     1    .    3  4  3  1 .
 5  5      7     1 .    0  0     1     1    .    4  3  4  0 .
 .  .      .     . .    .  .     .     .    .    .  .  .  . .

I would like to produce a new file that shows the average of these 15 files, ignoring the missing values.

 ofile.txt
 2.66     2        1         3      . (i.e. average of 2 1 5, average of 8888 2 ?, and so on)
 2.33     -9999    -9999    -9999   .
 3        5        4.33      1.33   .
 3        2.66     4         0.66   .
 .        .        .         .      .

This question is similar to my earlier question, "Average of multiple files without considering missing values".

My own attempt did not give the desired result.

1 Answer:

Answer 0: (score: 2)

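awk '
{
    # Accumulate per-cell sums and counts, keyed by (row, column),
    # skipping the missing-value markers ?, -9999 and 8888.
    for (i=1; i<=NF; i++) {
        if ($i !~ /^([?]|-9999|8888)$/) {
            Count[FNR,i]++
            Sum[FNR,i]+=$i
        }
    }
}
END {
    # FNR and NF keep their values from the last line read, i.e. the row
    # and column counts (all files have the same shape).
    # Cells with no valid value in any file are printed as -9999.
    for (i=1; i<=FNR; i++) {
        for (j=1; j<=NF; j++)
            printf "%12.2f ", Count[i,j]!=0 ? Sum[i,j]/Count[i,j] : -9999
        print ""
    }
}
' ifile*.txt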

This produces:

    2.67         2.00         1.00         3.00 
    2.33     -9999.00     -9999.00     -9999.00 
    3.00         5.00         4.33         1.33 
    3.00         2.67         4.00         0.67
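
If the set of missing-value markers is likely to change, one possible variation is to pass them in as an awk variable instead of hard-coding them in a regex. The following is only a sketch under that assumption: the function name `avg_ignore_missing` and the variables `missing` and `fill` are illustrative names, not part of the original answer. It also tracks the largest row and column counts it sees, so it does not rely on the values `FNR` and `NF` happen to hold in the `END` block.

```shell
#!/bin/sh
# Sketch: cell-wise mean across files, skipping every token listed in `missing`.
# avg_ignore_missing, missing and fill are hypothetical names chosen for this example.
avg_ignore_missing() {
    awk -v missing='? -9999 8888' -v fill='-9999' '
    BEGIN {
        # Build a lookup table of missing-value markers (exact string match).
        n = split(missing, m, " ")
        for (k = 1; k <= n; k++) skip[m[k]] = 1
    }
    {
        if (FNR > rows) rows = FNR      # largest row count seen in any file
        if (NF > cols)  cols = NF       # largest column count seen in any row
        for (i = 1; i <= NF; i++)
            if (!($i in skip)) {
                count[FNR, i]++
                sum[FNR, i] += $i
            }
    }
    END {
        for (r = 1; r <= rows; r++) {
            for (c = 1; c <= cols; c++) {
                # A cell with no valid value in any file gets the fill value.
                if ((r, c) in count) v = sum[r, c] / count[r, c]
                else v = fill
                printf "%12.2f ", v
            }
            print ""
        }
    }' "$@"
}
```

It can be called as `avg_ignore_missing ifile*.txt > ofile.txt`. Matching markers as exact strings through an array lookup avoids having to escape characters such as `?` in a regex, at the cost of treating `-9999.0` and `-9999` as different tokens.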