awk: create a new variable based on conditions on other columns

Time: 2018-03-28 14:41:21

Tags: bash if-statement awk multiple-columns

I have a file input.txt (~100'000 lines) with the following structure:

Z0 Z1 Z2
0.9746 0.0254 0.0000
0.0032 0.0000 0.9433
0.2464 0.5603 0.9008
0.4034 0.4982 0.0069
0.0072 0.9996 0.0472
... ... ...

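I would like to create a new column called SCORE, written to a new file output.txt, based on the following conditions:

  • SCORE = 1 if: 0.17 ≤ Z0 ≤ 0.33 and 0.40 ≤ Z1 ≤ 0.60
  • SCORE = 2 if: 0.40 ≤ Z0 ≤ 0.60 and 0.40 ≤ Z1 ≤ 0.60
  • SCORE = 3 if: Z0 ≤ 0.05 and Z1 ≥ 0.95 and Z2 ≤ 0.05
  • SCORE = 4 if: Z0 ≤ 0.05 and Z1 ≤ 0.05 and Z2 ≥ 0.95
  • SCORE = 5 if none of the other 4 conditions applies.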

The new file should look like this:

Z0 Z1 Z2 SCORE
0.9746 0.0254 0.0000 5
0.0032 0.0000 0.9433 5
0.2464 0.5603 0.9008 1
0.4034 0.4982 0.0069 2
0.0072 0.9996 0.0472 3
... ... ...

Here is my attempt:

awk 'NR==1{$4="SCORE";print;next} \
  0.17<=$1 && $1<=0.33 && 0.40<=$2 && $2<=0.60 {$4="1"} \
  0.40<=$1 && $1<=0.60 && 0.40<=$2 && $2<=0.60 {$4="2"} \
  $1<=0.05 && $2>=0.95 && $3<=0.05 {$4="3"} \
  $1<=0.05 && $2<=0.05 && $3>=0.95 {$4="4"} \
  *other* 1' input.txt > output.txt

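However, something is wrong with the first 5 lines of the command, and I don't know how to write the final condition (score 5) on the last line.

3 answers:

Answer 0: (score: 1)

Something like this?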

NR==1{$4="SCORE";print;next}
0.17<$1  && $1<0.33  && 0.40<$2  && $2<0.60 {print $0, "1";next}
0.40<$1  && $1<0.60  && 0.40<$2  && $2<0.60 {print $0, "2";next}
$1<=0.05 && $2>=0.95 && $3<=0.05            {print $0, "3";next}
$1<=0.05 && $2<=0.05 && $3>=0.95            {print $0, "4";next}
{print $0, "5"}

(Row 2 of your expected output may be wrong, because 0.9433 is less than 0.95.)
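
The question states every bound as inclusive (≤), while the first two rules above use a strict <. The following is a minimal sketch of the inclusive variant, under the assumption that the input is whitespace-separated with one header row. The function name score_rows is hypothetical and only used to make the example easy to call.

```shell
# score_rows is a hypothetical helper name, not part of the original answer.
# It reads a whitespace-separated table with a header row on stdin and
# appends a SCORE column, using inclusive bounds for every rule.
score_rows() {
  awk '
    NR == 1 { print $0, "SCORE"; next }    # header row: add the column name
    0.17 <= $1 && $1 <= 0.33 && 0.40 <= $2 && $2 <= 0.60 { print $0, 1; next }
    0.40 <= $1 && $1 <= 0.60 && 0.40 <= $2 && $2 <= 0.60 { print $0, 2; next }
    $1 <= 0.05 && $2 >= 0.95 && $3 <= 0.05               { print $0, 3; next }
    $1 <= 0.05 && $2 <= 0.05 && $3 >= 0.95               { print $0, 4; next }
    { print $0, 5 }                        # fallback: no rule matched
  '
}
```

It could be used as `score_rows < input.txt > output.txt`. Because each rule ends with next, the first matching rule wins and the final pattern-less block only runs for rows that matched nothing.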

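Answer 1: (score: 1)

There is no need for the backslashes or for quoting the numbers. Also, the conditions you describe do not match your code (<= vs <).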

$ awk 'NR==1{print $0,"SCORE"; next} 
  {score=5}
  0.17<$1 && $1<0.33 && 0.40<$2 && $2<0.60 {score=1}
  0.40<$1 && $1<0.60 && 0.40<$2 && $2<0.60 {score=2}
  $1<=0.05 && $2>=0.95 && $3<=0.05         {score=3}
  $1<=0.05 && $2<=0.05 && $3>=0.95         {score=4}  
  {print $0,score}' file | column -t

Z0      Z1      Z2      SCORE
0.9746  0.0254  0.0000  5
0.0032  0.0000  0.9433  5
0.2464  0.5603  0.9008  1
0.4034  0.4982  0.0069  2
0.0072  0.9996  0.0472  3

The expected score for your second row is also wrong.

Alternatively, the shared conditions can be merged:

$ awk 'NR==1{print $0,"SCORE"; next} 
  {score=5}
  0.40<$2 && $2<0.60 {if(0.17<$1 && $1<0.33)   score=1;
                      if(0.40<$1 && $1<0.60)   score=2}
  $1<=0.05           {if($2>=0.95 && $3<=0.05) score=3;
                      if($2<=0.05 && $3>=0.95) score=4}  
  {print $0,score}' file | column -t

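Answer 2: (score: 0)

I believe the output you have shown is incorrect, so please check it once. Could you try the following and let me know whether it helps?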

awk '
# Return the score for one row. A chained comparison such as a <= x <= b
# is not a range test in awk, so each bound is tested separately.
function check(Z0,Z1,Z2){
  if(0.17 <= Z0 && Z0 <= 0.33 && 0.40 <= Z1 && Z1 <= 0.60)      { return 1 }
  else if(0.40 <= Z0 && Z0 <= 0.60 && 0.40 <= Z1 && Z1 <= 0.60) { return 2 }
  else if(Z0 <= 0.05 && Z1 >= 0.95 && Z2 <= 0.05)               { return 3 }
  else if(Z0 <= 0.05 && Z1 <= 0.05 && Z2 >= 0.95)               { return 4 }
  return 5}
FNR==1{
  print $0,"SCORE";
  next}
{
  print $0,check($1,$2,$3)
}
' input.txt | column -t
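
A range test in awk has to be spelled as two comparisons joined by &&; a chained form such as 0.17 <= Z0 <= 0.33 is not evaluated as a range. One way to keep the rules readable is a small helper function. This is only a sketch: the names in_range and score_with_ranges are hypothetical, and the input is assumed to be whitespace-separated with one header row.

```shell
# score_with_ranges and in_range are hypothetical names used for illustration.
score_with_ranges() {
  awk '
    # True when lo <= x and x <= hi, written as two separate comparisons.
    function in_range(x, lo, hi) { return lo <= x && x <= hi }

    NR == 1 { print $0, "SCORE"; next }    # header row: add the column name
    {
      score = 5                            # default when no rule matches
      if      (in_range($1, 0.17, 0.33) && in_range($2, 0.40, 0.60)) score = 1
      else if (in_range($1, 0.40, 0.60) && in_range($2, 0.40, 0.60)) score = 2
      else if ($1 <= 0.05 && $2 >= 0.95 && $3 <= 0.05)               score = 3
      else if ($1 <= 0.05 && $2 <= 0.05 && $3 >= 0.95)               score = 4
      print $0, score
    }
  '
}
```

For example, `score_with_ranges < input.txt | column -t` prints the aligned table, and redirecting to output.txt instead writes the new file the question asks for.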