File name based on an array

Time: 2016-09-19 19:46:38

Tags: arrays variables unix awk

I have created an array:

declare -A months=( ["JAN"]="AP01" ["FEB"]="AP02" ["MAR"]="AP03" ["APR"]="AP04" ["MAY"]="AP05" ["JUN"]="AP06" ["JUL"]="AP07" ["AUG"]="AP08" ["SEP"]="AP09" ["OCT"]="AP10" ["NOV"]="AP11" ["DEC"]="AP12")

Now I want to look up the replacement value for the month while splitting a file and building the new file names:

awk -F, '{print "a~ST_SAP_FILE~Actual~",echo ${months["${"$3":0:3}"]}","~RM.txt"}' ExtractOriginal.txt

The field where the substitution should happen is column 3. It contains MAR-2016, and I am expecting a file named a~ST_SAP_FILE~Actual~MAR~RM.txt. Instead, I get an error:

awk: syntax error near line 1
awk: illegal statement near line 1
awk: syntax error near line 1
awk: bailing out near line 1

What is the correct syntax to take column 3, pass it to my array, get back the substitution value, and use it as the file name?

1 answer:

Answer 0: (score: 0)

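The original command fails because the awk program sits inside single quotes, so bash never expands ${months[...]}; awk receives that text literally and cannot parse it. You can solve the problem in a few ways. Which one you choose mostly depends on how tied to awk you want to be.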

Declaring the array in awk

Is there a reason you are not declaring the variable in awk?

awk -F, 'BEGIN{months["JAN"]="AP01"; months["FEB"]="AP02"; months["MAR"]="AP03"; months["APR"]="AP04"; months["MAY"]="AP05"; months["JUN"]="AP06"; months["JUL"]="AP07"; months["AUG"]="AP08"; months["SEP"]="AP09"; months["OCT"]="AP10"; months["NOV"]="AP11"; months["DEC"]="AP12"}{print "a~ST_SAP_FILE~Actual~"months[substr($3,1,3)]"~RM.txt"}' ExtractOriginal.txt

(Also note that I removed the commas from print: they would insert spaces into the result, which your question suggests you do not want.)
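To make the point about commas concrete, here is a minimal sketch comparing the two forms of print. The sample row and the shortened file-name pieces are made up for illustration.

```shell
# Sample row; the third field mimics the MAR-2016 value from the question.
row='x,y,MAR-2016'

# Commas make print join its arguments with OFS (a single space by default).
with_commas=$(printf '%s\n' "$row" | awk -F, '{print "Actual~", substr($3,1,3), "~RM.txt"}')

# Expressions written side by side are concatenated with nothing in between.
concatenated=$(printf '%s\n' "$row" | awk -F, '{print "Actual~" substr($3,1,3) "~RM.txt"}')

printf '%s\n' "$with_commas"    # Actual~ MAR ~RM.txt
printf '%s\n' "$concatenated"   # Actual~MAR~RM.txt
```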

As @Ed Morton pointed out, because the values follow a regular pattern, we can simplify building the array with split / sprintf, which gives you:

awk -F, 'BEGIN{split("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC",t," "); for (i in t) months[t[i]]=sprintf("AP%02d",i)}{print "a~ST_SAP_FILE~Actual~"months[substr($3,1,3)]"~RM.txt"}' ExtractOriginal.txt
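The commands above only print the file names. If the goal is to actually split the rows into those files, awk's output redirection can do it. The following is only a sketch under assumptions: the three sample rows are invented, each whole row is appended to the file for its month, and it runs in a scratch directory created with mktemp.

```shell
# Work in a scratch directory so the sketch leaves nothing behind elsewhere.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input; only the third column matters here.
printf '%s\n' '100,foo,MAR-2016' '200,bar,APR-2016' '300,baz,MAR-2016' > ExtractOriginal.txt

awk -F, '
BEGIN {
  n = split("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC", t, " ")
  for (i = 1; i <= n; i++) months[t[i]] = sprintf("AP%02d", i)
}
{
  fname = "a~ST_SAP_FILE~Actual~" months[substr($3, 1, 3)] "~RM.txt"
  print >> fname     # append the whole row to the file for its month
  close(fname)       # keep the number of open files small
}' ExtractOriginal.txt

ls "$workdir"
```

Because the rows are appended with >>, running this twice in the same directory would duplicate them; remove the old output files first in real use.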

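Passing the variable to awk

This seems closest to what you were trying to do. It keeps the array available in bash but lets awk work out the required file name. Since there is no native way to hand a bash array to awk, the awk array has to be built from the bash one (which is harder because this is an associative array).

I do this by first turning the bash array into a string that is easier to parse, then passing that string to awk as a variable: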

# Declare the array
declare -A months=( ["JAN"]="AP01" ["FEB"]="AP02" ["MAR"]="AP03" ["APR"]="AP04" ["MAY"]="AP05" ["JUN"]="AP06" ["JUL"]="AP07" ["AUG"]="AP08" ["SEP"]="AP09" ["OCT"]="AP10" ["NOV"]="AP11" ["DEC"]="AP12")

# Flatten the array into a string that awk can parse easily.
# Each element has the form MON=APnn; elements are separated by spaces
# (some awk implementations reject newlines in a -v value).
mon=$(for key in "${!months[@]}"; do printf '%s=%s ' "$key" "${months[$key]}"; done)

# See the explanation below
awk -F, -v mon="$mon" 'BEGIN {split(mon,tmp," "); for(m in tmp){i = index(tmp[m], "="); months[substr(tmp[m], 1, i-1)] = substr(tmp[m], i+1)}} {print "a~ST_SAP_FILE~Actual~"months[substr($3,1,3)]"~RM.txt"}' ExtractOriginal.txt

Below is a more readable version of the awk script. Note that -v mon="$mon" passes the bash variable mon to awk as an awk variable that is also named mon:

BEGIN {
  split(mon, tmp, " ")              # split the string mon into an array named tmp
  for (m in tmp) {                  # for each element of tmp
    i = index(tmp[m], "=")          # position of the '='
    # The key is the text before the '='; the value is the text after it.
    # Both go into an associative array called months.
    months[substr(tmp[m], 1, i-1)] = substr(tmp[m], i+1)
  }
}
{
  print "a~ST_SAP_FILE~Actual~" months[substr($3,1,3)] "~RM.txt"
}
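To check the parsing step in isolation, the script can be fed a hand-written string and a single row. This is a small sketch: the three-month value of mon and the sample row are made up, and mon simply has the same KEY=VALUE, whitespace-separated shape as the real string.

```shell
# KEY=VALUE pairs separated by whitespace, in the same shape as the mon string
# (shortened to three months for the example).
mon='JAN=AP01 FEB=AP02 MAR=AP03'

result=$(printf '%s\n' 'x,y,MAR-2016' | awk -F, -v mon="$mon" '
BEGIN {
  n = split(mon, tmp, " ")
  for (m = 1; m <= n; m++) {
    i = index(tmp[m], "=")
    months[substr(tmp[m], 1, i - 1)] = substr(tmp[m], i + 1)
  }
}
{ print "a~ST_SAP_FILE~Actual~" months[substr($3, 1, 3)] "~RM.txt" }')

printf '%s\n' "$result"   # a~ST_SAP_FILE~Actual~AP03~RM.txt
```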

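Skipping awk entirely

Another option is not to use awk at all, which removes the burden of getting the array into a usable form. Your question does not make clear whether this is an acceptable solution, but personally I find this bash version easier to write, read, and understand.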

#!/usr/bin/env bash

filename="ExtractOriginal.txt"

declare -A months=( ["JAN"]="AP01" ["FEB"]="AP02" ["MAR"]="AP03" ["APR"]="AP04" ["MAY"]="AP05" ["JUN"]="AP06" ["JUL"]="AP07" ["AUG"]="AP08" ["SEP"]="AP09" ["OCT"]="AP10" ["NOV"]="AP11" ["DEC"]="AP12")

while IFS= read -r line; do                                # for each line in the file
    month_yr=$(printf '%s\n' "$line" | cut -d',' -f3)      # get the third column
    month=${months[${month_yr:0:3}]}                       # look up its first 3 characters
    echo "a~ST_SAP_FILE~Actual~${month}~RM.txt"
done <"$filename"
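The loop above starts a cut process for every line. One possible variation, sketched here under assumptions, lets read split the line on commas itself. The two sample rows, the shortened months array, and the names array that collects the results are all made up for illustration; bash 4 or later is assumed for associative arrays.

```shell
#!/usr/bin/env bash
# Requires bash 4+ for associative arrays.
declare -A months=( [JAN]=AP01 [FEB]=AP02 [MAR]=AP03 )   # shortened for the example

names=()
# IFS=, makes read split on commas; the fields we do not need go into _.
while IFS=, read -r _ _ month_yr _; do
    names+=( "a~ST_SAP_FILE~Actual~${months[${month_yr:0:3}]}~RM.txt" )
done <<'EOF'
100,foo,MAR-2016
200,bar,JAN-2016
EOF

printf '%s\n' "${names[@]}"
```

For a real file, replace the here-document with done <"$filename" as in the script above.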