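I want to read the first field and generate a sequence from it based on the "&-" and "&&-" delimiters, then read the first column again and fill each empty value down with the previous non-empty value.
However, the actual input file is delimited neither by commas (FS=",") nor by tabs (FS="\t").
For example, if the DIGITS field is 210&-3, only 210 and 213 need to be produced; if the DIGITS field is 210&&-3, then 210, 211, 212 and 213 need to be produced.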
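The rule in the example above can be sketched in a few lines of awk. This is only an illustration of the rule as stated, not the script used later in this post; `expand_digits` is a hypothetical helper name, and it assumes a well-formed spec (a base number followed by "&-N" or "&&-N" pieces).

```shell
# expand_digits SPEC: print one number per line for a DIGITS spec such as 210&&-3
# (hypothetical helper, assumes a well-formed spec)
expand_digits() {
    printf '%s\n' "$1" | awk '{
        n = split($0, part, /&/)        # "210&&-3" -> "210", "", "-3"
        base = prev = part[1]
        print base
        for (i = 2; i <= n; i++) {
            if (part[i] == "") {        # an empty piece comes from "&&": fill the gap
                i++
                for (j = prev + 1; j < base - part[i]; j++) print j
            }
            prev = base - part[i]       # "-3" is an offset from the base: 210 - (-3) = 213
            print prev
        }
    }'
}

expand_digits '210&-3'     # prints 210 and 213
expand_digits '210&&-3'    # prints 210, 211, 212 and 213
```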
INPUT.TXT
DIGITS           AL DEST CHI CNT NEDEST CORG NCHA
20               0  ABC  1   N   DEFABC 0    CHARGE
                 1  ABC  1   N   GHIABC 0    CHARGE
                 2  ABC  1   N   JKLABC 0    CHARGE
                 3  ABC  1   N   MNOABC 0    CHARGE
                 4  ABC  1   N   PQRABC 0    CHARGE
2130&&-4&-6&&-8  0  ABC  1   N   DEFABC 0    CHARGE
                 1  ABC  1   N   GHIABC 0    CHARGE
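So I follow the two steps below to get the required output.
Step 1: Read the first column and fill each empty value down with the previous non-empty value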
awk 'a=/^ /{$0=(x)substr($0,length(x)+1)}!a{x=$1}1' Input.txt > Op_Step1.txt
Op_Step1.txt
20               0  ABC  1   N   DEFABC 0    CHARGE
20               1  ABC  1   N   GHIABC 0    CHARGE
20               2  ABC  1   N   JKLABC 0    CHARGE
20               3  ABC  1   N   MNOABC 0    CHARGE
20               4  ABC  1   N   PQRABC 0    CHARGE
2130&&-4&-6&&-8  0  ABC  1   N   DEFABC 0    CHARGE
2130&&-4&-6&&-8  1  ABC  1   N   GHIABC 0    CHARGE
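For readers who find the Step 1 one-liner hard to parse, here is a longer sketch that should behave the same way on this kind of input. It assumes the empty DIGITS column shows up as leading blanks and that the column is at least as wide as the value being filled in; `fill_down` is a hypothetical name and the sample rows are made up for illustration.

```shell
# fill_down: copy the last seen first-column value onto lines that start with a blank
# (hypothetical helper; reads stdin, writes stdout)
fill_down() {
    awk '{
        if (/^ /)                                    # first column is blank on this line
            $0 = last substr($0, length(last) + 1)   # overlay the saved value on the leading blanks
        else
            last = $1                                # remember the value for the lines that follow
        print
    }'
}

printf '%s\n' \
    '20     0 ABC' \
    '       1 ABC' \
    '213&-5 0 DEF' \
    '       1 DEF' | fill_down
```

Because the value is written over the leading blanks instead of being assigned to $1, the remaining columns stay where they were.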
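Step 2: Read the first field and generate the sequence based on the "&-" and "&&-" delimiters in Op_Step1.txt
Thanks to EdMorton for the script below:
$ awk -f tst.awk Op_Step1.txt
Because the input above is separated neither by commas (FS=",") nor by tabs (FS="\t"), the script below does not work: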
BEGIN{ FS="\t" }
{
    for (i=1;i<=NF;i++) {
        if ($i == "") {
            i++
            $i = $1 - $i
            for (j=(prev+1);j<$i;j++) {
                print j
            }
        }
        else if ($i < 0) {
            $i = $1 - $i
        }
        print $i
        prev = $i
    }
}
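A quick way to see why the loop in the script above never finds the pieces of the DIGITS value: with FS="\t" and no tab in the line, awk treats the whole line as a single field. The sample line below is made up for illustration.

```shell
line='2130&&-4&-6&&-8  0  ABC  1'

# FS="\t" and no tab in the input: the whole line is one field
printf '%s\n' "$line" | awk 'BEGIN { FS = "\t" } { print NF }'    # 1

# default FS: runs of blanks separate fields, so $1 is the DIGITS value
printf '%s\n' "$line" | awk '{ print NF; print $1 }'
```

With a single field, the `for` loop runs once and `$i` is the entire line, so neither the `$i == ""` branch nor the `$i < 0` branch is taken.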
Expected output:
20               0  ABC  1   N   DEFABC 0    CHARGE
20               1  ABC  1   N   GHIABC 0    CHARGE
20               2  ABC  1   N   JKLABC 0    CHARGE
20               3  ABC  1   N   MNOABC 0    CHARGE
20               4  ABC  1   N   PQRABC 0    CHARGE
2130             0  ABC  1   N   DEFABC 0    CHARGE
2131             0  ABC  1   N   DEFABC 0    CHARGE
2132             0  ABC  1   N   DEFABC 0    CHARGE
2133             0  ABC  1   N   DEFABC 0    CHARGE
2134             0  ABC  1   N   DEFABC 0    CHARGE
2136             0  ABC  1   N   DEFABC 0    CHARGE
2137             0  ABC  1   N   DEFABC 0    CHARGE
2138             0  ABC  1   N   DEFABC 0    CHARGE
2130             1  ABC  1   N   GHIABC 0    CHARGE
2131             1  ABC  1   N   GHIABC 0    CHARGE
2132             1  ABC  1   N   GHIABC 0    CHARGE
2133             1  ABC  1   N   GHIABC 0    CHARGE
2134             1  ABC  1   N   GHIABC 0    CHARGE
2136             1  ABC  1   N   GHIABC 0    CHARGE
2137             1  ABC  1   N   GHIABC 0    CHARGE
2138             1  ABC  1   N   GHIABC 0    CHARGE
Any suggestions are welcome, and sorry for the long post!
Updated with comments:
NR==1 || !NF { next }      # AVN: To skip header OR Blank Lines

/^[[:digit:]]/ {           # AVN: To find field starts with [0-9]
    blanks = range = $1    # AVN: Assign if the line begins with [0-9] and doesn't start with blank
                           # EM: saves the value of $1 in variable "range" and also saves it in variable "blanks"
    gsub(/./," ",blanks)   # AVN: To fill the empty field with previous assigned value
                           # EM: replaces every character in the variable "blanks" with a blank character.
    $0 = blanks substr($0,length(blanks)+1)    # AVN: Not able to understand
        # EM: Replaces $1 with a string of the same length but all-blanks so that when we
        # later need to change "2130&&-4&-6&&-8" to "2130", "2131", etc. we won't have
        # to deal with the original string "2130&&-4&-6&&-8" still being present in $0.
        # Remember we saved the original $1 value in the variable "range" so
        # it's OK to overwrite the characters in $0 now. We don't simply re-assign
        # $1 as that would cause $0 to be recompiled using the current OFS value and
        # so destroy all of your original spacing.
}

{
    split(range,arr,/&/)              # AVN: split & and store the values into arr variable
    for (i=1;i in arr;i++) {          # AVN: Looping elements based on arr count
        if (arr[i] == "") {           # AVN: Not able to catch the below Array Logics
            # EM: split("2130&&-4&-6&&-8",arr,/&/) populates arr as
            # arr[1]="2130", arr[2]="", arr[3]="-4", arr[4]="-6", arr[5]="", arr[6]="-8"
            # That should help you understand the loop logic - if in doubt add prints
            # to dump array and other variable values then update your comments.
            i++
            for (j=(prev+1);j<(arr[1]-arr[i]);j++) {
                print j substr($0,length(j)+1)
            }
        }

        if (arr[i] < 0) {
            arr[i] = arr[1] - arr[i]
        }

        print arr[i] substr($0,length(arr[i])+1)
        prev = arr[i]
    }
}
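EM's remark about not re-assigning $1 can be checked with a small experiment. The two helpers below are hypothetical and the sample line is made up; both put "2130" in the first column, one by assigning to $1 and one by overlaying characters in $0 the way the script above does.

```shell
# Hypothetical helpers; each reads lines on stdin and puts "2130" in the first column.
reassign_field() {
    awk '{ $1 = "2130"; print }'        # $0 is rebuilt with OFS, so runs of blanks collapse
}

overlay_field() {
    awk '{
        blanks = $1
        gsub(/./, " ", blanks)                        # as many blanks as $1 has characters
        $0 = blanks substr($0, length(blanks) + 1)    # blank out column 1, leave the rest alone
        v = "2130"
        print v substr($0, length(v) + 1)             # overlay the new value on the blanks
    }'
}

line='2130&&-4  0  ABC'
printf '%s\n' "$line" | reassign_field    # 2130 0 ABC
printf '%s\n' "$line" | overlay_field     # 2130      0  ABC
```

The second form keeps the line length and column positions unchanged, which is what preserves the fixed-width layout in the output.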
Answer 0 (score: 4)
In the script you got from me, instead of setting FS to & and looping over the fields, call split($1,arr,/&/) and loop over the elements of arr.
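If the loop over arr is hard to follow, it may help to dump what split() produces first. A minimal sketch, with `dump_split` as a hypothetical helper name:

```shell
# dump_split SPEC: show each element that split(SPEC, arr, /&/) produces
dump_split() {
    printf '%s\n' "$1" | awk '{
        n = split($0, arr, /&/)
        for (i = 1; i <= n; i++)
            printf "arr[%d]=\"%s\"\n", i, arr[i]
    }'
}

dump_split '2130&&-4&-6&&-8'
# arr[1]="2130"
# arr[2]=""
# arr[3]="-4"
# arr[4]="-6"
# arr[5]=""
# arr[6]="-8"
```

Each "&&" leaves an empty element between its two ampersands, and that empty element is what tells the loop to fill in the numbers between the previous value and the next one.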
Since you've put in the effort and worked through it yourself, and the remaining details aren't entirely obvious, here's the complete script:
$ cat tst.awk
NR==1 || !NF { next }
/^[[:digit:]]/ {
    blanks = range = $1
    gsub(/./," ",blanks)
    $0 = blanks substr($0,length(blanks)+1)
}
{
    split(range,arr,/&/)
    for (i=1;i in arr;i++) {
        if (arr[i] == "") {
            i++
            for (j=(prev+1);j<(arr[1]-arr[i]);j++) {
                print j substr($0,length(j)+1)
            }
        }
        if (arr[i] < 0) {
            arr[i] = arr[1] - arr[i]
        }
        print arr[i] substr($0,length(arr[i])+1)
        prev = arr[i]
    }
}
$ cat file
DIGITS           AL DEST CHI CNT NEDEST CORG NCHA
20               0  ABC  1   N   DEFABC 0    CHARGE
                 1  ABC  1   N   GHIABC 0    CHARGE
                 2  ABC  1   N   JKLABC 0    CHARGE
                 3  ABC  1   N   MNOABC 0    CHARGE
                 4  ABC  1   N   PQRABC 0    CHARGE
2130&&-4&-6&&-8  0  ABC  1   N   DEFABC 0    CHARGE
                 1  ABC  1   N   GHIABC 0    CHARGE
$ awk -f tst.awk file
20               0  ABC  1   N   DEFABC 0    CHARGE
20               1  ABC  1   N   GHIABC 0    CHARGE
20               2  ABC  1   N   JKLABC 0    CHARGE
20               3  ABC  1   N   MNOABC 0    CHARGE
20               4  ABC  1   N   PQRABC 0    CHARGE
2130             0  ABC  1   N   DEFABC 0    CHARGE
2131             0  ABC  1   N   DEFABC 0    CHARGE
2132             0  ABC  1   N   DEFABC 0    CHARGE
2133             0  ABC  1   N   DEFABC 0    CHARGE
2134             0  ABC  1   N   DEFABC 0    CHARGE
2136             0  ABC  1   N   DEFABC 0    CHARGE
2137             0  ABC  1   N   DEFABC 0    CHARGE
2138             0  ABC  1   N   DEFABC 0    CHARGE
2130             1  ABC  1   N   GHIABC 0    CHARGE
2131             1  ABC  1   N   GHIABC 0    CHARGE
2132             1  ABC  1   N   GHIABC 0    CHARGE
2133             1  ABC  1   N   GHIABC 0    CHARGE
2134             1  ABC  1   N   GHIABC 0    CHARGE
2136             1  ABC  1   N   GHIABC 0    CHARGE
2137             1  ABC  1   N   GHIABC 0    CHARGE
2138             1  ABC  1   N   GHIABC 0    CHARGE