Question

I have a file from which I am trying to remove customer names using AWK. The file is a fixed-width file, and each column has a meaning.

The file consists of many lines, all in the same format, very similar to the following:

1234-123   123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN   123-123   12345678901-1234  TRN 12345678        
1234-123   123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN   123-123   12345678901-1234  TRN 12345678        
1234-123   123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN   123-123   12345678901-1234  TRN 12345678        
1234-123   123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN   123-123   12345678901-1234  TRN 12345678

It is the customer name that I need to swap for a fictitious name, so that the desired output is:

1234-123   123456 12345678901234SENTINAL PRIME         12345-1234 TRN   123-123   12345678901-1234  TRN 12345678        
1234-123   123456 12345678901234OPTIMUS PRIME          12345-1234 TRN   123-123   12345678901-1234  TRN 12345678        
1234-123   123456 12345678901234BUMBLE BEE             12345-1234 TRN   123-123   12345678901-1234  TRN 12345678        
1234-123   123456 12345678901234IRON HIDE              12345-1234 TRN   123-123   12345678901-1234  TRN 12345678

I have a list of Transformer names to use for this, stored in a file called transformer.names.

SENTINEL PRIME
OPTIMUS PRIME
BUMBLEBEE
IRONHIDE

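However, to keep every line of the original file the same width, I need to pad the Transformer names correctly with spaces, because the names I have are all different lengths.

It seems that AWK can pad these names to a given length, but I have not figured out how (or found an answer clear enough for me to understand).

Below is my current AWK script.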

#!/usr/bin/awk -f
BEGIN {
}
{
  getline line < "transformer.names"
  print substr($0, 0, 30) line substr($0, 62, 120)
}

I run it with the following command:

my_program.awk my-file.txt

I thought I could use something like the following in place of the print line above, but I have not managed to get it working.

printf "-%32s|", substr($0, 0, 30) line substr($0, 62, 120)

Any hints would be awesome!

Answer 1

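Could you please try the following and let me know if it helps. It reads in all the transformer names, and if there are fewer names than lines in Input_file, it starts over again from the first name.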

awk '
FNR==NR{
  a[FNR]=$0;
  count=FNR;
  next}
{
  val=val==count?1:++val;
  print substr($0,1,32) a[val]"\t\t"substr($0,56)
}' transformer.names  Input_file

Explanation: adding an explanation of the above code.

awk '
FNR==NR{                                          ##FNR==NR is true only while the first file (transformer.names) is being read.
  a[FNR]=$0;                                      ##Store the current line in array a, indexed by its line number FNR.
  count=FNR;                                      ##Remember in count how many names have been read so far.
  next}                                           ##Skip the remaining statements and move on to the next line.
{                                                 ##This block runs while the second file (Input_file) is being read.
  val=val==count?1:++val;                         ##Increment val for each line; once it has reached count, reset it to 1.
  print substr($0,1,32) a[val]"\t\t"substr($0,56) ##Print characters 1 to 32, then a[val] and two TABs, then everything from character 56 to the end of the line.
}' transformer.names  Input_file                  ##The input file names.
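
Note that the two TABs above do not keep the line at a fixed width. A minimal sketch of the same cycling idea with space padding instead, assuming the name column is 23 characters wide (columns 33 to 55, as in the posted sample); the temporary files and the shortened sample line are only there to make the sketch runnable:

```shell
# Hypothetical sample files, created only so the sketch can be run as-is.
tmp=$(mktemp -d)
printf '%s\n' 'SENTINEL PRIME' 'OPTIMUS PRIME' > "$tmp/transformer.names"
line='1234-123   123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN'
printf '%s\n' "$line" "$line" "$line" > "$tmp/Input_file"

out=$(awk '
FNR == NR {                 # first file: store every replacement name
  a[FNR] = $0
  count = FNR
  next
}
{
  val = (val == count) ? 1 : val + 1   # cycle through the stored names
  # %-23s left-aligns the name and pads it with spaces to 23 characters,
  # the assumed width of the name column.
  print substr($0, 1, 32) sprintf("%-23s", a[val]) substr($0, 56)
}' "$tmp/transformer.names" "$tmp/Input_file")

printf '%s\n' "$out"
rm -rf "$tmp"
```

With two names and three data lines, the third line reuses the first name, and every output line keeps the length of the input line.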

Answer 2

Your data does not seem to contain any uppercase letters before the text to be modified.
So you can try this awk.

awk '
FNR==NR {
  a[NR]=$0
  b=length()
  len = len < b ? b : len
  next
}
{
  c = sprintf( "%-*2$s" , a[FNR], (len+1))
  sub(/[A-Z][A-Z ]+/,c)
}
1' transformer_name customer_name

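First we put all the transformer names into array a and keep the longest length in len. Then we format the new name into c and replace the old name with it.
You can modify (len + 1) as needed.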
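
The `2$` positional form in the format string above is, as far as I know, a gawk extension. A more portable way to get a width that is only known at run time is to build the format string by concatenation. A small sketch of that idea; the square brackets are there only to make the padding visible:

```shell
out=$(printf '%s\n' 'SENTINEL PRIME' 'OPTIMUS PRIME' 'BUMBLEBEE' 'IRONHIDE' |
awk '
{
  name[NR] = $0
  if (length($0) > len) len = length($0)   # remember the longest name
}
END {
  fmt = "%-" (len + 1) "s"                 # e.g. "%-15s" when len is 14
  for (i = 1; i <= NR; i++)
    print "[" sprintf(fmt, name[i]) "]"    # every name padded to the same width
}')

printf '%s\n' "$out"
```

For a fixed-width file you would normally use the width of the column rather than the longest name, otherwise the line length changes.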

Answer 3

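You need to apply the %Ns to the specific field you want padded, not to the entire line, and you need to make the minus sign (for left-align / right-pad) part of the specifier. Also, printf does not automatically add the line/record separator the way print does, so you need to add it yourself: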

 printf "%s%-32s%s\n", substr($0, 1, 30), newname, substr($0, 62, 120)
 # note the commas: this is one format string containing three specifiers,
 # followed by three separate data values, one for each specifier

Alternatively, you can pad the field and then concatenate:

 print substr($0,1,30) sprintf("%-32s", newname) substr($0,62,120) 
 # no commas except within the sprintf (and the substr's)
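
To see what the flag and the width do on their own, here is a small sketch using a made-up value; the brackets only mark where each field starts and ends:

```shell
out=$(awk 'BEGIN {
  name = "BUMBLEBEE"
  printf "[%-12s]\n", name    # minus flag: left-aligned, padded on the right
  printf "[%12s]\n", name     # no flag: right-aligned, padded on the left
  printf "[%-4s]\n", name     # the width is a minimum, longer values are not cut
  printf "[%-4.4s]\n", name   # a precision of 4 also truncates to 4 characters
}')

printf '%s\n' "$out"
```

If a replacement name could be longer than the column, combining width and precision (for example `%-23.23s`) keeps the field at exactly that width.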

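If the data file has more lines than the transformer.names file, you need to buffer the names and cycle through them repeatedly, as Ravinder shows.

Also, substr positions in awk start at 1; if you specify 0 or a negative number it is treated as 1, but I think it is clearer to say what you actually mean, so I corrected that. In the sample data you posted, 62 is not the correct starting position for the part after the customer name, but you said the data is only "very similar" to the real data, so I don't know whether 56 or 62 or something else is correct.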
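
One way to check the positions rather than guess is to let awk report where the name sits. A rough sketch, assuming (as in the posted sample) that the customer name is the first run of uppercase letters and spaces on the line; the sample line is shortened here:

```shell
line='1234-123   123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN   123-123'

out=$(printf '%s\n' "$line" | awk '
# match() sets RSTART (1-based start of the match) and RLENGTH (its length).
match($0, /[A-Z][A-Z ]+/) {
  # start of the name, width of the name, start of the rest of the line
  print RSTART, RLENGTH, RSTART + RLENGTH
}')

printf '%s\n' "$out"
```

For the posted sample this reports 33 23 56: the name starts at column 33, is 23 characters wide, and the remainder starts at column 56. Real data may of course differ.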

Answer 4

#!/usr/bin/awk -f
BEGIN {
}
{
  getline line < "transformer.names"
  printf("%s %-32s %s \n", substr($0, 0, 30), line, substr($0, 62, 120))
}

You almost had the answer in your question! I just copied what you had and modified it a bit :)
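
If you stay with getline, it is worth checking its return value, because once the names file is exhausted the variable simply keeps its last value. A sketch of one way to start over from the first name, assuming the column positions of the posted sample (32, 23, 56); the variable `names` and the temporary file are only for illustration:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'BUMBLEBEE' 'IRONHIDE' > "$tmp/transformer.names"
line='1234-123   123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN'

out=$(printf '%s\n' "$line" "$line" "$line" |
awk -v names="$tmp/transformer.names" '
{
  # getline returns 1 on success, 0 at end of file and -1 on error.
  if ((getline name < names) <= 0) {
    close(names)                       # the next getline reopens the file
    if ((getline name < names) <= 0) {
      print "cannot read " names > "/dev/stderr"
      exit 1
    }
  }
  # Keep columns 1 to 32, pad the new name to 23, keep the rest from column 56.
  printf "%s%-23s%s\n", substr($0, 1, 32), name, substr($0, 56)
}')

printf '%s\n' "$out"
rm -rf "$tmp"
```

Here the third data line gets the first name again, and each output line has the same length as the input line.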

How to right-pad with spaces using AWK

4 answers: