AWK: strange difference between printf and sprintf when using substr

Time: 2019-02-13 17:56:19

Tags: unix awk

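It has been a while since I last used AWK, but I now have an XML file in which I want to increment the Ids of a specific column, which looks like an excellent task for AWK. In theory, to increment the Id you parse the line, extract the Id into a variable, ++ it, and rebuild the line to print it to the result stream. However, when I used a variable (x = sprintf(...)) I got weird results, so I used printf to debug it. Now the weird part: printf dumps exactly the right Id, but the variable ends up as garbage, even though both have the same input and syntax... Surely it must be something silly, but I can't put my finger on it.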

All Ids have this format:

<column name="Id" type="System.Int32">x</column>

Here is the code:

#!/bin/ksh
cd /mnt/c/J/D/Work/GIT/Repos/SkillsNG/SkillsNG.WebTests/Snapshots
print -n "Snapshot name: "
read snapshot
defaultId=0
print -n "Start increasing from Id [$defaultId]":
read id

[[ "$id" = '' ]] && id=$defaultId

cat $snapshot | awk -F '>' 'BEGIN {process=0;} {
if (match($0, /SecurityPermissions/))
     {process=1;}
else
  {
    if (!process)
    {
       # just dump all tables up to SecurityPermissions, no processing needed
       print;
       next;
    }
  }
if (match($1, /<column name="Id" type="System.Int32"/))
  {
   if(match($2,/[0-9]*/))
   {
      printf "param 1: %s\n", $1
      printf "param 2: %s\n", $2
      printf "Id value : %s\n", substr($2, 1, index($2,"\<")-1);

      val = sprintf("%s", substr($2,1, index($2,"\<")-1));
      printf "value in variable: %s\n", $val;
      newval = strtonum(val);
      printf "new: %s\n",  $newval

   }
 }
 else print $0;
}' > $snapshot.new
# mv $snapshot $snapshot.old
# mv $snapshot.new $snapshot
cd -

Here is a simple test XML:

    <snapshot culture="en-US">
  <table name="[dbo].[SecurityPermissions]">
    <row>
      <column name="Id" type="System.Int32">1</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">1</column>
      <column name="Access" type="System.Int32">1</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">2</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">50</column>
      <column name="Access" type="System.Int32">1</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">3</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">51</column>
      <column name="Access" type="System.Int32">15</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">4</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">52</column>
      <column name="Access" type="System.Int32">3</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">5</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">53</column>
      <column name="Access" type="System.Int32">1</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">6</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">54</column>
      <column name="Access" type="System.Int32">3</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">7</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">56</column>
      <column name="Access" type="System.Int32">1</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">8</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">57</column>
      <column name="Access" type="System.Int32">3</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">9</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">77</column>
      <column name="Access" type="System.Int32">15</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">10</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">78</column>
      <column name="Access" type="System.Int32">15</column>
    </row>
    <row>
      <column name="Id" type="System.Int32">11</column>
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">80</column>
      <column name="Access" type="System.Int32">15</column>
    </row>
  </table>
</snapshot>

The resulting file:

<snapshot culture="en-US">
  <table name="[dbo].[SecurityPermissions]">
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 1</column
Id value : 1
value in variable:       <column name="Id" type="System.Int32"
new:       <column name="Id" type="System.Int32"
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">1</column>
      <column name="Access" type="System.Int32">1</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 2</column
Id value : 2
value in variable: 2</column
new: 2</column
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">50</column>
      <column name="Access" type="System.Int32">1</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 3</column
Id value : 3
value in variable: 
new: 
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">51</column>
      <column name="Access" type="System.Int32">15</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 4</column
Id value : 4
value in variable: 
new: 
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">52</column>
      <column name="Access" type="System.Int32">3</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 5</column
Id value : 5
value in variable: 
new: 
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">53</column>
      <column name="Access" type="System.Int32">1</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 6</column
Id value : 6
value in variable: 
new: 
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">54</column>
      <column name="Access" type="System.Int32">3</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 7</column
Id value : 7
value in variable: 
new: 
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">56</column>
      <column name="Access" type="System.Int32">1</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 8</column
Id value : 8
value in variable: 
new: 
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">57</column>
      <column name="Access" type="System.Int32">3</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 9</column
Id value : 9
value in variable: 
new: 
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">77</column>
      <column name="Access" type="System.Int32">15</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 10</column
Id value : 10
value in variable: 
new: 
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">78</column>
      <column name="Access" type="System.Int32">15</column>
    </row>
    <row>
param 1:       <column name="Id" type="System.Int32"
param 2: 11</column
Id value : 11
value in variable: 
new: 
      <column name="GroupId" type="System.Int32">1</column>
      <column name="Securable" type="System.Int32">80</column>
      <column name="Access" type="System.Int32">15</column>
    </row>
  </table>
</snapshot>

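As you can see from the result file, the Id value printed directly (via printf) is correct, but the same construct assigned to a variable (via sprintf) produces garbage. Does anybody know what is going on? Thanks in advance. Cheers, DJ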

1 answer:

Answer 0 (score: 1)

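First of all, you should not try to do this kind of thing with awk, sed, or similar tools. XML is a complex data structure, with all the ugliness that comes with it. While a simple awk script might do the job now, it can suddenly fail, and you will have no idea what hit you.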
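To illustrate how a line-based match can silently fail, here is a small sketch. It feeds three spellings of the same element to the pattern used in the question; the reordered and single-quoted variants are hypothetical inputs made up for this example, but an XML parser would treat all three the same way.

```shell
#!/bin/sh
# The pattern used by the awk script in the question.
pattern='<column name="Id" type="System.Int32"'

# Three spellings of the same element (the last two are made-up variants).
as_written='<column name="Id" type="System.Int32">1</column>'
reordered='<column type="System.Int32" name="Id">1</column>'
single_quoted="<column name='Id' type='System.Int32'>1</column>"

# Count how many of the three lines the text pattern recognizes.
count=$(printf '%s\n' "$as_written" "$reordered" "$single_quoted" |
    awk -v pat="$pattern" '$0 ~ pat { n++ } END { print n + 0 }')

printf 'matched %s of 3 equivalent lines\n' "$count"
```

Only the first line matches, so the other two Ids would pass through unchanged without any error message.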

If you want to increment that particular value, you can use the following xmlstarlet command:

 $ xmlstarlet ed --update '//table/row/column[@name="Id"]' -x ".+1" test.xml

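This reads as follows: xmlstarlet edits (ed) the file (test.xml) by updating (--update) all nodes that match the XPath expression '//table/row/column[@name="Id"]' (all column nodes that are children of a row that is a child of a table, and whose name attribute equals Id), and sets their value to the result of the XPath expression (-x ".+1"), which increments the current value (.).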

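To answer your question: you get unexpected results with awk because you reference some variables with a $. Example:

    val = sprintf("%s", substr($2,1, index($2,"\<")-1));
    printf "value in variable: %s\n", $val;

In the first line you compute the value of val, but in the second line you use $val. The latter actually returns the value of the field whose number is val. So if val=2, then $val returns $2, i.e. the content of the second field. The same applies to $newval. This matches your output: for Id 1 you get $1, for Id 2 you get $2, and from Id 3 onwards you get an empty string, because those fields are empty or do not exist.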
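The difference can be reproduced in isolation, and the fix is simply to drop the $. The following is a minimal sketch, not the full script: it assumes the same -F '>' field splitting as in the question, uses one sample line from the test XML, and replaces strtonum (a gawk extension) with plain arithmetic so that it runs in any awk. The shell variable names demo, line and fixed are made up for this example.

```shell
#!/bin/sh
# Part 1: val is a variable; $val is "the field whose number is val".
demo=$(printf '%s\n' 'first second third' |
    awk '{ val = 2; print val, $val }')
printf '%s\n' "$demo"

# Part 2: extract the Id, increment it, and rebuild the line.
line='      <column name="Id" type="System.Int32">7</column>'

fixed=$(printf '%s\n' "$line" | awk -F '>' '
/<column name="Id" type="System.Int32"/ {
    # $2 is "7</column"; keep only the part before the first "<".
    val = substr($2, 1, index($2, "<") - 1)
    # Use val, not $val. Adding 1 also forces a numeric conversion.
    newval = val + 1
    # $1 lost its trailing ">" to the field separator, so put it back,
    # then append the new Id and the closing tag.
    print $1 ">" newval substr($2, index($2, "<")) ">"
    next
}
{ print }')

printf '%s\n' "$fixed"
```

The first command prints "2 second": val is the number 2, while $val is the second field. The second command prints the sample line with the Id changed from 7 to 8. The caveat about processing XML with awk still applies; this only shows where the $ went wrong.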