Awk:根据当前列值创建列

时间:2016-03-25 14:10:42

标签: awk columnsorting

需要一些awk-fu!我有以下数据结构:

RewriteEngine On
# Some hosts require RewriteBase to make RewriteRules work.
RewriteBase /
RewriteRule ^([^&]*)&(.*)$ https://example.com/$1?$2 [L,QSA,R=301]
RewriteCond %{HTTP:X-Forwarded-Proto} !https
RewriteRule !/status https://%{SERVER_NAME}%{REQUEST_URI} [L,R]
ErrorDocument 404 /error.php

# Google SEO workaround for search.php highlights:
# Make this rule the first rewrite rule in your .htaccess!
RewriteRule ^([^&]*)&(.*)$ https://example.com/$1?$2 [L,QSA,R=301]

# Google SEO 404:
ErrorDocument 404 /misc.php?google_seo_error=404

# Google SEO Sitemap:
RewriteRule ^sitemap\-([^./]+)\.xml$ misc.php?google_seo_sitemap=$1 [L,QSA,NC]

# Google SEO URL Forums:
RewriteRule ^Forum\-([^./]+)$ forumdisplay.php?google_seo_forum=$1 [L,QSA,NC]

# Google SEO URL Threads:
RewriteRule ^Thread\-([^./]+)$ showthread.php?google_seo_thread=$1 [L,QSA,NC]

# Google SEO URL Announcements:
RewriteRule ^Announcement\-([^./]+)$ announcements.php?google_seo_announcement=$1 [L,QSA,NC]

# Google SEO URL Users:
RewriteRule ^User\-([^./]+)$ member.php?action=profile&google_seo_user=$1 [L,QSA,NC]

# Google SEO URL Calendars:
RewriteRule ^Calendar\-([^./]+)$ calendar.php?google_seo_calendar=$1 [L,QSA,NC]

# Google SEO URL Events:
RewriteRule ^Event\-([^./]+)$ calendar.php?action=event&google_seo_event=$1 [L,QSA,NC]

每行应该有13列。从第2组开始,我需要评估价值。值应始终为列号减1.换句话说,第2列的值应为1.我需要创建列(通过插入逗号),以便每行具有适当数量的列,并且现有值正确排序(col编号减1,偏移量为第1条记录)。

更正数据的示例:

ybcxl,05,06,07,08,09,10,11  
yxxu,01  
yxxu,03,05,06,07,08,09,10,11  
ybban,01,03,04,05,06,07,08  
zxvhu,01,02,03,04,05,06,07,08,09,10,11,12  

1 个答案:

答案 0 :(得分:0)

$ cat af.txt
ybcxl,05,06,07,08,09,10,11
yxxu,01
yxxu,03,05,06,07,08,09,10,11
ybban,01,03,04,05,06,07,08
zxvhu,01,02,03,04,05,06,07,08,09,10,11,12

$ cat af.awk
BEGIN { FS="," }
{
    printf "%s", $1
    for (i=2; i<=NF; ++i) a[$i+0] = $i
    for (i=1; i<=12; ++i) printf ",%s", (i in a ? a[i] : "")
    print ""
    delete a
}

$ awk -f af.awk af.txt
ybcxl,,,,,05,06,07,08,09,10,11,
yxxu,01,,,,,,,,,,,
yxxu,,,03,,05,06,07,08,09,10,11,
ybban,01,,03,04,05,06,07,08,,,,
zxvhu,01,02,03,04,05,06,07,08,09,10,11,12

测试以确保所有行都有13列:

$ awk -f af.awk af.txt | awk -F, '{ print NF }'
13
13
13
13
13

测试以确保列号比其对应的字段值(如果存在)多一个:

$ awk -f af.awk af.txt | awk -F, '{ error = 0; for(i=2;i<=13;++i) { if ($i && i != ($i + 1)) error = 1 } print (error ? "ERROR!" : "GOOD!")}'
GOOD!
GOOD!
GOOD!
GOOD!
GOOD!

说明:

BEGIN {FS =&#34;,&#34; }
将字段分隔符设置为逗号

printf&#34;%s&#34;,$ 1
输出第一个字段(使用printf以避免换行)

for(i = 2; i&lt; = NF; ++ i)a [$ i + 0] = $ i
使用数字将记录中存在的所有数字存储到数组中 等价物作为索引($i+0将字符串转换为数字)

for(i = 1; i&lt; = 12; ++ i)printf&#34;,%s&#34;,(我在?a [i]:&#34;&#34 ;)
对于我们想要输出的每个字段,如果数字在数组中打印出来, 否则,只需打印逗号

打印&#34;&#34;
打印换行符以完成输出记录

删除
在读取下一个输入记录之前清除数组