I currently have the following text file, which is a dump of a website:
[61] Title1
subtitle1
1428 Elm Street, Springwood, Ohio 0812
Phone: (00) 0000 0000 [62] Email

[61] Title2
Subtitle2
1428 Elm Street, Springwood, Ohio 0812
Phone: (00) 0000 0000 [65] Email [66] Website

62 mailto: info@yyyyyyyyyy.com
65 mailto: mitchellstccc@xxxxxx.com
66 http://www.website.com
I need to convert it to a CSV file, but with Email and Website replaced by the values listed underneath (if there are any):
Title1, subtitle1, 1428 Elm Street, Springwood, Ohio 0812, (00) 0000 0000, Email
Title2, subtitle2, 1428 Elm Street, Springwood, Ohio 0812, (00) 0000 0000, Email, http://www.website.com
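How can I accomplish this task?
I am trying to use awk, but my awk-fu sucks. Could anyone lend me a hand? (I have no preference as to scripting or programming language.)
Thanks!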
Answer 0 (score: 1)
I'd do it in 2 passes, e.g.:
$ cat tst.awk
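BEGIN {
    # Queue the input file a second time so awk reads it twice
    ARGV[ARGC] = ARGV[ARGC-1]; ARGC++
    # Paragraph mode: blank-line-separated records, one field per line
    RS = ""; FS = "\n"
}
# Pass 1: build the map from "[N]" tags to their values
NR==FNR {
    if (/^[[:digit:]]/) {
        for (i=1;i<=NF;i++) {
            key = val = $i
            sub(/[[:space:]].*/,"",key)
            sub(/[^[:space:]]+[[:space:]]+/,"",val)
            gsub(/ /,"",val)
            map["["key"]"] = val
        }
    }
    next
}
# Pass 2: emit each address block as one comma-separated line
!/^[[:digit:]]/ {
    out = ""
    for (i=1;i<=NF;i++) {
        out = out sprintf("%s", (i>1?",":""))
        split($i,arr,/[[:space:]]+/)
        for (j=1;j in arr;j++) {
            if (arr[j] ~ /^\[.*\]$/) {
                if (arr[j] in map) {
                    # Known tag: replace the label after it with the mapped value
                    arr[j+1] = map[arr[j]]
                    arr[j] = ","
                }
                else {
                    # Unknown tag such as [61]: drop it
                    arr[j] = ""
                }
            }
            out = out sprintf("%s%s", (j>1?" ":""), arr[j])
        }
    }
    gsub(/[[:space:]]*,[[:space:]]*/,", ",out)
    sub(/^[[:space:]]+/,"",out)  # trim the blank left behind by a dropped leading tag
    print out
}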
$ awk -f tst.awk file
Title1, subtitle1, 1428 Elm Street, Springwood, Ohio 0812, Phone: (00) 0000 0000, mailto:info@yyyyyyyyyy.com
Title2, Subtitle2, 1428 Elm Street, Springwood, Ohio 0812, Phone: (00) 0000 0000, mailto:mitchellstccc@xxxxxx.com, http://www.website.com
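The first pass just reads the mapping from numbers to email and site values; the second pass just processes the address blocks, replacing [66] Website with the value for 66 that was read in the first pass.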
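The two-pass idiom can also be tried on its own. What follows is only a minimal sketch, not the script above: the function name two_pass, the temporary file and the sample lines are made up for illustration, and it assumes single-line records where the value is the second whitespace-separated field.

```shell
# Hypothetical helper: read the file named in $1 twice with a single awk call.
two_pass() {
    awk '
    BEGIN { ARGV[ARGC] = ARGV[ARGC-1]; ARGC++ }      # queue the same file again
    NR == FNR {                                      # true only during the first read
        if ($1 ~ /^[0-9]+$/) map["[" $1 "]"] = $2    # pass 1: remember number -> value
        next
    }
    $1 !~ /^[0-9]+$/ {                               # pass 2: skip the reference lines
        for (i = 1; i <= NF; i++)
            if (($i) in map) $i = map[$i]            # swap each [N] tag for its value
        print
    }' "$1"
}

# Made-up demo data: two reference lines and one line that cites them.
demo=$(mktemp)
printf '%s\n' '62 a@example.com' '66 http://www.website.com' 'see [62] and [66]' > "$demo"
two_pass "$demo"
rm -f "$demo"
```

On the demo data this prints "see a@example.com and http://www.website.com". NR keeps counting across both reads while FNR restarts for each one, so NR==FNR holds only during the first read, which is what lets a single script tell the two passes apart.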