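Thanks for reading.
I have a plain text file containing some simple user information.
The thing is, sometimes one of the items is missing.
Note how Norman and Reggie have an email address, but Missy does not: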
Name: Norman Normalrecord
Email: norman@ooga.com
Addr: 123 Main street

Name: Missy Missington
Addr: 789 Back street

Name: Reggie Regularrecord
Email: reggie@booga.com
Addr: 456 Middle street
I would like to grep/sed and say "if no email address is found, replace it with the text MISSING_EMAIL_ADDR", so that I get this result:
Norman Normalrecord
norman@ooga.com
123 Main street

Missy Missington
MISSING_EMAIL_ADDR
789 Back street

Reggie Regularrecord
reggie@booga.com
456 Middle street
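The problem is that in all my experiments, grep/sed produce absolutely nothing when nothing is found, so I can't even do a global replace afterward.
What I'm dreaming of is something (obviously pseudo-grep) that lets me supply what to print when a search finds nothing: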
grep /Name:/MISSING_NAME/email:/MISSING_EMAIL_ADDR/Addr:/MISSING_STREET_ADDR/
Is there a way to do something like this? Thanks again.
Answer 0 (score: 2)
Here's a start. It fills in a missing email line with "Email: N/A", appended as the record's third line.
awk -v RS='\n\n' -v FS='\n' -v OFS='\n' \
'{ if (!$3) $3 = "Email: N/A"; print; print "" }' users.txt
Output:
Name: Norman Normalrecord
Email: norman@ooga.com
Addr: 123 Main street

Name: Missy Missington
Addr: 789 Back street
Email: N/A

Name: Reggie Regularrecord
Email: reggie@booga.com
Addr: 456 Middle street
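A possible variation on the command above, offered only as a sketch: POSIX leaves a multi-character RS unspecified, so RS='\n\n' depends on the awk in use, and the !$3 test relies on the email being the third line. The version below uses paragraph mode (RS="") and looks for a line starting with "Email:" instead. The function name add_missing_email is made up for illustration, and records are assumed to be separated by blank lines.

```shell
# Hypothetical helper: append "Email: N/A" to any record that has no Email: line.
# Assumes records are separated by blank lines (awk paragraph mode).
add_missing_email() {
  awk '
    BEGIN { RS = ""; ORS = "\n\n"; FS = OFS = "\n" }
    {
      has_email = 0
      # Check every line of the record instead of relying on position.
      for (i = 1; i <= NF; i++)
        if ($i ~ /^Email:/) has_email = 1
      # Adding a field rebuilds the record with OFS (a newline).
      if (!has_email) $(NF + 1) = "Email: N/A"
      print
    }
  ' "$@"
}
```

It can be called as add_missing_email users.txt, or fed from a pipe.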
Answer 1 (score: 1)
This might work for you (GNU sed):
sed '/^Name: /!b;:a;$!N;/\nAddr: /!ba;/\nEmail: /!s/\n/&Email: MISSING_EMAIL_ADDR&/' file
If you want to remove the labels:
sed -r '/^Name: /!b;:a;$!N;/\nAddr: /!ba;/\nEmail: /!s/\n/&Email: MISSING_EMAIL_ADDR&/;s/(Name|Email|Addr): //g' file
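For readers who find the one-liner dense, here is the first command spread over several lines with a comment per step. This is a sketch of how it can be read, not a different algorithm. The wrapper name fill_email is invented for illustration, and it assumes, as in the sample data, that every record that starts with "Name: " also contains an "Addr: " line (otherwise the loop never finds its end condition).

```shell
# Hypothetical wrapper around the same sed logic, one command per line.
# Assumes every record beginning with "Name: " also has an "Addr: " line.
fill_email() {
  sed '
    # Lines that do not start a record are printed unchanged.
    /^Name: /!b
    :a
    # Append the next input line to the pattern space (unless on the last line).
    $!N
    # Keep appending until the record contains its Addr: line.
    /\nAddr: /!ba
    # If no Email: line was collected, insert a placeholder after the first newline.
    /\nEmail: /!s/\n/&Email: MISSING_EMAIL_ADDR&/
  ' "$@"
}
```

Usage would be fill_email file, mirroring the original command.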
Answer 2 (score: 1)
Using GNU awk for gensub():
$ cat tst.awk
BEGIN { RS=""; ORS="\n\n"; FS=OFS="\n" }
NF<3 { $3=$2; $2="Email: MISSING_EMAIL_ADDR" }
{ print gensub(/(^|\n)[^:]+:[[:space:]]*/,"\\1","g") }
$ gawk -f tst.awk file
Norman Normalrecord
norman@ooga.com
123 Main street

Missy Missington
MISSING_EMAIL_ADDR
789 Back street

Reggie Regularrecord
reggie@booga.com
456 Middle street
You can do the same in any awk by using sub(/^.../) followed by gsub(/\n.../) instead of gensub(/(^|\n).../).
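As a rough sketch of that sub()/gsub() variant, the script below should behave like the gensub() version without needing GNU awk. The function name strip_labels is hypothetical, [ \t] stands in for [[:space:]] to stay friendly to older awks, and it assumes, like the script above, that every line has the form "Label: value" and that a two-line record is missing its email.

```shell
# Hypothetical portable variant: sub() strips the first label, gsub() the rest.
# Assumes blank-line separated records whose lines all look like "Label: value".
strip_labels() {
  awk '
    BEGIN { RS = ""; ORS = "\n\n"; FS = OFS = "\n" }
    # A record with fewer than 3 lines is taken to be missing its email.
    NF < 3 { $3 = $2; $2 = "Email: MISSING_EMAIL_ADDR" }
    {
      sub(/^[^:]+:[ \t]*/, "")          # label at the start of the record
      gsub(/\n[^:]+:[ \t]*/, "\n")      # labels after each embedded newline
      print
    }
  ' "$@"
}
```

Called as strip_labels file, it prints each record's values with a blank line after every record.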
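For what it's worth, here is a way to identify any missing field and give it a "missing" indicator, in the order the fields are used in the input, without specifying any of the fields up front (assuming every field shows up in at least one record):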
$ cat tst.awk
BEGIN { RS=""; FS=OFS="\n" }
{
    for (fldNr=1; fldNr<=NF; fldNr++) {
        split($fldNr,nameVal,/:[[:space:]]*/)
        name = nameVal[1]
        val  = nameVal[2]
        rec[NR,name] = val
        if (!seen[name]++) {
            for (nameNr=++numNames; nameNr>fldNr; nameNr--) {
                names[nameNr] = names[nameNr-1]
            }
            names[nameNr] = name
        }
    }
}
END {
    for (recNr=1; recNr<=NR; recNr++) {
        for (nameNr=1; nameNr<=numNames; nameNr++) {
            name = names[nameNr]
            key = recNr SUBSEP name
            if (key in rec) {
                print rec[key]
            }
            else {
                print "MISSING_" toupper(name)
            }
        }
        print ""
    }
}
$
$ cat file
Name: Norman Normalrecord
Email: norman@ooga.com
Addr: 123 Main street

Name: Missy Missington
Addr: 789 Back street

Name: Reggie Regularrecord
Email: reggie@booga.com
Addr: 456 Middle street
Whatever: Some useful info
$
$ awk -f tst.awk file
Norman Normalrecord
norman@ooga.com
123 Main street
MISSING_WHATEVER

Missy Missington
MISSING_EMAIL
789 Back street
MISSING_WHATEVER

Reggie Regularrecord
reggie@booga.com
456 Middle street
Some useful info
Answer 3 (score: 0)
Here is a sed script that seems to do what you "dream" of (it assumes entries are separated by blank lines):
$ cat s.sed
# collect the lines from one entry in the pattern space
# removing the empty line for consistency
:a; $!{N;/\n$/!ba}; s/\n$//
# make substitutions
/Name:/!s/^/MISSING_NAME\n/
/Email:/!s/\n/\nMISSING_EMAIL_ADDR\n/
/Addr:/!s/$/\nMISSING_STREET_ADDR/
# add an empty line back
s/$/\n/p
With your data:
$ sed -nf s.sed info.txt
Name: Norman Normalrecord
Email: norman@ooga.com
Addr: 123 Main street

Name: Missy Missington
MISSING_EMAIL_ADDR
Addr: 789 Back street

Name: Reggie Regularrecord
Email: reggie@booga.com
Addr: 456 Middle street
Another demo:
$ cat info_ext.txt
Email: norman@ooga.com
Addr: 123 Main street

Name: Missy Missington
Addr: 789 Back street

Name: Reggie Regularrecord
Email: reggie@booga.com
$ sed -nf s.sed info_ext.txt
MISSING_NAME
Email: norman@ooga.com
Addr: 123 Main street

Name: Missy Missington
MISSING_EMAIL_ADDR
Addr: 789 Back street

Name: Reggie Regularrecord
Email: reggie@booga.com
MISSING_STREET_ADDR