I want to read weblogic.xml and extract the context root. Here is an example:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE weblogic-web-app PUBLIC "-//BEA Systems, Inc.//DTD Web Application 8.1//EN" "http://www.bea.com/servers/wls810/dtd/weblogic810-web-jar.dtd">
<weblogic-web-app>
<context-root>
/XYZ
</context-root>
</weblogic-web-app>
I have tried the following commands:
sed -n '/context-root/{s/.*<context-root>//;s/<\/context-root.*//;p;}' weblogic.xml
awk -F "[><]" '/context-root/{print $3}' weblogic.xml
perl -ne 'if (/context-root/){ s/.*?>//; s/<.*//;print;}' weblogic.xml
They work fine when the tags and the value are on a single line, like this:
<context-root>/XYZ</context-root>
How can I extract the tag's value from the XML above, where the tags and the value are on separate lines?
Answer 0 (score: 0)
awk '{ gsub(/^[ \t]+|[ \t\r]+$/, ""); } /<\/context-root>/ { p = 0 }; p; /<context-root>/ { p = 1 }' file
Output:
/XYZ
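As a rough illustration of how the one-liner above behaves, here is a self-contained sketch that runs it against a trimmed-down copy of the sample file. The function name `extract_context_root` and the temporary file are assumptions made for this example, not part of the original answer; the awk program is the same one, only spread over several lines with comments.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption): print the text between
# <context-root> and </context-root> when each tag sits on its own line.
extract_context_root() {
    awk '
        { gsub(/^[ \t]+|[ \t\r]+$/, "") }   # trim leading/trailing blanks and a trailing CR
        /<\/context-root>/ { p = 0 }        # closing tag: stop printing
        p                                   # inside the element: print the line
        /<context-root>/   { p = 1 }        # opening tag: print from the next line on
    ' "$1"
}

# Build a small sample file that mirrors the layout in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<weblogic-web-app>
<context-root>
/XYZ
</context-root>
</weblogic-web-app>
EOF

extract_context_root "$sample"   # prints /XYZ
rm -f "$sample"
```

The order of the three rules matters: the closing-tag rule runs before the bare `p` pattern and the opening-tag rule runs after it, so neither tag line is printed itself.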
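A longer script that also handles tags sharing a line with their text:
#!/usr/bin/awk -f
# p is 1 while we are inside a <context-root> element that spans lines.

# Trim leading and trailing whitespace (including a trailing CR).
{
    gsub(/^[ \t]+|[ \t\r]+$/, "")
}
# Closing tag of an element opened on an earlier line: print the text before it.
match($0, /^[^<]*<\/context-root>/) {
    if (p) {
        t = substr($0, 1, index($0, "</context-root>") - 1)
        if (length(t)) print t
    }
    $0 = substr($0, RSTART + RLENGTH)
    p = 0
}
# Elements opened and closed on the same line.
{
    while (match($0, /<context-root>[^<]*<\/context-root>/)) {
        t = substr($0, RSTART, RLENGTH)
        gsub(/<\/?context-root>/, "", t)
        print t
        $0 = substr($0, RSTART + RLENGTH)
    }
}
# A line that lies entirely inside an element.
p
# Opening tag whose element continues on the following lines.
match($0, /<context-root>/) {
    t = substr($0, RSTART + RLENGTH)
    if (length(t)) print t
    p = 1
}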
Another version, which trims each extracted value rather than the whole line:
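#!/usr/bin/awk -f
# Return t with leading and trailing whitespace (including CR) removed.
function strip(t) {
    gsub(/^[ \t]+|[ \t\r]+$/, "", t)
    return t
}
# Closing tag of an element opened on an earlier line: print the text before it.
match($0, /^[^<]*<\/context-root>/) {
    if (p) {
        t = strip(substr($0, 1, index($0, "</context-root>") - 1))
        if (length(t)) print t
    }
    $0 = substr($0, RSTART + RLENGTH)
    p = 0
}
# Elements opened and closed on the same line.
{
    while (match($0, /<context-root>[^<]*<\/context-root>/)) {
        t = substr($0, RSTART, RLENGTH)
        gsub(/<\/?context-root>/, "", t)
        t = strip(t)
        if (length(t)) print t
        $0 = substr($0, RSTART + RLENGTH)
    }
}
# A line that lies entirely inside an element.
p {
    print strip($0)
}
# Opening tag whose element continues on the following lines.
match($0, /<context-root>/) {
    t = strip(substr($0, RSTART + RLENGTH))
    if (length(t)) print t
    p = 1
}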
Input:
<context-root>
A B
</context-root>
<context-root>C D</context-root><context-root>E F</context-root><context-root>G H
I J</context-root>
Output:
A B
C D
E F
G H
I J
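For comparison, the sketch below feeds this same input to the short one-liner from the top of this answer. It is only meant to show why the longer scripts exist: the one-liner assumes each tag sits on its own line, so it picks up just the first element. The wrapper name `one_liner` and the temporary file are assumptions made for this example.

```shell
#!/bin/sh
# Recreate the multi-element sample input in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<context-root>
A B
</context-root>
<context-root>C D</context-root><context-root>E F</context-root><context-root>G H
I J</context-root>
EOF

# one_liner is a hypothetical wrapper name, used here only for illustration.
# It runs the short awk command unchanged.
one_liner() {
    awk '{ gsub(/^[ \t]+|[ \t\r]+$/, ""); } /<\/context-root>/ { p = 0 }; p; /<context-root>/ { p = 1 }' "$1"
}

one_liner "$sample"   # prints only: A B
rm -f "$sample"
```

Lines that hold a tag are never printed by the one-liner, so text that shares a line with a tag (`C D`, `E F`, `G H`, `I J`) is dropped. The script versions handle those cases by cutting the text out of the tag lines explicitly.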