Extracting multiple attributes of an XML tag line by line with sed

Date: 2019-01-16 17:56:06

Tags: linux awk sed grep

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE nmaprun>
<?xml-stylesheet href="file:///usr/bin/../share/nmap/nmap.xsl" type="text/xsl"?>
<taskprogress task="Service scan" time="1547503455" percent="88.24" remaining="2" etc="1547503456"/>
<host starttime="1547503444" endtime="1547503476"><status state="up" reason="arp-response" reason_ttl="0"/>
<address addr="0.0.0.0" addrtype="ipv4"/>
<address addr="08:00:27:7F:02:62" addrtype="mac" vendor="Oracle"/>
<hostnames>
</hostnames>
<ports><port protocol="tcp"><state state="open" reason="syn-ack"/><service product="prod1" version="3.0.2" ostype="Unix" method="probed" conf="10"><cpe>cpe:/a:vsftpd:vsftpd:3.0.2</cpe></service><script id="banner" output="220 (vsFTPd 3.0.2)"/></port>
<port protocol="tcp"><state state="open" reason="syn-ack" reason_ttl="64"/><service product="secure" version="6.6.1p1 Ubuntu 2ubuntu2" extrainfo="Ubuntu Linux; protocol 2.0" ostype="Linux" method="probed" conf="10"><cpe>cpe:/a:openbsd:openssh:6.6.1p1</cpe><cpe>cpe:/o:linux:linux_kernel</cpe></service><script id="banner" output="SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2"/></port>
<port protocol="tcp"><state state="open" reason="syn-ack" reason_ttl="64"/><service product="hello i am here" hostname=" typhoon" method="probed" conf="10"><cpe>cpe:/a:postfix:postfix</cpe></service><script id="banner" output="220 typhoon ESMTP Postfix (Ubuntu)"/></port>
<port protocol="tcp"><state state="open" reason="syn-ack" reason_ttl="64"/><service product="who am i" version="9.9.5-3" extrainfo="Ubuntu Linux" ostype="Linux" method="probed" conf="10"><cpe>cpe:/a:isc:bind:9.9.5-3</cpe><cpe>cpe:/o:linux:linux_kernel</cpe></service></port>
</ports>

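I want to search for the string 'state="open"' and then print the values of the product and version attributes found on that line (if there is no version, print only the product value).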

I used the following sed command:

cat sample.xml | grep 'state="open"' | egrep -o 'product=".*"' | sed -nE 's/^.*product="([^"]*)".*version="([^"]*)".*$/\1, \2/;p' > output.txt

The output I get:

prod1, 3.0.2
secure, 6.6.1p1 Ubuntu 2ubuntu2
<port protocol="tcp"><state state="open" reason="syn-ack" reason_ttl="64"/><service product="hello i am here" hostname=" typhoon" method="probed" conf="10"><cpe>cpe:/a:postfix:postfix</cpe></service><script id="banner" output="220 typhoon ESMTP Postfix (Ubuntu)"/></port>
who am i, 9.9.5-3

The output I want:

prod1, 3.0.2
secure, 6.6.1p1 Ubuntu 2ubuntu2
hello i am here
who am i, 9.9.5-3

Note: when the version attribute is missing, the whole line is printed instead. I would really appreciate any help. Thanks!

1 answer:

Answer 0 (score: 0)

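A single awk command is enough (GNU awk is required, because the three-argument form of match() is a gawk extension):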

$ awk '/state="open"/{match($0, /product="([^"]*)"/, p); match($0, /version="([^"]*)"/, v); if (p[1]) {printf "%s", p[1]; if (v[1]) printf ", %s", v[1];} print "";}' sample.xml
prod1, 3.0.2
secure, 6.6.1p1 Ubuntu 2ubuntu2
hello i am here
who am i, 9.9.5-3

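Note that if a line has no product, only an empty line is printed, even when a version is present; the intended behavior for that case could not be inferred from your command. Adjust as needed.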
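
For readers who want to stay with sed, here is a minimal sketch of one way to get the same result. The original attempt prints the whole line because `s/.../.../;p` prints the pattern space whether or not the substitution matched. The sketch tries "product plus version" first and falls back to "product only". The function name `extract_open_services` is hypothetical. It also assumes that `product` comes before `version` inside the same `<service ...>` tag and that attribute values contain no `>` character, which holds for the sample above but is not guaranteed for every nmap report.

```shell
#!/bin/sh
# Hypothetical helper (not from the original thread): for each line that
# contains state="open", print "product, version", or only "product" when
# the <service> tag has no version attribute.
# Assumes product precedes version within the tag and values contain no '>'.
extract_open_services() {
    sed -nE '
        /state="open"/!d
        s/^.*<service[^>]* product="([^"]*)"[^>]* version="([^"]*)".*$/\1, \2/p
        t
        s/^.*<service[^>]* product="([^"]*)".*$/\1/p
    ' "$1"
}
```

Calling `extract_open_services sample.xml > output.txt` would replace the whole grep/egrep/sed pipeline from the question. The bare `t` skips the fallback substitution once the first one has succeeded, so each matching line is printed at most once. For anything beyond a quick one-off, a real XML parser is a safer choice than line-oriented regular expressions.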