I wrote a small bash script that reads some HTML and should print the href of a link:
#!/bin/bash
link=$(echo $source | sed -ne 's#^.*<a href="\([^"]*\)".*$#\1#p')
if [ "$(echo "$link" | grep '/fonts/list/style')" ]
then
echo "http://www.domain.com$link/10000"
fi
The variable source in my case contains:
<li><span>19</span><a href="/fonts/list/style/home words">linktext</a></li>
Problem: the script does not print
http://www.domain.com/fonts/list/style/home words/10000
Instead it prints
http://www.domain.com/fonts/list/style/home
words/10000
How can I remove or avoid this line break?
Answer 0: (score: 0)
You have to escape the " characters that appear in the <li>... string:
This works for me:
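#!/bin/bash
# The double quotes inside the HTML are escaped so they do not end the string.
source="<li><span>19</span><a href=\"/fonts/list/style/home words\">linktext</a></li>"
# Quoting "$source" passes the string to sed unchanged, with no word splitting.
link=$(echo "$source" | sed -ne 's#^.*<a href="\([^"]*\)".*$#\1#p')
if [ "$(echo "$link" | grep '/fonts/list/style')" ]
then
    echo "http://www.domain.com$link/10000"
fi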
Output:
http://www.domain.com/fonts/list/style/home words/10000
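
As a variation on the script above, here is a minimal sketch that assumes the HTML is held in a shell variable. It uses single quotes for the assignment, so the double quotes in the HTML need no escaping, and it quotes every expansion so the space in the href is never treated as a word separator. The function name extract_href and the variable url are hypothetical names introduced for illustration; they do not appear in the original post.

```bash
#!/bin/bash

# Hypothetical helper: print the href of the anchor found in the HTML
# passed as the first argument. The sed expression is the one from the post.
extract_href() {
    printf '%s\n' "$1" | sed -ne 's#^.*<a href="\([^"]*\)".*$#\1#p'
}

# Single quotes keep the inner double quotes literal, so no backslashes are needed.
source='<li><span>19</span><a href="/fonts/list/style/home words">linktext</a></li>'

# Quoting "$source" hands the whole string to the function as one argument.
link=$(extract_href "$source")

# A case pattern replaces the echo | grep test from the post.
case $link in
    */fonts/list/style*)
        url="http://www.domain.com$link/10000"
        printf '%s\n' "$url"
        ;;
esac
```

The case statement is only one possible choice; the original grep test would work equally well here as long as "$link" stays quoted.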