On the Linux command line, I am trying to search some HTML code and print only the dynamic part of it. For example, given this code:
<p><span class="RightSideLinks">Tel: 090 97543</span></p>
I only want to print 97543, not 090. The next time I search the file, the code may have changed to:
<p><span class="RightSideLinks">Tel: 081 82827</span></p>
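and then I only want 82827. The rest of the code stays the same; only the phone number changes.
Can I do this with grep? Thanks.
Edit:
Is it possible to use this on the following code as well?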
<tr class="patFuncEntry"><td align="left" class="patFuncMark"><input type="checkbox" name="renew0" id="renew0" value="i1061700" /></td><td align="left" class="patFuncTitle"><label for="renew0"><a href="/record=p1234567~S0"> I just want to print this part. </a></label>
The parts that change are the record number (p1234567~S0)
and the text I want to print.
Answer 0 (score: 1)
One way, using GNU grep:
grep -oP '(?<=Tel: .{3} )[^<]+' file.txt
Example contents of file.txt:
<p><span class="RightSideLinks">Tel: 090 97543</span></p>
<p><span class="RightSideLinks">Tel: 081 82827</span></p>
Results:
97543
82827
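To try this end to end, here is a minimal sketch that recreates the sample file and runs a digits-only variant of the pattern. It assumes GNU grep built with PCRE support (needed for '-P'); the file name tel_sample.html and the variable name numbers are made up for illustration.

```shell
# Hypothetical sample file containing the two lines from the question.
cat > tel_sample.html <<'EOF'
<p><span class="RightSideLinks">Tel: 090 97543</span></p>
<p><span class="RightSideLinks">Tel: 081 82827</span></p>
EOF

# -o prints only the matched text; -P enables Perl-compatible regexes.
# The lookbehind requires 'Tel: ', three digits and a space just before the match.
numbers=$(grep -oP '(?<=Tel: [0-9]{3} )[0-9]+' tel_sample.html)
printf '%s\n' "$numbers"
```

Restricting both parts to [0-9] means a line whose prefix or number contains anything other than digits is skipped instead of printed.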
Edit:
(?<=Tel: .{3} ) ## This is a positive lookbehind assertion. It is only
## understood when grep's Perl regexp flag, '-P', is used.
Tel: .{3} ## This is what the lookbehind actually checks for: the phrase
## 'Tel: ' followed by any three characters, followed by a
## space. Since we're searching only for numbers, you could write
## 'Tel: [0-9]{3} ' instead.
[^<]+ ## Grep's '-o' flag prints only the matched text rather than
## the whole line, so this expression returns one or more
## characters other than '<'.
Putting it all together, we're asking grep to print one or more characters other
than '<', provided that 'Tel: .{3} ' appears immediately before them. HTH.
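For the table row added in the question's edit, the same idea might carry over. The sketch below is one possible approach, not something tested against the real page: it assumes the link always starts with <a href="/record=, and it uses PCRE's \K (which drops everything matched so far from the output) because a lookbehind cannot contain a variable-length record number. The file name sample_row.html and the variable name title are hypothetical.

```shell
# Hypothetical sample file holding the row from the question's edit.
cat > sample_row.html <<'EOF'
<tr class="patFuncEntry"><td align="left" class="patFuncMark"><input type="checkbox" name="renew0" id="renew0" value="i1061700" /></td><td align="left" class="patFuncTitle"><label for="renew0"><a href="/record=p1234567~S0"> I just want to print this part. </a></label>
EOF

# [^"]* skips the changing record number; \s* skips leading whitespace.
# \K resets the match start, so only the link text is printed.
# The final [^<\s] forces the match to end on a non-space character.
title=$(grep -oP '<a href="/record=[^"]*">\s*\K[^<]*[^<\s]' sample_row.html)
printf '%s\n' "$title"
```

If the markup varies more than this (extra attributes, line breaks inside the tag), a real HTML parser is likely a safer choice than grep.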