The input is:
<h1>This is heading 1</h1>
<h2>This is heading 2</h2>
<h3>This is heading 3</h3>
<h4>This is heading 4</h4>
<h5>This is heading 5</h5>
<h6>This is heading 6</h6>
</body>
</html>
Expected output:
This is heading 1
This is heading 2
This is heading 3
This is heading 4
This is heading 5
This is heading 6
I tried sed -n 's/<[^>].*>//gp' example.html
but nothing shows up on screen, so the regular expression seems to be wrong.
Answer 0 (score: 0)
grep with the -P option should be enough.
$ grep -oP '(?<=>)(.[^<]+)(?=<)' file
This is heading 1
This is heading 2
This is heading 3
This is heading 4
This is heading 5
This is heading 6
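As a rough illustration of how the lookarounds in that pattern behave, here is a minimal sketch on a single line taken from the question. It assumes a grep built with PCRE support (GNU grep's -P); the variable names `line` and `text` are only for illustration.

```shell
# One sample line from the question
line='<h3>This is heading 3</h3>'

# (?<=>) requires a '>' immediately before the match and (?=<) a '<'
# immediately after it; neither character is part of what -o prints.
text=$(printf '%s\n' "$line" | grep -oP '(?<=>)(.[^<]+)(?=<)')

printf '%s\n' "$text"
```

Note that `.[^<]+` needs at least two characters between the tags, so a one-character heading would not be matched by this pattern.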
Answer 1 (score: -1)
sed -n 's/<[^>]*>//gp' test.csv | sed '/^$/d'
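You were almost there. The dot (.) in your pattern can also match the ">" character, so the greedy .* runs to the last ">" on the line and the whole line is deleted. Remove the dot from the command.
The command after the pipe deletes all empty lines.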
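To make the difference concrete, here is a small sketch comparing the two patterns on one line from the question. The variable names (`line`, `greedy`, `fixed`) are hypothetical and used only for this demonstration.

```shell
# One sample line from the question
line='<h1>This is heading 1</h1>'

# Original attempt: '.*' is greedy and extends to the last '>' on the line,
# so the entire line is removed and only an empty line is printed.
greedy=$(printf '%s\n' "$line" | sed -n 's/<[^>].*>//gp')

# Corrected pattern: '[^>]*' cannot cross a '>', so each tag is removed
# separately and the text between the tags survives.
fixed=$(printf '%s\n' "$line" | sed -n 's/<[^>]*>//gp')

printf 'greedy: [%s]\n' "$greedy"
printf 'fixed:  [%s]\n' "$fixed"
```

This also explains the symptom in the question: the original command does print something, but only empty lines.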
Answer 2 (score: -1)
Working from your sample:
sed -n 's|</\{0,1\}h[0-9]>||gp' YourFile
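This removes any opening or closing heading tag on the line and prints the line only if a substitution was made.
A more precise version, anchored to the line (assuming the opening and closing tags are on the same line):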
sed -n 's|^[[:space:]]*<\(h[0-9]>\)\(.*\)</\1|\2|p' YourFile
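A quick way to check the anchored version is to run it on a heading line and on a non-heading line. The sample text below is hypothetical and not part of the original file.

```shell
# Hypothetical sample: an indented heading line and a non-heading line
sample='  <h2>This is heading 2</h2>
<p>Not a heading</p>'

# \1 refers back to the captured 'hN>' so the closing tag must use the
# same heading level as the opening tag; \2 is the text between them.
result=$(printf '%s\n' "$sample" | sed -n 's|^[[:space:]]*<\(h[0-9]>\)\(.*\)</\1|\2|p')

printf '%s\n' "$result"
```

Only the heading line is printed, with its leading whitespace and tags stripped; the `<p>` line does not match and is suppressed by -n.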