I have to process a very large amount of HTML text for epub conversion, and every "automated" solution I have found and tried is far from satisfactory.
So I was thinking of a regex-based batch command, but I know too little about regex to make it work, especially given that the tags may be nested. Can anybody help, or point me to a surefire solution?
Thanks in advance!
Answer 0 (score: 0)
The best solution is to use an HTML parser.
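As a rough illustration of why a parser copes with nesting, here is a minimal sketch using Python's standard `html.parser` module. The question does not say which transformation is needed, so this only records how deeply each `a`, `b`, `i` or `p` tag is nested; the names `TagDepthLogger` and `log_tag_depths` are made up for this example.

```python
from html.parser import HTMLParser


class TagDepthLogger(HTMLParser):
    """Hypothetical helper: records (tag, depth) for each opening a/b/i/p tag."""

    TRACKED = {"a", "b", "i", "p"}

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open tracked tags, outermost first
        self.seen = []   # (tag, nesting depth) in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self.seen.append((tag, len(self.stack)))
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop only when the closing tag matches the innermost open tracked tag.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()


def log_tag_depths(html):
    parser = TagDepthLogger()
    parser.feed(html)
    parser.close()
    return parser.seen


sample = '<p>one <b>two <i>three</i></b></p><p class="x">four</p>'
print(log_tag_depths(sample))
```

The same callbacks are where the actual cleanup logic would go, whatever it turns out to be; the parser takes care of attributes and nesting so the code does not have to.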
For simple cases, you can try the following regex: <[abip]>[^<>]*<\/[abip]>|<[abip][^<>]*\/>
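To see what "simple cases" covers, the pattern can be tried on a small sample. This is only a sketch in Python: the helper name `find_simple_tags` and the sample string are assumptions, and the pattern is copied unchanged from the answer.

```python
import re

# Pattern copied from the answer; the \/ escapes are kept as written,
# although Python does not require them.
PATTERN = re.compile(r"<[abip]>[^<>]*<\/[abip]>|<[abip][^<>]*\/>")


def find_simple_tags(html):
    """Return every substring the pattern matches (hypothetical helper)."""
    return PATTERN.findall(html)


sample = '<p>plain</p><p><b>bold</b> tail</p><p class="x">attr</p><br/>'
for match in find_simple_tags(sample):
    print(match)
```

On this sample it matches `<p>plain</p>`, the inner `<b>bold</b>` and `<br/>`. It skips the outer `<p>` that contains a nested tag and the `<p class="x">` that carries an attribute. The second alternative matches any self-closing tag whose name merely starts with a, b, i or p, which is why `<br/>` is picked up. The first alternative also does not check that the opening and closing tag names agree.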