Question

I am trying to extract content that sits outside two sets of HTML tags.

The HTML is structured as follows:

<div class="col-md-4 col-sm-6 col-lg-3">
    <small class="text-muted pull-right">4.4</small>
    <i class="custom-icon"></i>
    desired content to retrieve
    <span class="text-muted">some other text here</span>
</div>

I need to retrieve the content "desired content to retrieve", which comes after </i> and before <span class="text-muted">.

I tried:

$custom_regex= '#</i>(.*?)<span class="text-muted">#';

$text_scan = preg_match_all( $custom_regex, $content_to_scan, $text_array );

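It did not work. The $text_array variable comes back empty.

I am not very good with regular expressions, so perhaps my pattern is not right for what I am after.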

Answer 1

Wouldn't it be better to use lookarounds?

Demo: https://regex101.com/r/zK2wD8/8
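
For reference, a lookaround-based pattern might look like the PHP sketch below. The pattern is an assumption written for illustration and is not taken from the linked demo. It adds the `s` modifier so that `.` also matches newlines; without it, `.` stops at a line break, which is the likely reason the attempt in the question matched nothing, since the target text sits on its own line.

```php
<?php
// Sample markup copied from the question.
$content_to_scan = <<<HTML
<div class="col-md-4 col-sm-6 col-lg-3">
    <small class="text-muted pull-right">4.4</small>
    <i class="custom-icon"></i>
    desired content to retrieve
    <span class="text-muted">some other text here</span>
</div>
HTML;

// Assumed pattern (illustrative only):
//   (?<=</i>)                   lookbehind: match must start right after </i>
//   \s*(.*?)\s*                 capture the text, leaving surrounding whitespace out
//   (?=<span class="text-muted">) lookahead: match must be followed by the span
//   s modifier                  lets "." match newlines
$custom_regex = '#(?<=</i>)\s*(.*?)\s*(?=<span class="text-muted">)#s';

$text_scan = preg_match_all($custom_regex, $content_to_scan, $text_array);

// Group 1 holds the captured text for each match.
$results = $text_array[1];
print_r($results);
```

The `#` delimiter avoids having to escape the `/` in `</i>`, and the captured text is read from `$text_array[1]` rather than `$text_array[0]`, which holds the full match including whitespace.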

Answer 2

If you insist on using a regular expression, try this.

/<\/i>\s*(.*?)\n.*<span class="text-muted"/g
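
The pattern above is written with a `g` flag, which PHP's preg functions do not accept as a modifier; `preg_match_all` already returns every match on its own. A minimal PHP sketch of the same pattern without the flag, assuming the markup from the question and `\n` line endings:

```php
<?php
// Sample markup copied from the question.
$html = <<<HTML
<div class="col-md-4 col-sm-6 col-lg-3">
    <small class="text-muted pull-right">4.4</small>
    <i class="custom-icon"></i>
    desired content to retrieve
    <span class="text-muted">some other text here</span>
</div>
HTML;

// Same pattern as in the answer, minus the "g" flag.
// \s* skips the line break and indentation after </i>,
// (.*?) captures up to the end of that line,
// \n.* then consumes the line break and indentation before the span.
$pattern = '/<\/i>\s*(.*?)\n.*<span class="text-muted"/';

$count = preg_match_all($pattern, $html, $matches);

// trim() guards against a stray "\r" if the input uses Windows line endings.
$found = array_map('trim', $matches[1]);
print_r($found);
```

This version relies on the wanted text occupying exactly one line, so it is more sensitive to changes in the markup's layout than a pattern using the `s` modifier.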
