Question

I want to parse an HTML file and extract the links from its <a> tags. For example, I am trying to extract the link from the following <a> tag.

<a class="thumb vtop inlblk rel tdnone linkWithHash scale5 detailsLink" href="http://olx.com.pk/item/honda-civic-exi-2005-IDSkzkt.html#6256e9ac30" title=""> <img class="fleft" src="http://img03.olx.com.pk/images_olxpk/89491775_1_144x108_honda-civic-exi-2005-lahore_rev001.jpg" alt="Honda Civic Exi 2005"> </a>

I am using the following regular expression:

private const string _LINK_REGEX = "href=\"[a-zA-Z./:&\\d_-]+\"";

But it does not extract this URL.

Answer 1

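Your character class [a-zA-Z./:&\\d_-] does not include #, but the URL contains one (...html#6256e9ac30), so the pattern cannot match all the way to the closing quote. Instead, you can match any run of characters other than a double quote: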

href=\"[^\"]+\"
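
As a rough sketch of how the pattern above might be used from C#, the example below wraps the URL part in a capture group so that only the link itself is returned, not the surrounding href="...". The class name LinkExtractor, the method ExtractHrefs, and the shortened sample HTML are made up for illustration and are not from the question. It assumes double-quoted attribute values, and it will also pick up href attributes on tags other than <a>.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named here only for illustration.
public static class LinkExtractor
{
    // Same pattern as above, with a capture group around the URL.
    private const string LinkRegex = "href=\"([^\"]+)\"";

    // Returns the value of every double-quoted href attribute in the text.
    public static List<string> ExtractHrefs(string html)
    {
        var links = new List<string>();
        foreach (Match m in Regex.Matches(html, LinkRegex))
        {
            // Groups[1] is the text between the quotes.
            links.Add(m.Groups[1].Value);
        }
        return links;
    }

    public static void Main()
    {
        // Shortened version of the tag from the question.
        string html = "<a class=\"thumb detailsLink\" "
            + "href=\"http://olx.com.pk/item/honda-civic-exi-2005-IDSkzkt.html#6256e9ac30\" "
            + "title=\"\"> </a>";

        foreach (string link in ExtractHrefs(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

If the fragment after # is not wanted, it can be trimmed from each result afterwards rather than excluded in the pattern.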

