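Hello, I need a regular expression that gets all the links belonging to the local domain, with no external sites. So far I have this, but it only returns external pages: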
<%
function getPage(strURL)
    dim strBody, objXML, status
    set objXML = CreateObject("Msxml2.ServerXMLHTTP.6.0")
    objXML.Open "GET", strURL, False
    'objXML.setRequestHeader "User-Agent", "ddd" '=== spoof the user agent
    'objXML.setRequestHeader "Content-Type", "text/html; Charset:ISO-8859-1"
    'objXML.setRequestHeader "Content-Type", "text/html; Charset:UTF-8"
    objXML.Send
    status = objXML.status
    if err.number <> 0 or status <> 200 then
        if status = 404 then
            Response.Write "[EFERROR]Page does not exist (404)."
        elseif status = 401 then
            Response.Write "[EFERROR]Access denied (401)."
        elseif status >= 500 and status < 600 then
            Response.Write "[EFERROR]500 Internal Server Error on remote site."
        else
            Response.Write "[EFERROR]Server is down or does not exist."
        end if
    end if
    strBody = objXML.responseText
    set objXML = nothing
    getPage = strBody

    'First, create a reg exp object
    Dim objRegExp
    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    objRegExp.Pattern = "<a\s+href=""http://(.*?)"">\s*((\n|.)+?)\s*</a>"

    'Display all of the matches
    Dim objMatch
    For Each objMatch in objRegExp.Execute(strBody)
        Response.Write("http://" & objMatch.SubMatches(0) & "<br>")
    Next
end function

Call getPage("http://www.google.com")
%>
Thanks.
Answer 0 (score: 0)
Maybe I'm stating the obvious, but if you are searching for links within "localdomain.com", wouldn't it be something like this?
objRegExp.Pattern = "<a\s+href=""http://(.*?)localdomain\.com"">\s*((\n|.)+?)\s*</a>"
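As written, the pattern above only matches when the href ends exactly in localdomain.com, so a link such as http://www.localdomain.com/about could be missed. A minimal sketch of a stricter variant follows, assuming double-quoted href attributes that come directly after `<a`. The sample strBody and the host name "localdomain.com" are placeholders for illustration, not taken from the original post.

```vbscript
' Sketch only: sample markup standing in for the fetched page.
Dim strBody, re, m
strBody = "<a href=""http://www.localdomain.com/about"">About</a>" & _
          "<a href=""http://www.example.org/"">Elsewhere</a>"

Set re = New RegExp
re.IgnoreCase = True
re.Global = True
' (?:[^""/]*\.)?   optional subdomain, kept inside the host part
' localdomain\.com the placeholder host name
' (?:[/?#:][^""]*)? optional port, path, query or fragment up to the closing quote
re.Pattern = "<a\s+href=""(http://(?:[^""/]*\.)?localdomain\.com(?:[/?#:][^""]*)?)"""

' Write every absolute link whose host is localdomain.com or a subdomain of it.
For Each m In re.Execute(strBody)
    Response.Write m.SubMatches(0) & "<br>"
Next
```

Because the whole URL sits in the first capture group, nothing has to be glued back together when printing. Hosts that merely contain the name, such as localdomain.com.example.org, should be rejected by this pattern.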
Edit: The regular expression pattern could perhaps use the passed-in URL, like this:
objRegExp.Pattern = "<a\s+href=""" & strURL & "(.*?)"">\s*((\n|.)+?)\s*</a>"
The retrieved matches then also need strURL prepended:
For Each objMatch in objRegExp.Execute(strBody)
    'strURL already starts with "http://", so it is not added a second time
    Response.Write(strURL & objMatch.SubMatches(0) & "<br>")
Next
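Concatenating strURL straight into the pattern treats its dots as regex wildcards, and links written as root-relative paths (href="/page") belong to the local site too. The sketch below addresses both under some assumptions: strURL has no trailing slash, href values are double-quoted, and href is the first attribute of the tag. EscapeForRegExp and WriteLocalLinks are hypothetical helper names, not part of the original post.

```vbscript
' Hypothetical helper: put a backslash in front of regex metacharacters
' so that the URL is matched literally.
Function EscapeForRegExp(ByVal text)
    Dim re
    Set re = New RegExp
    re.Global = True
    re.Pattern = "([.*+?^${}()|\[\]\\])"
    EscapeForRegExp = re.Replace(text, "\$1")
End Function

' Hypothetical helper: write every link in strBody that points at strURL's site.
' Assumes strURL looks like "http://www.example.com" (no trailing slash).
Sub WriteLocalLinks(ByVal strURL, ByVal strBody)
    Dim objRegExp, objMatch, href
    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    ' Alternative 1: absolute link that starts with strURL.
    ' Alternative 2: root-relative link ("/path"), excluding
    '                protocol-relative links ("//otherhost/...").
    objRegExp.Pattern = "<a\s+href=""(" & EscapeForRegExp(strURL) & _
                        "(?:[/?#][^""]*)?|/(?!/)[^""]*)"""
    For Each objMatch In objRegExp.Execute(strBody)
        href = objMatch.SubMatches(0)
        ' Turn root-relative links into absolute ones before printing.
        If Left(href, 1) = "/" Then href = strURL & href
        Response.Write href & "<br>"
    Next
End Sub
```

Inside getPage this could be called as `WriteLocalLinks strURL, strBody` in place of the original loop. Document-relative links such as href="page.html" are not covered here; a regex over raw HTML is only an approximation, and an HTML parser would be more robust if one is available.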