OK, we have a web application with the following URL:
mydomain.com/#article;articleID=1
Now we have a servlet filter mapped to mydomain.com/MyFilter:
public class CrawlServlet implements Filter {
    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {
        // TODO Auto-generated method stub
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        String fullURLQueryString = getFullURL(httpRequest);
        // here we can read mydomain.com/#article;articleID=1
        // if we open this mydomain.com/#article;articleID=1 we can see the article data that was taken from DB
        // can we somehow capture that article data?
    }
}
Is it possible to achieve this?
I want to do this because I want to show the data to bot crawlers so that they can index my pages.
Answer 0 (score: 0)
Yes, use URL and URLConnection:
URL url = new URL(fullURLQueryString);
URLConnection connection = url.openConnection();
InputStream in = connection.getInputStream();
Then read the page from in.
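Reading from the stream could look roughly like the sketch below. The class name PageFetcher and the helpers readAll and fetch are hypothetical names made up for this illustration, and the UTF-8 encoding and 5-second timeouts are assumptions. Note also that URLConnection does not execute JavaScript and that the part of a URL after # is not sent to the server, so this only returns the article data if the server itself renders it for the requested URL.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original question or answer.
class PageFetcher {

    // Reads every byte from the stream and decodes it as UTF-8 (assumed encoding).
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // Opens the URL and returns the response body as text.
    static String fetch(String pageUrl) throws IOException {
        URLConnection connection = new URL(pageUrl).openConnection();
        connection.setConnectTimeout(5000); // assumed timeout, in milliseconds
        connection.setReadTimeout(5000);    // assumed timeout, in milliseconds
        // try-with-resources closes the stream even if reading fails
        try (InputStream in = connection.getInputStream()) {
            return readAll(in);
        }
    }
}
```

Inside doFilter, a call such as PageFetcher.fetch(fullURLQueryString) would then give the page text as a String, which the filter could write to the response for crawler requests.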