
URL prefix in a website

Time: 2014-01-26 14:14:31

Tags: html

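I am confused by the following situation.

I downloaded the page http://tukaani.org/xz/format.html and want to follow the hrefs in its HTML. However, a link such as <a href="xz-file-format-1.0.4.txt"> points to http://tukaani.org/xz/xz-file-format-1.0.4.txt rather than http://tukaani.org/xz-file-format-1.0.4.txt.

How can I get the prefix of the URL? It is not the hostname, and I do not see a base either. I can't find anything useful in the headers: none of them contains a string like 'http://tukaani.org/xz/'. Yet every browser knows where the link goes.

What is the internal mechanism? How can I get the prefix 'http://tukaani.org/xz/' from 'http://tukaani.org/xz/format.html' using wget, curl, or perl?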

1 answer:

Answer 0: (score: 0)

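What is happening here is that there are two kinds of links: absolute and relative.

The one you mention is relative, but relative to what? The answer: relative to the URL of the current page. So when you visit http://example.com/xz/format.html, the base URL is http://example.com/xz/

If you are visiting http://example.com/xz/another-sublevel/foo.html, the "base" will be http://example.com/xz/another-sublevel/

As you may have noticed, this works like a folder/file structure: a relative link builds its URL starting from the "folder" of the current URL, that is, everything up to and including the last / in the path.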
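As a rough illustration of the rule above, here is a minimal Python sketch using the standard library's `urllib.parse.urljoin`. Python is used only for illustration (the question asked about wget, curl, or perl), the variable names are made up for this example, and it assumes the page contains no `<base href>` element, which would override the base.

```python
from urllib.parse import urljoin

# URL of the page that was downloaded (taken from the question).
page_url = "http://tukaani.org/xz/format.html"

# Joining with "." drops the last path segment and keeps the "folder".
base = urljoin(page_url, ".")

# Resolve the relative href from the question against the page URL.
resolved = urljoin(page_url, "xz-file-format-1.0.4.txt")

print(base)      # http://tukaani.org/xz/
print(resolved)  # http://tukaani.org/xz/xz-file-format-1.0.4.txt
```

Note that the page URL itself is all that is needed: the prefix is derived from it, which is why no header carries it.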

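A link that starts with / (the so-called "root" of the folder tree) is resolved from the root of the site instead, so <a href="/xz-file-format-1.0.4.txt"> goes to http://example.com/xz-file-format-1.0.4.txt from any page on that site. Strictly speaking, such a link is root-relative; a fully absolute link also spells out the scheme and host, as in http://example.com/xz-file-format-1.0.4.txt.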
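The difference between the link kinds can be checked with a short sketch, again using Python's `urllib.parse.urljoin` purely for illustration. The page URL is the answer's example, and the variable names are hypothetical.

```python
from urllib.parse import urljoin

# Example page from the answer, two "folders" deep.
page_url = "http://example.com/xz/another-sublevel/foo.html"

# Plain relative href: resolved against the page's own folder.
relative = urljoin(page_url, "xz-file-format-1.0.4.txt")

# Href starting with "/": resolved against the site root.
root_relative = urljoin(page_url, "/xz-file-format-1.0.4.txt")

# Href with scheme and host: used as-is, the page URL is ignored.
absolute = urljoin(page_url, "http://tukaani.org/xz/format.html")

print(relative)
print(root_relative)
print(absolute)
```

Only the first form depends on which folder the current page lives in; the second depends only on the host, and the third on nothing at all.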