Question

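I searched Google, Bing, etc. and found some of my directories listed that I would rather the world not see. How do I prevent them from crawling these pages/directories? Also, how do I remove the entries that are already there?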

Answer 1

Well-behaved web crawlers (Google, Bing, Yahoo, Baidu, etc.) will respect your robots.txt file. An example from the very useful http://www.robotstxt.org/:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
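
If you want to check how a crawler would read those rules before you publish them, here is a minimal sketch using Python's standard urllib.robotparser. The host example.com, the user agent name ExampleBot, and the helper is_allowed are made up for illustration, and real crawlers may interpret edge cases a little differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, held in a string so nothing is fetched.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if the rules above let user_agent fetch path.

    "ExampleBot" and example.com are placeholders, not real crawler details.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "http://example.com" + path)


if __name__ == "__main__":
    for path in ("/tmp/report.html", "/cgi-bin/form.cgi", "/index.html"):
        print(path, "allowed" if is_allowed(path) else "disallowed")
```

Keep in mind this only tells you what a polite crawler will skip; the file itself is publicly readable.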

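Of course, if you really want to restrict access to private content, you would be better served by your web server's authentication and authorization tools, or by restricting access by address. robots.txt only asks crawlers to stay away; it does not stop anyone from requesting the pages.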

Answer 2

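Most search engines check for a robots.txt file before they start crawling your site. If you don't want them to crawl certain directories, create a robots.txt file in the root directory of your site and add this to it: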

User-agent: *
Disallow: /my_private_dir
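
One thing to watch with a rule like this: Disallow values are matched as path prefixes, so leaving off the trailing slash also covers sibling paths that merely start with the same text. A rough sketch of the difference, again with Python's urllib.robotparser; the paths, the host example.com, the agent name ExampleBot, and the helper blocked_paths are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical paths on a hypothetical site, for illustration only.
PATHS = [
    "/my_private_dir/notes.txt",
    "/my_private_dir_old/a.html",
    "/public/index.html",
]


def blocked_paths(rule, paths):
    """Return the paths that a single 'Disallow: <rule>' line blocks."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", "Disallow: " + rule])
    return [
        p for p in paths
        if not parser.can_fetch("ExampleBot", "http://example.com" + p)
    ]


if __name__ == "__main__":
    # Without the trailing slash, /my_private_dir_old/ is blocked as well.
    print(blocked_paths("/my_private_dir", PATHS))
    # With the trailing slash, only the directory itself is blocked.
    print(blocked_paths("/my_private_dir/", PATHS))
```

If you only mean the directory, Disallow: /my_private_dir/ is the narrower rule.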

If you want a sample robots.txt file, here is Stack Overflow's.