I am using WordPress. One of its files, functions.php, contains function do_robots() {..., which blocks Google from crawling the site. I replaced that function with the following:
function do_robots() {
    header( 'Content-Type: text/plain; charset=utf-8' );
    do_action( 'do_robotstxt' );
    if ( '0' == get_option( 'blog_public' ) ) {
        echo "User-agent: *";
        echo "\nDisallow: /wp-admin";
        echo "\nDisallow: /wp-includes";
        echo "\nDisallow: /wp-content";
        echo "\nDisallow: /stylesheets";
        echo "\nDisallow: /_db_backups";
        echo "\nDisallow: /cgi";
        echo "\nDisallow: /store";
        echo "\nDisallow: /wp-includes\n";
    } else {
        echo "User-agent: *";
        echo "\nDisallow: /wp-admin";
        echo "\nDisallow: /wp-includes";
        echo "\nDisallow: /wp-content";
        echo "\nDisallow: /stylesheets";
        echo "\nDisallow: /_db_backups";
        echo "\nDisallow: /cgi";
        echo "\nDisallow: /store";
        echo "\nDisallow: /wp-includes\n";
    }
}
I am not sure about Allow. As long as I do not Disallow a path, is it allowed by default?
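Answer 0 (score: 1)
The original function in SVN appears to block fewer paths than the example above, so I suggest removing some of the extra directories (for example wp-content) and checking whether that gives you what you are looking for. You could also try a WordPress plugin that generates a Google Sitemap for the search engine to read.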
function do_robots() {
    header( 'Content-Type: text/plain; charset=utf-8' );
    do_action( 'do_robotstxt' );
    $output = "User-agent: *\n";
    $public = get_option( 'blog_public' );
    if ( '0' == $public ) {
        $output .= "Disallow: /\n";
    } else {
        $site_url = parse_url( site_url() );
        $path = ( !empty( $site_url['path'] ) ) ? $site_url['path'] : '';
        $output .= "Disallow: $path/wp-admin/\n";
        $output .= "Disallow: $path/wp-includes/\n";
    }
    echo apply_filters('robots_txt', $output, $public);
}
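The rule for robots.txt files is that everything is allowed unless stated otherwise. Keep in mind that robots.txt works on trust: it only restrains search engines that choose to obey it.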
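
To check the "allowed unless disallowed" rule without waiting for a crawler, here is a minimal sketch using Python's standard-library urllib.robotparser. It is only an illustration, not part of WordPress: the helper name is_allowed, the sample paths, and the assumption that WordPress is installed at the site root (so $path is empty) are all made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Rules equivalent to the else-branch of the do_robots() shown above,
# ASSUMING WordPress lives at the site root (empty $path).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""


def is_allowed(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Sample paths are invented; only the first matches a Disallow rule.
    for path in ("/wp-admin/options.php", "/wp-content/uploads/logo.png", "/"):
        print(path, is_allowed(ROBOTS_TXT, path))
```

Under these assumptions, only the path under /wp-admin/ should be reported as blocked. Paths with no matching Disallow line, such as those under /wp-content/, are allowed without any explicit Allow line. Replacing the rules with a single "Disallow: /" blocks everything, which is what the function emits when blog_public is '0'.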