如果我理解正确的话,我可以设置nginx来处理爬虫(而不是nodejs)。所以我从快速配置中删除了app.use(require('prerender-node').set('prerenderToken', 'token'))
并进行了以下nginx设置(我不使用prerender令牌):
# Proxy / load balance (if more than one node.js server used) traffic to our node.js instances
upstream my_server_upstream {
server 127.0.0.1:9000;
keepalive 64;
}
server {
listen 80;
server_name test.local.io;
access_log /var/log/nginx/test_access.log;
error_log /var/log/nginx/test_error.log;
root /var/www/client;
# Static content
location ~ ^/(components/|app/|bower_components/|assets/|robots.txt|humans.txt|favicon.ico) {
root /;
try_files /var/www/.tmp$uri /var/www/client$uri =404;
access_log off;
sendfile off;
}
# Route traffic to node.js for specific route: e.g. /socket.io-client
location ~ ^/(api/|user/|en/user/|ru/user/|auth/|socket.io-client/|sitemap.xml) {
proxy_redirect off;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Host $http_host;
proxy_set_header X-NginX-Proxy true;
proxy_set_header Connection "";
proxy_http_version 1.1;
proxy_pass_header X-CSRFToken;
sendfile off;
# Tells nginx to use the upstream server
proxy_pass http://my_server_upstream;
}
location / {
root /var/www/client;
index index.html;
try_files $uri @prerender;
access_log off;
sendfile off;
}
location @prerender {
set $prerender 0;
if ($http_user_agent ~* "baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator") {
set $prerender 1;
}
if ($args ~ "_escaped_fragment_") {
set $prerender 1;
}
if ($http_user_agent ~ "Prerender") {
set $prerender 0;
}
#resolve using Google's DNS server to force DNS resolution and prevent caching of IPs
resolver 8.8.8.8;
if ($prerender = 1) {
#setting prerender as a variable forces DNS resolution since nginx caches IPs and doesnt play well with load balancing
set $prerender "127.0.0.1:3000";
rewrite .* /$scheme://$host$request_uri? break;
proxy_pass http://$prerender;
}
if ($prerender = 0) {
rewrite .* /index.html$is_args$args break;
}
}
}
但是当我按curl test.local.io?_escaped_fragment_=
测试时,我得到got 504 in 344ms for http://test.local.io
节点版本是6.9.1。我使用vagrant来设置环境。
答案 0 :(得分:0)
以上配置正常。缺少的只是/etc/hosts
:127.0.0.1 test.local.io