Question

I'm using nwjs (version 0.18.8) and making requests to mangafox.me in order to build a manga reader.

When I request a manga image page such as http://mangafox.me/manga/onepunch_man/vTBD/c066/1.html, I get these strange symbols:

{SF 6 W＃Y \ AI（tYdϯM％9 @ CW〜I（V ںʑytk2zoy。^〜wɌeҲ] CKF = v 0 3？y`Y _̘gY|fY \ Q2 M nV iz g b$W _a c C5

How can I fix this?

Answer 1

Never mind x) It turns out the response was simply gzip-compressed. If you run into the same problem, just add gzip: true to the request options. Example:

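const request = require('request');

// gzip: true asks the server for compressed content and decodes the response body
request({url: '*****', gzip: true}, function(error, response, html){
   if (!error && response.statusCode == 200) {
      // Do something with the decoded html
   }
});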
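
To illustrate why the body looked like noise, here is a minimal sketch that uses Node's built-in zlib module to compress a small HTML string and then decode it again. The decodeBody helper and the sample HTML are hypothetical and only stand in for what a real server response might contain; this is not what the request library does internally, just the same idea done by hand.

```javascript
const zlib = require('zlib');

// Hypothetical helper: decode a raw response body based on the value of
// its Content-Encoding header. Only gzip is handled in this sketch.
function decodeBody(rawBuffer, contentEncoding) {
  if (contentEncoding === 'gzip') {
    return zlib.gunzipSync(rawBuffer).toString('utf8');
  }
  return rawBuffer.toString('utf8');
}

// Simulate a server that gzips its HTML before sending it (sample data).
const original = '<html><body>page 1</body></html>';
const compressed = zlib.gzipSync(Buffer.from(original, 'utf8'));

// Reading the compressed bytes directly as text gives unreadable symbols.
const garbled = compressed.toString('utf8');

// Decompressing first gives the HTML back.
const html = decodeBody(compressed, 'gzip');
console.log(html);
```

Reading the compressed bytes as text is roughly what happens when the gzip option is left off, which would explain the symbols in the question.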

Answer 2

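You don't need node.js for something this simple. The easiest way to scrape a website is to load it into a hidden iframe and then loop over whichever of the document's element collections you need.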

The loaded document gives you access to everything, for example:

 Frame.contentWindow.document.forms

 Frame.contentWindow.document.scripts

 Frame.contentWindow.document.styleSheets

 Frame.contentWindow.document.embeds

 Frame.contentWindow.document.cookie

 Frame.contentWindow.document.images

 Frame.contentWindow.document.links

and so on...
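
As a rough sketch of looping over those collections, the snippet below gathers image and link URLs from a document object. The collectUrls helper and the URL in the comments are hypothetical, and the sketch assumes the framed page is one your script is actually allowed to read (in an ordinary browser, cross-origin frames are blocked by the same-origin policy).

```javascript
// Hypothetical helper: gather image and link URLs from a loaded document.
// `doc` is expected to look like Frame.contentWindow.document, i.e. to
// expose array-like `images` and `links` collections.
function collectUrls(doc) {
  return {
    images: Array.from(doc.images, (img) => img.src),
    links: Array.from(doc.links, (a) => a.href),
  };
}

// Browser-only usage sketch (needs a DOM, so it is left as comments):
//   const frame = document.createElement('iframe');
//   frame.style.display = 'none';                 // keep the frame hidden
//   frame.onload = () => {
//     console.log(collectUrls(frame.contentWindow.document));
//   };
//   frame.src = '/some/page.html';                // hypothetical URL
//   document.body.appendChild(frame);
```

Keeping the collection logic in a plain function makes it possible to try it against any object that has images and links, not only a real frame.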

Scraping a website with node.js request and getting strange characters
