Question

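I have a bash script that uses curl to download a page, then uses grep and sed to extract the JavaScript inside an HTML block into a file. After that I use node to evaluate the downloaded JavaScript and use it. It looks like this: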

curl 'http://...' ... | grep -E "(varxpto\(|fnxpto)" | sed 's|<[/]\?script[^>]*>||g' > fn.js  
x="$(node -pe "var fs = require('fs'); eval( fs.readFileSync('fn.js')+'' ); 
var val=fnxpto('${PW}'); val;")"

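It works like a charm in bash. But I need to expose it as a service, so I am trying to do it in Node.js.

My question is... how? I tried xpath, but it seems to need xmldoc as a prerequisite, and xmldoc does not parse my HTML (I think it handles XML only, not HTML).

It is not what I want, but I have also tried using grep/sed as a solution to my problem.

Note: I already retrieve the HTML text using require('http'), so I don't need help with that. I only need help extracting the JavaScript from the HTML and importing/evaluating it.

Does anyone know how to extract JavaScript text from HTML and evaluate it in Node?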

Answer 1

You could use something like cheerio to parse the HTML and then query the document for script tags:

// `data` is the entire string response from `http.request()`
var cheerio = require('cheerio'),
    $ = cheerio.load(data);

$('script').each(function(i, elem) {
  console.dir($(this).text());
  // do eval() or whatever else here
});
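
For the "do eval()" step, one option is Node's built-in vm module instead of a bare eval(), so the page's code runs in its own context rather than in your module scope. The following is only a minimal sketch under assumptions: loadScripts is a hypothetical helper, the scripts array stands in for the strings collected with $(this).text() above, and the stub fnxpto body is made up for illustration. The regular expression mirrors the grep pattern from the question. Note that vm is not a security boundary, so this does not make untrusted code safe to run.

```javascript
const vm = require('vm');

// Hypothetical helper: run every inline script whose text matches `pattern`
// inside one fresh context, then return that context so the caller can
// reach the functions the scripts defined.
function loadScripts(scriptTexts, pattern) {
  const context = vm.createContext({});
  for (const text of scriptTexts) {
    if (pattern.test(text)) {
      vm.runInContext(text, context);
    }
  }
  return context;
}

// Stand-in for the strings gathered in the $('script').each(...) loop.
// The fnxpto body here is invented purely for the example.
const scripts = [
  'var unrelated = 1;',
  'function fnxpto(pw) { return "token-" + pw; }'
];

// Same alternation as the grep -E pattern in the question.
const ctx = loadScripts(scripts, /varxpto\(|fnxpto/);

// Top-level function declarations become properties of the context.
const val = ctx.fnxpto('secret');
console.log(val);
```

Filtering before evaluating keeps unrelated scripts on the page (analytics, DOM code that expects a browser) from running or throwing in Node.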
