我需要在网页上获取一些结果,该网页使用一些JavaScript代码生成我感兴趣的部分,如下所示
eval(function(p,a,c,k,e,d){e=function(c){return c};if(!''.replace(/^/,String)){while(c--)d[c]=k[c]||c;k=[function(e){return d[e]}];e=function(){return'\\w+'};c=1;};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p;}('5 11=17;5 12=["/3/2/1/0/13.4","/3/2/1/0/15.4","/3/2/1/0/14.4","/3/2/1/0/7.4","/3/2/1/0/6.4","/3/2/1/0/8.4","/3/2/1/0/10.4","/3/2/1/0/9.4","/3/2/1/0/23.4","/3/2/1/0/22.4","/3/2/1/0/24.4","/3/2/1/0/26.4","/3/2/1/0/25.4","/3/2/1/0/18.4","/3/2/1/0/16.4","/3/2/1/0/19.4","/3/2/1/0/21.4"];5 20=0;',10,27,'40769|54|Images|Files|png|var|imanhua_005_140430179|imanhua_004_140430179|imanhua_006_140430226|imanhua_008_140430242|imanhua_007_140430226|len|pic|imanhua_001_140429664|imanhua_003_140430117|imanhua_002_140430070|imanhua_015_140430414||imanhua_014_140430382|imanhua_016_140430414|sid|imanhua_017_140430429|imanhua_010_140430289|imanhua_009_140430242|imanhua_011_140430367|imanhua_013_140430382|imanhua_012_140430367'.split('|'),0,{}))
eval()
的结果对我很有价值,我正在编写一个Python脚本,是否有可用于虚拟运行这段JavaScript代码并获取输出的库?
由于
答案 0 :(得分:9)
pyv8是V8 JavaScript引擎(Google Chrome)的一组绑定
答案 1 :(得分:6)
from spidermonkey import Runtime
rt = Runtime()
cx = rt.new_context()
result = cx.eval_script(whatyoupostedabove)
答案 2 :(得分:4)
您可以将PyQt与WebKit模块一起使用:)它具有JS引擎,可以在(X)HTML文档的上下文中评估JS。
答案 3 :(得分:2)
我想你现在已经解决了这个问题,但我想分享另一个(在我看来更可行)选项。当你有兴趣只评估一个--known - javascript函数时,在Python中实现这个函数可能更容易,而不是引入一个巨大的工具来构建解析和运行世界上所有可以想象的javascript。
所以我建议编写一个javascript unpacker函数的python版本,大多数都解决了。事实上我这样做了,这是一个例子。 int2base
函数是Alex Martelli的实现,可以找到here。
def unpack(p, a, c, k, e=None, d=None):
''' unpack
Unpacker for the popular Javascript compression algorithm.
@param p template code
@param a radix for variables in p
@param c number of variables in p
@param k list of c variable substitutions
@param e not used
@param d not used
@return p decompressed string
'''
# Paul Koppen, 2011
for i in xrange(c-1,-1,-1):
if k[i]:
p = re.sub('\\b'+int2base(i,a)+'\\b', k[i], p)
return p
最后,您需要进行一些解析来提取四个函数参数。仅仅为了简单的说明,我在这里使用eval
让Python为我做这件事。
s = '''eval(function(p,a,c,k,e,d){e=function(c){return c};if(!''.replace(/^/,String)){while(c--)d[c]=k[c]||c;k=[function(e){return d[e]}];e=function(){return'\\w+'};c=1;};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p;}('5 11=17;5 12=["/3/2/1/0/13.4","/3/2/1/0/15.4","/3/2/1/0/14.4","/3/2/1/0/7.4","/3/2/1/0/6.4","/3/2/1/0/8.4","/3/2/1/0/10.4","/3/2/1/0/9.4","/3/2/1/0/23.4","/3/2/1/0/22.4","/3/2/1/0/24.4","/3/2/1/0/26.4","/3/2/1/0/25.4","/3/2/1/0/18.4","/3/2/1/0/16.4","/3/2/1/0/19.4","/3/2/1/0/21.4"];5 20=0;',10,27,'40769|54|Images|Files|png|var|imanhua_005_140430179|imanhua_004_140430179|imanhua_006_140430226|imanhua_008_140430242|imanhua_007_140430226|len|pic|imanhua_001_140429664|imanhua_003_140430117|imanhua_002_140430070|imanhua_015_140430414||imanhua_014_140430382|imanhua_016_140430414|sid|imanhua_017_140430429|imanhua_010_140430289|imanhua_009_140430242|imanhua_011_140430367|imanhua_013_140430382|imanhua_012_140430367'.split('|'),0,{}))'''
js = eval('unpack' + s[s.find('}(')+1:-1])
结果:
'var len=17;var pic=["/Files/Images/54/40769/imanhua_001_140429664.png","/Files/Images/54/40769/imanhua_002_140430070.png","/Files/Images/54/40769/imanhua_003_140430117.png","/Files/Images/54/40769/imanhua_004_140430179.png","/Files/Images/54/40769/imanhua_005_140430179.png","/Files/Images/54/40769/imanhua_006_140430226.png","/Files/Images/54/40769/imanhua_007_140430226.png","/Files/Images/54/40769/imanhua_008_140430242.png","/Files/Images/54/40769/imanhua_009_140430242.png","/Files/Images/54/40769/imanhua_010_140430289.png","/Files/Images/54/40769/imanhua_011_140430367.png","/Files/Images/54/40769/imanhua_012_140430367.png","/Files/Images/54/40769/imanhua_013_140430382.png","/Files/Images/54/40769/imanhua_014_140430382.png","/Files/Images/54/40769/imanhua_015_140430414.png","/Files/Images/54/40769/imanhua_016_140430414.png","/Files/Images/54/40769/imanhua_017_140430429.png"];var sid=40769;'
附加说明:引起我的注意,如果基数> 36然后Alex'int2base
函数中断。解决方案是通过添加大写字符来修改它:digs = string.digits + string.lowercase + string.uppercase
答案 4 :(得分:1)
答案 5 :(得分:0)
当导入javacript模块不是选项时,我使用这个
import re
def baseN(num,b,numerals="0123456789abcdefghijklmnopqrstuvwxyz"):
return ((num == 0) and numerals[0]) or (baseN(num // b, b, numerals).lstrip(numerals[0]) + numerals[num % b])
def unpack(p, a, c, k, e=None, d=None):
while (c):
c-=1
if (k[c]):
p = re.sub("\\b" + baseN(c, a) + "\\b", k[c], p)
return p
encrypted = r'''eval(function(p,a,c,k,e,d){e=function(c){return c};if(!''.replace(/^/,String)){while(c--)d[c]=k[c]||c;k=[function(e){return d[e]}];e=function(){return'\\w+'};c=1;};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p;}('5 11=17;5 12=["/3/2/1/0/13.4","/3/2/1/0/15.4","/3/2/1/0/14.4","/3/2/1/0/7.4","/3/2/1/0/6.4","/3/2/1/0/8.4","/3/2/1/0/10.4","/3/2/1/0/9.4","/3/2/1/0/23.4","/3/2/1/0/22.4","/3/2/1/0/24.4","/3/2/1/0/26.4","/3/2/1/0/25.4","/3/2/1/0/18.4","/3/2/1/0/16.4","/3/2/1/0/19.4","/3/2/1/0/21.4"];5 20=0;',10,27,'40769|54|Images|Files|png|var|imanhua_005_140430179|imanhua_004_140430179|imanhua_006_140430226|imanhua_008_140430242|imanhua_007_140430226|len|pic|imanhua_001_140429664|imanhua_003_140430117|imanhua_002_140430070|imanhua_015_140430414||imanhua_014_140430382|imanhua_016_140430414|sid|imanhua_017_140430429|imanhua_010_140430289|imanhua_009_140430242|imanhua_011_140430367|imanhua_013_140430382|imanhua_012_140430367'.split('|'),0,{}))'''
encrypted = encrypted.split('}(')[1][:-1]
print eval('unpack(' + encrypted)
输出:
var len=17;var pic=["/Files/Images/54/40769/imanhua_001_140429664.png","/Files/Images/54/40769/imanhua_002_140430070.png","/Files/Images/54/40769/imanhua_003_140430117.png","/Files/Images/54/40769/imanhua_004_140430179.png","/Files/Images/54/40769/imanhua_005_140430179.png","/Files/Images/54/40769/imanhua_006_140430226.png","/Files/Images/54/40769/imanhua_007_140430226.png","/Files/Images/54/40769/imanhua_008_140430242.png","/Files/Images/54/40769/imanhua_009_140430242.png","/Files/Images/54/40769/imanhua_010_140430289.png","/Files/Images/54/40769/imanhua_011_140430367.png","/Files/Images/54/40769/imanhua_012_140430367.png","/Files/Images/54/40769/imanhua_013_140430382.png","/Files/Images/54/40769/imanhua_014_140430382.png","/Files/Images/54/40769/imanhua_015_140430414.png","/Files/Images/54/40769/imanhua_016_140430414.png","/Files/Images/54/40769/imanhua_017_140430429.png"];var sid=40769;