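How can I identify a JS array by its keys?

Asked: 2015-07-29 08:14:44

Tags: regex python-2.7 web-scraping scrapy

My spider returns JavaScript code as a string. From this code I need to retrieve an array that I can identify by its keys.

In other words, I already have the keys, but how do I get the complete array? Also, I don't know the name of the array.

Is a regular expression suitable for this, or is there a better way to achieve it? Thanks!

Edit:

Part of the JavaScript code looks like this (sorry, but copying all of it here would be too much and unnecessary):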

 {var P=parseInt($(".secondary-results-count").html());if(P-1<1){$(".secondary-results-show").hide()}else{$(".secondary-results-count").html(P-1)}}},hasOffers:function(M){if(M.result.offer&&M.result.offer.offers){return(M.result.offer.offers.length>0)?true:false}return false},queryCompanyInfo:function(O,M,N){new QueryCompanyInfo({companyInfoId:O,bookingId:M},function(Q){if(Q.status=="Ok"){var P=arrayStore.inst("offersId");var R=P.get(M);R.company=Q.result.companyInfo;P.put(M,R);if(N){N(Q,P.get(M))}}}).query()},createOfferHtml:function(O){arrayStore.inst("offersId").put(O.bookingId,{price:O,company:null});var aq={"-2":"Best Value","-3":"Executive","-4":"Minibus","-1":"Other","0":"NotSet","1":"Compact","2":"Sedan","3":"PeopleCarrier","4":"SUV","5":"VanOrMinibus","6":"Coach","7":"StretchLimo","8":"StationWagon","9":"Convertible","102":"SportsCar","104":"Offroad","105":"PickupTruck","106":"Motorcycle","107":"Rickshaw","108":"WaterTaxi"};var Z=12; ...

I know the keys "-1", "-2", and "-3".

2 answers:

Answer 0: (score: 0)

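First, I'd point out that this question isn't really related to Scrapy.

That said, you can get this data with a regular expression.

I tried the regex var\s+aq=(.*?); and it works well on your sample.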
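
Because the question says the variable name is not known in advance, a pattern tied to the name aq may not always be usable. Here is a minimal sketch of a name-independent variant that looks for any brace-delimited literal containing all the known keys. It assumes the object is flat (no nested braces, and no braces inside string values) and happens to be valid JSON, as in the excerpt from the question. The names find_object_with_keys and js_source are illustrative only and not part of Scrapy or any library.

```python
import json
import re


def find_object_with_keys(js_source, keys):
    """Return the first flat {...} literal that parses as JSON and
    contains every key in `keys`, or None if there is no such literal."""
    # Match innermost brace spans only (no nested braces allowed inside).
    for match in re.finditer(r'\{[^{}]*\}', js_source):
        try:
            candidate = json.loads(match.group(0))
        except ValueError:
            # Not valid JSON (e.g. a function body or unquoted keys); skip it.
            continue
        if all(key in candidate for key in keys):
            return candidate
    return None


# Shortened stand-in for the string returned by the spider (assumption).
js_source = (
    'arrayStore.inst("offersId").put(O.bookingId,{price:O,company:null});'
    'var aq={"-2":"Best Value","-3":"Executive","-4":"Minibus",'
    '"-1":"Other","0":"NotSet","1":"Compact"};var Z=12;'
)

mapping = find_object_with_keys(js_source, ["-1", "-2", "-3"])
print(mapping)
```

If the real object uses unquoted keys, single quotes, or nested values, json.loads will reject it and this sketch returns None, in which case a real JavaScript parser is the safer route.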

Answer 1: (score: 0)

You can use js2xml for this, and identify the object with an XPath expression like the one below:

>>> import js2xml 
>>> jstree = js2xml.parse(jsstring) 
>>> objs = jstree.xpath('//object[property[@name = "-1"]]')
>>> # or this alternative
>>> # objs = jstree.xpath('//object[property/@name="-1"]')
>>> print js2xml.jsonlike.make_dict(objs[0])
... {'-1': 'Other',
...  '-2': 'Best Value',
...  '-3': 'Executive',
...  '-4': 'Minibus',
...  '0': 'NotSet',
...  '1': 'Compact',
...  '102': 'SportsCar',
...  '104': 'Offroad',
...  '105': 'PickupTruck',
...  '106': 'Motorcycle',
...  '107': 'Rickshaw',
...  '108': 'WaterTaxi',
...  '2': 'Sedan',
...  '3': 'PeopleCarrier',
...  '4': 'SUV',
...  '5': 'VanOrMinibus',
...  '6': 'Coach',
...  '7': 'StretchLimo',
...  '8': 'StationWagon',
...  '9': 'Convertible'}