How to extract information from a script tag

Time: 2016-04-05 06:27:39

Tags: php beautifulsoup html-parsing

After a few queries in my code, I ended up with variable content like this:

foreach($_POST['institution'] as $institution){}

The important information I want to extract is the "all numbers from this station" line: "48,61,95,106,970,NR8" (each value sits right next to ,1,"#ffffff").

I tried to do this with Python code, working from page content like this:

<!DOCTYPE html>
<html dir=ltr>
  <head>
    <script>
      mapslite = {
        START_TIME: new Date()
      };
      mapslite.getBasePageResponse = function(cacheResponse) {
        delete mapslite.getBasePageResponse;
        cacheResponse([[[3988.776886432477,103.7950744,1.3090672],[0,0,0],[1024,768],13.10000038146973],"/maps-lite/js/2/maps_lite_20160404_RC01",107,null,null,["en",""],["/maps/lite/ApplicationService.GetEntityDetails","/maps/lite/ApplicationService.UpdateStarring","/maps/lite/ApplicationService.Search",null,"/maps/lite/suggest","/maps/lite/directions","/maps/lite/MapsLiteService.GetHotelAvailability",null,"https://www.google.com/maps/api/js/.....
,[null,null,1.3090672,103.7950744],null,"11401",null,"PjoDV_jjE8yPuATo_LmYDA","Asia/Singapore",[["\u003cb\u003eBuses\u003c/b\u003e from this station",[[3,"bus.png",null,"Bus",[["https://maps.gstatic.com/mapfiles/transit/iw2/b/bus.png",0,[15,15],null,0]]]],[[null,null,null,null,"0x31da18325b415901:0xeb661015c651c24a",[[5,["48",1,"#ffffff"]]]],[null,null,null,null,"0x31da19f34e04d59b:0x5758ef6990938b",[[5,["61",1,"#ffffff"]]]],[null,null,null,null,"0x31da1a5b8b75c379:0x6a13e189555f9fab",[[5,["95",1,"#ffffff"]]]],[null,null,null,null,"0x31da1a16ea23bf95:0xd7c90f15535c2b9f",[[5,["106",1,"#ffffff"]]]],[null,null,null,null,"0x31da10a7613d616f:0xf1f61ffeac2ea8a4",[[5,["970",1,"#ffffff"]]]],[null,null,null,null,"0x31da1a0bd6262d0b:0xfbd5d2bfd7a1252",[[5,["NR8",1,"#ffffff"]]]]],null,0,"5"]]],["http://www.google.com/search?q=
....
[0,0,"",0,1,null,null,null,0,0,1,1,0,"map,common",null,0,0,1,null,null,1,"1","2,1","","",0],null,null,"PjoDV_jjE8yPuATo_LmYDA",null,null,null,null,"//consent.google.com","2.maps_lite_20160404_RC01"]);
      };
      executeOgJs = function() {

        delete executeOgJs;
      };
    </script>

However, I ran into some errors and difficulties. Is there a convenient way to do this in PHP?

1 Answer:

Answer 0: (score: 0)

I believe your best bet is a regular expression, provided you are sure the data will always have that specific structure.

An expression that matches the 'array' ["5N4",323,"#asdasd"] is (\[\"[a-zA-Z0-9]*?\"\,\d*?\,\".*?\"\])

You can then use explode() in PHP or split() in Python on each match to get the value you want (5N4 in this case).
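The approach described in this answer could be sketched in Python roughly as follows. This is only an illustration, not the answerer's original example: the function name extract_numbers and the shortened sample string are assumptions made here, and the sketch assumes the page always contains triples shaped like ["48",1,"#ffffff"].

```python
import re

# Pattern quoted from the answer, unchanged, written as a raw string.
TRIPLE = re.compile(r'(\[\"[a-zA-Z0-9]*?\"\,\d*?\,\".*?\"\])')


def extract_numbers(text):
    """Return the first quoted value of every ["X",N,"..."] triple in text.

    Hypothetical helper name, introduced only for this sketch.
    """
    numbers = []
    for triple in TRIPLE.findall(text):
        # Splitting on the double quote puts the first quoted value at index 1,
        # e.g. '["48",1,"#ffffff"]' -> ['[', '48', ',1,', '#ffffff', ']'].
        numbers.append(triple.split('"')[1])
    return numbers


# Shortened sample in the same shape as the page content in the question.
sample = (
    '[[5,["48",1,"#ffffff"]]]],[null,null,null,null,'
    '"0x31da19f34e04d59b:0x5758ef6990938b",[[5,["61",1,"#ffffff"]]]],'
    '[null,null,null,null,"0x31da1a0bd6262d0b:0xfbd5d2bfd7a1252",'
    '[[5,["NR8",1,"#ffffff"]]]]'
)

print(",".join(extract_numbers(sample)))  # prints 48,61,NR8
```

In PHP the same idea should carry over by collecting the matches with preg_match_all() and calling explode('"', $match) on each one. Note that the pattern will also pick up any other triple of the same shape elsewhere in the page, so it may be worth restricting the search to the part of the text that follows "from this station".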