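I'm looking for a way to extract some data from a page's source code. The information I need is inside a script tag like the one below.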


 <script>
 .......
 var playerIdMap = {};
 playerIdMap['4'] = '614';
 playerIdMap['5'] = '84';
 playerIdMap['6'] = '65';
 playerIdMap['7'] = '701';
 getPlayerIdMap = function() { return playerIdMap; }; // global
 }
 enclosePlayerMap();
 </script>



I'm trying to get the contents of the playerIdMap entries, for example 4 and 614, or the whole lines.

Answer 0 (score: 1)
Edit-2
Complete PHP code, inspired by "How to get data from API - php - curl":
<?php
/**
 * Handles making a cURL request
 *
 * @param string $url URL to call out to for information.
 * @param bool $callDetails Optional condition to allow for extended
 * information return including error and getinfo details.
 *
 * @return array $returnGroup cURL response and optional details.
 */
function makeRequest($url, $callDetails = false)
{
    // Set handle
    $ch = curl_init($url);
    // Set options
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Execute curl handle add results to data return array.
    $result = curl_exec($ch);
    $returnGroup = ['curlResult' => $result,];
    // If details of curl execution are asked for add them to return group.
    if ($callDetails) {
        $returnGroup['info'] = curl_getinfo($ch);
        $returnGroup['errno'] = curl_errno($ch);
        $returnGroup['error'] = curl_error($ch);
    }
    // Close cURL and return response.
    curl_close($ch);
    return $returnGroup;
}
$url = "http://www.bullshooterlive.com/my-stats/999/";
$response = makeRequest($url, true);
$re = '/playerIdMap\[\'(?P<id>\d+)\']\s+=\s+\'(?P<value>\d+)\'/';
preg_match_all($re, $response['curlResult'], $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
//var_dump($response);
Edit-1
Sorry, I didn't realize you were asking a PHP question. Not sure why I assumed Scrapy here. Anyway, the PHP code below should help:
$re = '/playerIdMap\[\'(?P<id>\d+)\']\s+=\s+\'(?P<value>\d+)\'/';
$str = '<script>
.......
var playerIdMap = {};
playerIdMap[\'4\'] = \'614\';
playerIdMap[\'5\'] = \'84\';
playerIdMap[\'6\'] = \'65\';
playerIdMap[\'7\'] = \'701\';
getPlayerIdMap = function() { return playerIdMap; }; // global
}
enclosePlayerMap();
</script>';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
Previous answer
You can use something like the following:
>>> data = """
... <script>
... .......
... var playerIdMap = {};
... playerIdMap['4'] = '614';
... playerIdMap['5'] = '84';
... playerIdMap['6'] = '65';
... playerIdMap['7'] = '701';
... getPlayerIdMap = function() { return playerIdMap; }; // global
... }
... enclosePlayerMap();
... </script>
... """
>>> import re
>>>
>>> regex = r"playerIdMap\['(?P<id>\d+)']\s+=\s+'(?P<value>\d+)'"
>>> re.findall(regex, data)
[('4', '614'), ('5', '84'), ('6', '65'), ('7', '701')]
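The named groups in the pattern can also be read by name rather than by tuple position. A minimal sketch, assuming the same sample text as above; `player_id_map` is a name introduced here for illustration only:

```python
import re

# Sample text mirroring the script body from the question.
data = """
var playerIdMap = {};
playerIdMap['4'] = '614';
playerIdMap['5'] = '84';
playerIdMap['6'] = '65';
playerIdMap['7'] = '701';
"""

regex = r"playerIdMap\['(?P<id>\d+)']\s+=\s+'(?P<value>\d+)'"

# finditer yields match objects, so each named group can be looked up by name.
player_id_map = {m.group("id"): m.group("value") for m in re.finditer(regex, data)}
print(player_id_map)
```

Both keys and values stay strings here; wrap them in `int()` if numeric IDs are needed.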
You need to use the following to access the script tag:
data = response.xpath("//script[contains(text(),'getPlayerIdMap')]").extract_first()
import re
regex = r"playerIdMap\['(?P<id>\d+)']\s+=\s+'(?P<value>\d+)'"
print(re.findall(regex, data))
[('4', '614'), ('5', '84'), ('6', '65'), ('7', '701')]
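The pattern above requires at least one whitespace character on each side of `=`, and `extract_first()` returns `None` when no script tag matches. If the page's spacing differs or the tag is missing, a more tolerant variant may help. This is a rough sketch, not part of the original answer; `TOLERANT_RE` and `parse_player_id_map` are hypothetical names:

```python
import re

# \s* accepts zero or more whitespace characters around the bracket and the "=".
TOLERANT_RE = re.compile(
    r"playerIdMap\s*\[\s*'(?P<id>\d+)'\s*\]\s*=\s*'(?P<value>\d+)'"
)


def parse_player_id_map(script_text):
    """Return {id: value} parsed from a script body, or {} if there is no text."""
    if not script_text:  # covers None (no matching script tag) and empty strings
        return {}
    return {
        m.group("id"): m.group("value")
        for m in TOLERANT_RE.finditer(script_text)
    }


# Works with or without spaces around the bracket and the equals sign.
sample = "playerIdMap ['4'] ='614';\nplayerIdMap['5'] = '84';"
print(parse_player_id_map(sample))
print(parse_player_id_map(None))
```

In a spider, the `data` value from the XPath call above would be passed as `script_text`.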