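I want to fetch all the comments on a CNN article; CNN's comment system is Disqus. For example: http://edition.cnn.com/2013/02/25/tech/innovation/google-glass-privacy-andrew-keen/index.html?hpt=hp_c1
The comment system makes you click "load more" to see further comments. I tried parsing the HTML with PHP, but because the comments are loaded by JavaScript, that approach cannot retrieve all of them. So I'd like to know whether anyone has a more convenient way to retrieve all the comments for a particular CNN URL.
Has anyone managed this? Thanks in advance.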
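Answer 0 (score: 7)
The Disqus API supports pagination through cursors that are returned in the JSON response. For details on cursors, see: http://disqus.com/api/docs/cursors/
Since you mentioned PHP, something like this should get you started: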
<?php
$apikey = '<your key here>'; // get keys at http://disqus.com/api/ (can be public or secret for this endpoint)
$shortname = '<the disqus forum shortname>'; // defined in: var disqus_shortname = '...';
$thread = 'link:<URL of thread>'; // IMPORTANT: the URL you're viewing isn't necessarily the one stored with the thread of comments
//$thread = 'ident:<identifier of thread>'; // use this if 'link:' has no results; defined in: var disqus_identifier = '...';
$limit = '100'; // max is 100 for this endpoint, 25 is the default
// The cursor value is appended per request inside listcomments()
$endpoint = 'https://disqus.com/api/3.0/threads/listPosts.json?api_key='.urlencode($apikey).'&forum='.urlencode($shortname).'&thread='.urlencode($thread).'&limit='.$limit.'&cursor=';
$cursor = ''; // empty on the first request
$j = 0;
listcomments($endpoint, $cursor, $j);
function listcomments($endpoint, $cursor, $j) {
    // Standard cURL
    $session = curl_init($endpoint.urlencode($cursor));
    curl_setopt($session, CURLOPT_RETURNTRANSFER, 1); // return the response body instead of just true on success
    $data = curl_exec($session);
    curl_close($session);
    if ($data === false) die('Error fetching data');
    // Decode JSON data
    $results = json_decode($data);
    if ($results === NULL) die('Error parsing json');
    // On an API error, code is non-zero and response holds the error message
    if ($results->code !== 0) die('API error: '.$results->response);
    // Comment response
    $comments = $results->response;
    // Cursor for pagination
    $cursor = $results->cursor;
    foreach ($comments as $comment) {
        $name = $comment->author->name;
        $message = $comment->message; // separate variable so $comment is not overwritten
        $created = $comment->createdAt;
        // Get more data...
        echo "<p>".$name." wrote:<br/>";
        echo $message."<br/>";
        echo $created."</p>";
    }
    // keep following the cursor while the API reports another page
    if ($cursor->hasNext) {
        listcomments($endpoint, $cursor->next, $j);
        /* use this instead of the call above to run at most 10 iterations
        $j++;
        if ($j < 10) {
            listcomments($endpoint, $cursor->next, $j);
        }*/
    }
}
?>
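For readers who prefer a loop to recursion, the same cursor walk can be sketched in JavaScript. This is only a minimal sketch under stated assumptions, not an official Disqus client: `fetchAllPosts`, `makeDisqusFetcher` and `maxPages` are hypothetical names, and the code assumes each JSON page has a `response` array and a `cursor` object with `hasNext` and `next` fields, as described in the cursor documentation linked above. The page fetcher is passed in as an argument so the loop can be tried against canned data first.

```javascript
// Collects posts from every page by following the cursor until it runs out.
// fetchPage(cursor) is a caller-supplied async function (hypothetical name)
// that returns one decoded JSON page: { response: [...], cursor: {...} }.
async function fetchAllPosts(fetchPage, maxPages = 50) {
  const posts = [];
  let cursor = ''; // empty cursor asks for the first page
  for (let page = 0; page < maxPages; page++) {
    const result = await fetchPage(cursor);
    posts.push(...result.response);
    // Stop when the API reports no further page.
    if (!result.cursor || !result.cursor.hasNext) break;
    cursor = result.cursor.next;
  }
  return posts;
}

// Example page fetcher for the endpoint used in the PHP above.
// apiKey, shortname and thread are placeholders you must supply.
function makeDisqusFetcher(apiKey, shortname, thread) {
  const base = 'https://disqus.com/api/3.0/threads/listPosts.json';
  return async function (cursor) {
    const query = new URLSearchParams({
      api_key: apiKey,
      forum: shortname,
      thread: thread,
      limit: '100',
    });
    if (cursor) query.set('cursor', cursor); // omit on the first request
    const res = await fetch(base + '?' + query.toString());
    if (!res.ok) throw new Error('HTTP ' + res.status);
    return res.json();
  };
}
```

A call would look like `fetchAllPosts(makeDisqusFetcher(key, shortname, 'link:<URL of thread>'))`. The `maxPages` cap is a safety limit for very long threads, not something the API requires.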
Answer 1 (score: 3)
Just to add: to get the URL of the Disqus comments on any page that has them, run this JavaScript in the browser console:
var visit = (function () {
    // The Disqus embed renders inside an iframe under div#disqus_thread
    var frame = document.querySelector('div#disqus_thread iframe');
    if (!frame) return null; // no Disqus iframe on this page (or not loaded yet)
    // Upgrade an http:// source to https://, leave anything else unchanged
    return frame.src.replace(/^http:\/\//, 'https://');
})();
The URL is now stored in the variable 'visit':
console.log(visit);
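I have mined all the data for you into UTF-8 JSON and saved it as a .txt file, which you can find at the link. The JSON contains several variable names, but the one you need is the 'data' variable, which is a JavaScript array.
Iterate over each element and split it on 'x==x'. The 'x==x' separator is there so that the user ID of each commenter is captured as well. If an entry has a name but no numeric user ID, the account is no longer active.
To use a user ID, the profile URL has the form https://disqus.com/users/106222183, where 106222183 is the user ID.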
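The splitting step can be sketched as follows. This is a hedged illustration only: the answer does not say in which order the fields appear within an entry, so `parseEntry` (a hypothetical helper) assumes nothing about position and simply treats the first all-digit field as the user ID. The sample entries are invented for the example and are not taken from the real data file.

```javascript
// Splits one scraped entry on the 'x==x' separator and looks for a numeric
// Disqus user ID among the parts. The field order is an unknown, so no
// position is assumed; the first all-digit field is taken as the ID.
function parseEntry(entry) {
  const fields = entry.split('x==x');
  const match = fields.find((f) => /^\d+$/.test(f.trim()));
  const userId = match ? match.trim() : null;
  return {
    fields: fields,
    userId: userId, // null means only a name was stored (inactive account)
    profileUrl: userId ? 'https://disqus.com/users/' + userId : null,
  };
}

// Hypothetical sample entries, invented for illustration only.
const sample = ['106222183x==xNice article', 'Some Namex==xAnother comment'];
const parsed = sample.map(parseEntry);
```

One caveat with this heuristic: a comment whose whole text is digits would be mistaken for a user ID, so once the real field order is known it is safer to read the ID from a fixed position.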
Answer 2 (score: 0)
Without the API:
#disqus_thread {
position: relative;
height: 300px;
background-color: #fff;
overflow: hidden;
}
#disqus_thread:after {
content: "";
display: block;
height: 10px;
width: 100%;
position: absolute;
bottom: 0;
background: white;
}
#disqus_thread.loaded {
height: auto;
}
#disqus_thread.loaded:after{
height:55px;
}
#disqus-load {
text-align: center;
color: #fff;
padding: 11px 14px;
font-size: 13px;
font-weight: 500;
display: block;
text-align: center;
border: none;
background: rgba(29,47,58,.6);
line-height: 1.1;
border-radius: 3px;
font-weight: 500;
transition: background .2s;
text-shadow: none;
cursor:pointer;
}
<div class="disqus-comments">
<div id='disqus_thread'></div>
<div id='disqus-load'>Load comments</div>
</div>
<script type="text/javascript">
$(document).ready(function() {
var disqus_shortname = 'testare-123';
(function() {
var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
$('#disqus-load').on('click', function(){
$.ajax({
type: "GET",
url: "//" + disqus_shortname + ".disqus.com/embed.js",
dataType: "script",
cache: true
});
$(this).fadeOut();
$('#disqus_thread').addClass('loaded');
});
});
/* * * CONFIGURATION VARIABLES * * */
// var disqus_shortname = 'testare-123';
// /* * * DON'T EDIT BELOW THIS LINE * * */
// (function() {
// var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
// dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
// (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
// })();
</script>
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript" rel="nofollow">comments powered by Disqus.</a></noscript>