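I'm using the Search API and paginating the results with nextPageToken, but I can't retrieve all the results this way. I only get 500 results out of roughly 455,000.
Here is the Java code that fetches the search results: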
youtube = new YouTube.Builder(Auth.HTTP_TRANSPORT, Auth.JSON_FACTORY, new HttpRequestInitializer() {
    public void initialize(HttpRequest request) throws IOException {}
}).setApplicationName("youtube-search").build();

YouTube.Search.List search = youtube.search().list("id,snippet");
String apiKey = properties.getProperty("youtube.apikey");
search.setKey(apiKey);
search.setType("video");
search.setMaxResults(50L); // setMaxResults takes a Long; 50 is the largest page size
search.setQ(queryTerm);

boolean allResultsRead = false;
while (!allResultsRead) {
    SearchListResponse searchResponse = search.execute();
    System.out.println("Printed " + searchResponse.getPageInfo().getResultsPerPage() + " out of " + searchResponse.getPageInfo().getTotalResults() + ". Current page token: " + search.getPageToken() + "Next page token: " + searchResponse.getNextPageToken() + ". Prev page token" + searchResponse.getPrevPageToken());
    if (searchResponse.getNextPageToken() == null) {
        // No further page: stop, and reset the request object for the next query.
        allResultsRead = true;
        search = youtube.search().list("id,snippet");
        search.setKey(apiKey);
        search.setType("video");
        search.setMaxResults(50L);
    } else {
        // Ask for the following page on the next iteration.
        search.setPageToken(searchResponse.getNextPageToken());
    }
}
Output:
Printed 50 out of 455085. Current page token: null Next page token: CDIQAA. Prev page token null
Printed 50 out of 454983. Current page token: CDIQAA Next page token: CGQQAA. Prev page token CDIQAQ
Printed 50 out of 455081. Current page token: CGQQAA Next page token: CJYBEAA. Prev page token CGQQAQ
Printed 50 out of 454981. Current page token: CJYBEAA Next page token: CMgBEAA. Prev page token CJYBEAE
Printed 50 out of 455081. Current page token: CMgBEAA Next page token: CPoBEAA. Prev page token CMgBEAE
Printed 50 out of 454981. Current page token: CPoBEAA Next page token: CKwCEAA. Prev page token CPoBEAE
Printed 50 out of 455081. Current page token: CKwCEAA Next page token: CN4CEAA. Prev page token CKwCEAE
Printed 50 out of 454980. Current page token: CN4CEAA Next page token: CJADEAA. Prev page token CN4CEAE
Printed 50 out of 455081. Current page token: CJADEAA Next page token: CMIDEAA. Prev page token CJADEAE
Printed 50 out of 455081. Current page token: CMIDEAA Next page token: null. Prev page token CMIDEAE
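After 10 iterations the while loop exits because the next page token is null.
I'm new to the YouTube API and don't know what I'm doing wrong here. I have two questions: 1. How do I get all the results? 2. Why is the prev page token of page 3 different from the current token of page 2?
Any help would be appreciated. Thanks!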
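Answer 0: (score: 21)
What you're seeing is expected: with nextPageToken you can get at most 500 results. If you're interested in how this came about, you can read through this thread: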
https://code.google.com/p/gdata-issues/issues/detail?id=4282
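As a summary of that thread, it basically comes down to the fact that, with so much data on YouTube, the search algorithm works quite differently from what most people assume. It isn't a simple database search over the contents of fields; a large number of signals are processed to make the results relevant, and after about 500 results the algorithm starts to lose its ability to make the results worthwhile.
One thing that helped me make sense of this was realizing that when YouTube talks about search, they're talking about probabilities rather than matches: given your parameters, results are ordered by how likely they are to be relevant to your query. As you paginate, you eventually reach a point where, statistically, the probability of relevance is low enough that returning those results isn't worth the computation. So 500 was the limit decided on.
(Also note that the number of "results" is not an approximation of matches but of potential matches; once you start retrieving them, many of the possible matches are discarded as not relevant at all, so the number doesn't really mean what people think it does. Google Search works the same way.)
You may wonder why YouTube search works this way rather than doing more traditional string/data matching. With the volume of searches involved, if they actually ran a full search over all the data for every query, you'd be waiting minutes each time, if not longer. If you think about it, it's really a technical marvel that an algorithm working with predictions, probabilities, and so on can produce relevant results for the first 500 cases.
As for your second question, a page token doesn't represent a unique set of results; it represents a kind of algorithm state, and is therefore a pointer to the query, the progress of the query, and the direction of the query. So iteration 3, for example, is referenced both by the nextPageToken of iteration 2 and by the prevPageToken of iteration 4, but those two tokens differ slightly so that they can indicate which direction they came from.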
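The "slightly different tokens" point can be checked against the tokens printed in the question's output. The sketch below is only an observation about those samples, not documented API behavior: it assumes the tokens are URL-safe Base64 wrapping two small varint fields, which is what the sample bytes look like. The class name PageTokenDemo and the labels "offset" and "flag" are my own; real code should treat page tokens as opaque.

```java
import java.util.Base64;

// Sketch only: decodes the page tokens printed in the question's output.
// The layout (two varint fields) is inferred from those samples and is NOT
// a documented contract; treat tokens as opaque in real code.
class PageTokenDemo {

    /** Returns {field1, field2}: the two varint fields found in the token. */
    static long[] decode(String token) {
        // Java's decoder accepts input without trailing '=' padding.
        byte[] bytes = Base64.getUrlDecoder().decode(token);
        long[] fields = new long[2];
        int pos = 0;
        while (pos < bytes.length) {
            int tag = bytes[pos++] & 0xFF;
            int fieldNumber = tag >>> 3;   // upper bits: field number
            int wireType = tag & 0x07;     // lower bits: 0 means varint
            if (wireType != 0 || fieldNumber < 1 || fieldNumber > 2) {
                throw new IllegalArgumentException("unexpected tag in token: " + tag);
            }
            // Read a little-endian base-128 varint.
            long value = 0;
            int shift = 0;
            int b;
            do {
                b = bytes[pos++] & 0xFF;
                value |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            fields[fieldNumber - 1] = value;
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] tokens = {"CDIQAA", "CDIQAQ", "CGQQAA", "CGQQAQ", "CMIDEAA"};
        for (String t : tokens) {
            long[] f = decode(t);
            System.out.println(t + " -> offset=" + f[0] + ", flag=" + f[1]);
        }
    }
}
```

For these samples, CDIQAA decodes to offset=50, flag=0 and CDIQAQ to offset=50, flag=1; CGQQAA and CGQQAQ both carry offset=100, and CMIDEAA (the last current token in the output) carries offset=450. Read that way, the prev token of page 3 (CGQQAQ) and the current token of page 2 (CDIQAA) would point at the same 50 results from opposite directions, which fits the explanation above.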
Answer 1: (score: 4)
As far as I can see, you haven't included "nextPageToken" in setFields.
For example:
public class YoutubeConnector { // class name must match the constructor below
private YouTube youtube;
private YouTube.Search.List query;
public static final String KEY = "YOUR API KEY";
public YoutubeConnector(Context context) {
youtube = new YouTube.Builder(new NetHttpTransport(), new JacksonFactory(), new HttpRequestInitializer() {
@Override
public void initialize(HttpRequest httpRequest) throws IOException {
}
}).setApplicationName(context.getString(R.string.app_name)).build();
try {
query = youtube.search().list("id,snippet");
query.setMaxResults(10L);
query.setKey(KEY);
query.setType("video");
query.setFields("items(id/videoId,snippet/title,snippet/description,snippet/thumbnails/default/url),nextPageToken");
} catch (IOException e) {
Log.d("YC", "Could not initialize: " + e.getMessage());
}
}
public List<VideoItem> search(String keywords) {
query.setQ(keywords);
try {
List<VideoItem> items = new ArrayList<VideoItem>();
String nextToken = "";
int i = 0;
do {
query.setPageToken(nextToken);
SearchListResponse response = query.execute();
List<SearchResult> results = response.getItems();
for (SearchResult result : results) {
VideoItem item = new VideoItem();
item.setTitle(result.getSnippet().getTitle());
item.setDescription(result.getSnippet().getDescription());
item.setThumbnailURL(result.getSnippet().getThumbnails().getDefault().getUrl());
item.setId(result.getId().getVideoId());
items.add(item);
}
nextToken = response.getNextPageToken();
i ++;
System.out.println("nextToken : "+ nextToken);
} while (nextToken != null && i < 20);
return items;
} catch (IOException e) {
Log.d("YC", "Could not search: " + e);
return null;
}
}
}
I hope this helps.
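The do/while loop in search() can be tried without network access by hiding the page fetch behind a small interface. The following is a minimal sketch under stated assumptions: PageSource, Page and CappedFakeSource are hypothetical names, not part of the YouTube client library, and the fake source simply stops issuing tokens after a fixed number of items, mimicking the roughly 500-result cap described in answer 0.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: PageSource, Page and CappedFakeSource are hypothetical names,
// not part of the YouTube client library.
class PagingSketch {

    /** One page of results plus the token for the next page (null on the last page). */
    static final class Page {
        final List<String> items;
        final String nextPageToken;

        Page(List<String> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    /** Stand-in for query.setPageToken(token) followed by query.execute(). */
    interface PageSource {
        Page fetch(String pageToken);
    }

    /** Fake source that stops issuing tokens once `cap` items have been served. */
    static final class CappedFakeSource implements PageSource {
        private final int pageSize;
        private final int cap;

        CappedFakeSource(int pageSize, int cap) {
            this.pageSize = pageSize;
            this.cap = cap;
        }

        @Override
        public Page fetch(String pageToken) {
            // In this fake, the token is just the start offset as a string.
            int start = (pageToken == null) ? 0 : Integer.parseInt(pageToken);
            int end = Math.min(start + pageSize, cap);
            List<String> items = new ArrayList<>();
            for (int i = start; i < end; i++) {
                items.add("video-" + i);
            }
            String next = (end < cap) ? Integer.toString(end) : null;
            return new Page(items, next);
        }
    }

    /** Follows nextPageToken until it is null or maxPages pages have been read. */
    static List<String> fetchAll(PageSource source, int maxPages) {
        List<String> all = new ArrayList<>();
        String token = null; // null requests the first page
        int pages = 0;
        do {
            Page page = source.fetch(token);
            all.addAll(page.items);
            token = page.nextPageToken;
            pages++;
        } while (token != null && pages < maxPages);
        return all;
    }

    public static void main(String[] args) {
        List<String> all = fetchAll(new CappedFakeSource(50, 500), 20);
        System.out.println("Collected " + all.size() + " items");
    }
}
```

With a page size of 50 and a cap of 500, the loop ends after ten pages because the token becomes null, well before the 20-page guard applies. With a page size of 10, as in the answer's code, the 20-page guard stops it first, at 200 items.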
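Answer 2: (score: 0)
You can take the nextPageToken from a response and pass it as the pageToken parameter of the next request.
That returns the next page. I added var_dump calls to show you that the page tokens are not the same. Just copy this code and run it, and make sure you have placed the API resource folder in the same folder as the plugin.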
<?php
function doit(){if (isset($_GET['q']) && $_GET['maxResults'] ) {
// Call set_include_path() as needed to point to your client library.
// require_once ($_SERVER["DOCUMENT_ROOT"].'/API/youtube/google-api-php-client/src/Google_Client.php');
// require_once ($_SERVER["DOCUMENT_ROOT"].'/API/youtube/google-api-php-client/src/contrib/Google_YouTubeService.php');
set_include_path("./google-api-php-client/src");
require_once 'Google_Client.php';
require_once 'contrib/Google_YouTubeService.php';
/* Set $DEVELOPER_KEY to the "API key" value from the "Access" tab of the
Google APIs Console <http://code.google.com/apis/console#access>
Please ensure that you have enabled the YouTube Data API for your project. */
$DEVELOPER_KEY = 'YOUR_API_KEY';
$client = new Google_Client();
$client->setDeveloperKey($DEVELOPER_KEY);
$youtube = new Google_YoutubeService($client);
try {
$searchResponse = $youtube->search->listSearch('id,snippet', array(
'q' => $_GET['q'],
'maxResults' => $_GET['maxResults'],
));
var_dump($searchResponse);
$searchResponse2 = $youtube->search->listSearch('id,snippet', array(
'q' => $_GET['q'],
'maxResults' => $_GET['maxResults'],
'pageToken' => $searchResponse['nextPageToken'],
));
var_dump($searchResponse2);
exit;
$videos = '';
$channels = '';
foreach ($searchResponse['items'] as $searchResult) {
switch ($searchResult['id']['kind']) {
case 'youtube#video':
$videoId =$searchResult['id']['videoId'];
$title = $searchResult['snippet']['title'];
$publishedAt= $searchResult['snippet']['publishedAt'];
$description = $searchResult['snippet']['description'];
$image_url = $searchResult['snippet']['thumbnails']['default']['url'];
$image_high = $searchResult['snippet']['thumbnails']['high']['url'];
echo '<div class="souligne" id="'.$videoId.'">
<div >
<a href="http://www.youtube.com/watch?v='.$videoId.'" target="_blank">
<img src="'.$image_url.'" width="150px" />
</a>
</div>
<div class="title">'.$title.'</div>
<div class="des"> '.$description.' </div>
<a id="'.$videoId.'" onclick="supp(this)" class="linkeda">
+ ADD
</a>
</div>'
;
break;
}
}
echo ' </ul></form>';
} catch (Google_ServiceException $e) {
$htmlBody .= sprintf('<p>A service error occurred: <code>%s</code></p>',
htmlspecialchars($e->getMessage()));
} catch (Google_Exception $e) {
$htmlBody .= sprintf('<p>A client error occurred: <code>%s</code></p>',
htmlspecialchars($e->getMessage()));
}
}}
doit();
?>
<!doctype html>
<html>
<head>
<title>YouTube Search</title>
<link href="//www.w3resource.com/includes/bootstrap.css" rel="stylesheet">
<style type="text/css">
body{margin-top: 50px; margin-left: 50px}
</style>
</head>
<body>
<form method="GET">
<div>
Search Term: <input type="search" id="q" name="q" placeholder="Enter Search Term">
</div>
<div>
Max Results: <input type="number" id="maxResults" name="maxResults" min="1" max="50" step="1" value="25">
</div>
<div>
page: <input type="number" id="startIndex" name="startIndex" min="1" max="50" step="1" value="2">
</div>
<input type="submit" value="Search">
</form>
<h3>Videos</h3>
<ul><?php if(isset($videos))echo $videos; ?></ul>
<h3>Channels</h3>
<ul><?php if(isset($channels)) echo $channels; ?></ul>
</body>
</html>