Google Cloud Talent Solution: how to use page_token

Time: 2019-06-27 08:22:55

Tags: python google-cloud-talent-solution

I'm trying to use search_jobs() from v4beta1 of GCTS.

Documentation: https://cloud.google.com/talent-solution/job-search/docs/reference/rest/v4beta1/projects.jobs/search

The documentation refers to a pageToken parameter, but in \google\cloud\talent_v4beta1\gapic\job_service_client.py the function definition has no such parameter:

def search_jobs(
    self,
    parent,
    request_metadata,
    search_mode=None,
    job_query=None,
    enable_broadening=None,
    require_precise_result_size=None,
    histogram_queries=None,
    job_view=None,
    offset=None,
    page_size=None,
    order_by=None,
    diversification_level=None,
    custom_ranking_info=None,
    disable_keyword_match=None,
    retry=google.api_core.gapic_v1.method.DEFAULT,
    timeout=google.api_core.gapic_v1.method.DEFAULT,
    metadata=None,
):

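page_token is mentioned in the comments, for example in the description of the offset parameter.

How do I specify a page token for a job search?

I have specified require_precise_result_size=False, but the return value does not contain SearchJobsResponse.estimated_total_size. Does this indicate that search_jobs() has not been set to the required "mode"?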

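1 answer:

Answer 0: (score: 0)

I believe the Python client library abstracts pageToken away for you. If you dig down to the end of the search_jobs method in the source code, you'll see that it builds an iterator that knows about the page_token and next_page_token fields: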

    iterator = google.api_core.page_iterator.GRPCIterator(
        client=None,
        method=functools.partial(
            self._inner_api_calls["search_jobs"],
            retry=retry,
            timeout=timeout,
            metadata=metadata,
        ),
        request=request,
        items_field="matching_jobs",
        request_token_field="page_token",
        response_token_field="next_page_token",
    )
    return iterator

So all you need to do is the following, copied from the docs at https://googleapis.github.io/google-cloud-python/latest/talent/gapic/v4beta1/api.html:

from google.cloud import talent_v4beta1

client = talent_v4beta1.JobServiceClient()
parent = client.tenant_path('[PROJECT]', '[TENANT]')

# TODO: Initialize `request_metadata`:
request_metadata = {}

# Iterate over all results
for element in client.search_jobs(parent, request_metadata):
    # process element
    pass


# Alternatively:
# Iterate over results one page at a time
for page in client.search_jobs(parent, request_metadata).pages:
    for element in page:
        # process element
        pass

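The default page size is apparently 10, and you can change it with the pageSize parameter (page_size in the Python client). The page iterator documentation can be found here:

Docs: https://googleapis.github.io/google-cloud-python/latest/core/page_iterator.html

Source: https://googleapis.github.io/google-cloud-python/latest/_modules/google/api_core/page_iterator.html#GRPCIterator

Probably the simplest way to deal with it is to consume all the results using: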

allResults = list(results_iterator)

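If you have a lot of data and don't want to fetch every page at once, I would do the following. ".pages" just returns a generator that you can use as usual.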

resultsIterator = client.search_jobs(parent, request_metadata)
pages = resultsIterator.pages
currentPageIter = next(pages)
#do work with page
currentItem = next(currentPageIter)

currentPageIter = next(pages)
# etc...

When the items or pages run out, you need to catch the StopIteration error:

https://anandology.com/python-practice-book/iterators.html
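As a minimal sketch of that pattern, the snippet below wraps next() in a small helper that catches StopIteration. The helper name take_next and the hard-coded list of pages are assumptions for illustration only; in real code the iterator would be resultsIterator.pages from the example above.

```python
def take_next(iterator):
    """Return (True, value) for the next value, or (False, None) when exhausted."""
    try:
        return True, next(iterator)
    except StopIteration:
        return False, None


# Stand-in for resultsIterator.pages (hypothetical data, no API call is made).
pages = iter([["job-a", "job-b"], ["job-c"]])

seen = []
while True:
    ok, page = take_next(pages)
    if not ok:
        break  # no more pages
    seen.append(list(page))

print(seen)
```

If you don't need the explicit try/except, next(pages, None) returns a default instead of raising.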

Here is why the pages generator eventually stops (from the page iterator source):

def _page_iter(self, increment):
    """Generator of pages of API responses.

    Args:
        increment (bool): Flag indicating if the total number of results
            should be incremented on each page. This is useful since a page
            iterator will want to increment by results per page while an
            items iterator will want to increment per item.

    Yields:
        Page: each page of items from the API.
    """
    page = self._next_page()
    while page is not None:
        self.page_number += 1
        if increment:
            self.num_results += page.num_items
        yield page
        page = self._next_page()

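See how _next_page is called again right after the yield? It checks whether there are more pages and, if there are, makes another request for you.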

def _next_page(self):
    """Get the next page in the iterator.

    Returns:
        Page: The next page in the iterator or :data:`None` if
            there are no pages left.
    """
    if not self._has_next_page():
        return None

    if self.next_page_token is not None:
        setattr(self._request, self._request_token_field, self.next_page_token)

    response = self._method(self._request)

    self.next_page_token = getattr(response, self._response_token_field)
    items = getattr(response, self._items_field)
    page = Page(self, items, self.item_to_value)

    return page
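To make the token handling concrete, here is a rough, self-contained sketch of the same loop run against a fake in-memory backend. Both fake_search and iterate_pages are hypothetical names made up for this illustration; they are not part of the client library, and the token format (a stringified index) is an assumption.

```python
def fake_search(request):
    """Hypothetical stand-in for the API: returns one page plus a next-page token."""
    data = ["job-%d" % i for i in range(5)]
    start = int(request.get("page_token") or 0)
    end = start + request["page_size"]
    # An empty token signals that there are no more pages.
    next_token = str(end) if end < len(data) else ""
    return {"matching_jobs": data[start:end], "next_page_token": next_token}


def iterate_pages(method, request):
    """Yield pages, copying each next_page_token into the following request."""
    token = None
    while True:
        if token:
            request["page_token"] = token
        response = method(request)
        yield response["matching_jobs"]
        token = response["next_page_token"]
        if not token:
            break


pages = list(iterate_pages(fake_search, {"page_size": 2}))
print(pages)
```

The caller never touches the token; it only iterates, which is the same experience the real iterator gives you.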

If you want a sessionless option, you can use offset plus page size, and pass the current offset back to the user with each AJAX request:

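offset (int) –

Optional. An integer that specifies the current offset (that is, starting result location, amongst the jobs deemed by the API as relevant) in search results. This field is only considered if page_token is unset.

For example, 0 means to return results starting from the first matching job, and 10 means to return from the 11th job. This can be used for pagination (for example, pageSize = 10 and offset = 10 means to return from the second page).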
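The arithmetic behind that is simple enough to sketch. The helper offset_for_page below is a hypothetical name, and it assumes the page number you hand to the user is 1-based.

```python
def offset_for_page(page_number, page_size):
    """Return the zero-based offset of the first result on a 1-based page."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be >= 1")
    return (page_number - 1) * page_size


page_size = 10
second_page_offset = offset_for_page(2, page_size)
print(second_page_offset)

# The result would then be passed through the offset and page_size keyword
# arguments shown in the search_jobs signature in the question, e.g.
# client.search_jobs(parent, request_metadata, offset=second_page_offset, page_size=page_size)
```

Since the offset is only considered when page_token is unset, each such call would need a fresh request rather than a continuation of an existing iterator.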