Question

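According to the pagination documentation, you can page through results by specifying a particular table. But how do I add pagination to a query? For example, if I have the following query: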

    client = bigquery.Client(location='US')
    job_config = bigquery.QueryJobConfig()
    job_config.query_parameters = params
    result = client.query(query, job_config=job_config)

How can I paginate this query to get rows 10 to 20?

Answer 1

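You didn't post your query, but I think you are looking for LIMIT 10 OFFSET 10, which skips the first 10 rows and returns the next 10. Add an ORDER BY as well, since without one the row order (and so the content of each page) is not guaranteed.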
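To make the arithmetic concrete, here is a minimal sketch that turns a 1-based, inclusive row range into LIMIT and OFFSET values. The helper names `row_range_to_limit_offset` and `paginated_sql` are made up for illustration, and the table and column names in the usage note are placeholders. Note that "rows 10 to 20" counted inclusively is 11 rows, whereas LIMIT 10 OFFSET 10 returns rows 11 to 20.

```python
def row_range_to_limit_offset(first_row, last_row):
    """Convert a 1-based, inclusive row range into (limit, offset)."""
    if first_row < 1 or last_row < first_row:
        raise ValueError("expected 1 <= first_row <= last_row")
    limit = last_row - first_row + 1   # number of rows to return
    offset = first_row - 1             # number of rows to skip
    return limit, offset


def paginated_sql(base_query, order_by, first_row, last_row):
    """Append ORDER BY / LIMIT / OFFSET to a query (hypothetical helper).

    base_query and order_by are trusted strings written by the developer;
    only the two integers are computed.
    """
    limit, offset = row_range_to_limit_offset(first_row, last_row)
    return f"{base_query} ORDER BY {order_by} LIMIT {limit} OFFSET {offset}"
```

For example, `paginated_sql("SELECT name FROM t", "name", 11, 20)` produces a query ending in `ORDER BY name LIMIT 10 OFFSET 10`.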

Answer 2

Do you mean something like this, where the limit and offset are passed to the query as parameters?

    from google.cloud import bigquery

    client = bigquery.Client(project="project")

    # LIMIT and OFFSET are supplied as named query parameters.
    query = """SELECT name FROM `dataset` LIMIT @limit OFFSET @offset"""

    limit = 10   # number of rows to return
    offset = 11  # number of rows to skip before the first returned row

    params = [
        bigquery.ScalarQueryParameter('limit', 'INT64', limit),
        bigquery.ScalarQueryParameter('offset', 'INT64', offset),
    ]

    job_config = bigquery.QueryJobConfig()
    job_config.query_parameters = params
    query_job = client.query(query, job_config=job_config)

    # Iterating over the job waits for it to finish and yields the result rows.
    for row in query_job:
        print("{}".format(row.name))

Answer 3

You can achieve this in two ways using BigQuery jobs:

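    String query = "big query here...";
    int startIndex = 10;  // zero-based index of the first row to fetch
    int maxResults = 10;  // number of rows per page

    // Fetches 10 rows starting at zero-based index 10 from the result set.
    // getPaginatedResultCollection is not shown in this answer.
    resultCollection = getPaginatedResultCollection(query, startIndex, maxResults);
    // NOTE: Do what you want with the paged data, i.e. resultCollection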



    /**
     * Polls the status of a BigQuery job and returns the TableReference of
     * its results once the job state is "DONE".
     */
    private static TableReference checkQueryResults(Bigquery bigquery, String projectId, JobReference jobId) throws IOException, InterruptedException {
        while (true) {
            Job pollJob = bigquery.jobs().get(projectId, jobId.getJobId()).execute();
            if (pollJob.getStatus().getState().equals("DONE")) {
                return pollJob.getConfiguration().getQuery().getDestinationTable();
            }
            // Pause for one second before polling the job status again, to
            // reduce unnecessary calls to the BigQuery API and lower overall
            // application bandwidth.
            Thread.sleep(1000);
        }
    }


    /**
     * Reads one page of rows from the table that holds the query results.
     *
     * @param bigquery          BigQuery API client
     * @param completedJob      reference to the destination table of the completed query job
     * @param startIndex        zero-based index of the first row to read
     * @param maxResultsPerPage maximum number of rows to return; defaults to 20 when null
     * @return the requested page of rows
     * @throws Exception if the tabledata.list request fails
     */
    private static ResultCollection displayQueryResults(Bigquery bigquery, TableReference completedJob, int startIndex, Integer maxResultsPerPage) throws Exception {

        maxResultsPerPage = (maxResultsPerPage == null) ? 20 : maxResultsPerPage;
        JSONObject responseMap = new JSONObject();
        List<JSONObject> resultArray = new ArrayList<JSONObject>();
        // NOTE: ResultCollection is the application's own type; a no-argument constructor is assumed here
        ResultCollection resultCollection = new ResultCollection();
        TableDataList queryResult = bigquery.tabledata().list(completedJob.getProjectId(), completedJob.getDatasetId(), completedJob.getTableId())
                .setMaxResults(Long.valueOf(maxResultsPerPage))
                .setStartIndex(BigInteger.valueOf(startIndex))
                .execute();

        // Table table = bigquery.tables().get(completedJob.getProjectId(), completedJob.getDatasetId(), completedJob.getTableId()).execute();
        // NOTE: Schema can be read from table.getSchema().getFields()
        if (CollectionUtils.isNotEmpty(queryResult.getRows())) {
            // NOTE: read the result data from queryResult.getRows() and transform it into whatever model you want, say resultCollection for now
        }

        return resultCollection;
    }
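
Since the question uses the Python client, the same job-based idea can probably be sketched there without polling by hand. This is only a sketch under assumptions: `fetch_row_window` is a hypothetical helper name, and it relies on `QueryJob.result()` in google-cloud-bigquery accepting the `start_index` (zero-based) and `max_results` keyword arguments, so check that against the version you have installed. The `client` argument is expected to be a `bigquery.Client`.

```python
def fetch_row_window(client, query, start_index, max_results, job_config=None):
    """Run `query` and return rows [start_index, start_index + max_results).

    `client` is assumed to behave like google.cloud.bigquery.Client.
    """
    query_job = client.query(query, job_config=job_config)
    # result() blocks until the job is done, then reads only the requested
    # window from the job's result set (start_index is zero-based).
    rows = query_job.result(start_index=start_index, max_results=max_results)
    return list(rows)
```

Called as `fetch_row_window(client, query, 10, 10, job_config=job_config)`, this would read 10 rows starting at zero-based index 10. As with LIMIT/OFFSET, the window is only meaningful if the query itself has an ORDER BY.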
