I am developing an application that runs many queries against BigQuery using the Jobs.Query class. Here is part of my code:
public JObject getData(BigqueryService service, String query)
{
    JobsResource j = service.Jobs;
    QueryRequest qr = new QueryRequest();
    qr.Query = query;
    // Submit the query; the response can come back before the job has finished.
    QueryResponse response = j.Query(qr, projectId).Execute();
    // JobComplete is a nullable bool, so compare against true instead of casting.
    if (response.JobComplete == true)
    {
        return getResults(response.Schema, response.Rows);
    }
    else
    {
        // Poll once per second until the job completes or about 50 seconds have elapsed.
        DateTime start = DateTime.UtcNow;
        while (true)
        {
            GetQueryResultsResponse response2 = service.Jobs.GetQueryResults(projectId, response.JobReference.JobId).Execute();
            if (response2.JobComplete == true)
            {
                return getResults(response2.Schema, response2.Rows);
            }
            // TotalSeconds is the full elapsed time; Seconds is only the 0-59 component.
            TimeSpan total = DateTime.UtcNow - start;
            if (total.TotalSeconds > 50) return getResults(null, null);
            Thread.Sleep(1000);
        }
    }
}
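My problem is that query execution is very slow for the size of the data. Is there any way to improve query speed?
I am talking about a table with a little over 200,000 rows.
Today I recorded a list of query timings. Each entry is the time taken to execute 4 queries.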
-00:00:06.2929905 -00:00:06.8925675 -00:00:05.0319329 -00:00:05.6336228 -00:00:07.2206028 -00:00:05.2911213 -00:00:05.0546701 -00:00:04.3276083 -00:00:05.7575818 -00:00:04.1528799 -00:00:05.2664854 -00:00:05.0738185 -00:00:05.5472279 -00:00:05.1223429 -00:00:04.7509914 -00:00:04.9643928 -00:00:04.5182521 -00:00:04.6950590 -00:00:06.0061839 -00:00:06.7736054 -00:00:06.3931505 -00:00:06.0068689 -00:00:07.2904883 -00:00:04.3762012 -00:00:09.7467363 -00:00:12.9430536 -00:00:11.4525429 -00:00:13.4580112 -00:00:07.2501061 -00:00:11.9368635 -00:00:20.0649572 -00:00:22.8073734 -00:00:33.5651125 -00:00:20.3412234 -00:00:41.2743429 -00:00:46.3231917 -00:01:03.8191158 -00:00:33.4181420 -00:00:42.9427840 -00:00:26.3853840 -00:00:19.3615288 -00:00:20.6219836 -00:00:23.1905747
Does BigQuery run more slowly when more queries are being executed?
Thanks
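EDIT 22/08/2014: After a few days, I have noticed that the queries run slower or faster depending on the time of day. Could these queries be running slowly because the source table is populated via InsertAll() (streaming data)?
Is there any way to run this query faster? (It always takes 4-10 seconds.)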