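https://cloud.google.com/bigquery/docs/creating-partitioned-tables shows how to create a partitioned table in Python. I've been there, I've done that.
Now the question is: how do I do the same thing with the Java API? What is the Java code equivalent to the Python request body below?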
{
  "tableReference": {
    "projectId": "myProject",
    "tableId": "table1",
    "datasetId": "mydataset"
  },
  "timePartitioning": {
    "type": "DAY"
  }
}
Java with the partitioning missing:
Job createTableJob = new Job();
JobConfiguration jobConfiguration = new JobConfiguration();
JobConfigurationLoad loadConfiguration = new JobConfigurationLoad();
createTableJob.setConfiguration(jobConfiguration);
jobConfiguration.setLoad(loadConfiguration);
TableReference tableReference = new TableReference()
    .setProjectId("myProject")
    .setDatasetId("mydataset")
    .setTableId("table1");
loadConfiguration.setDestinationTable(tableReference);
// What should be placed here to set DAY timePartitioning?
I am using the latest API version from the Maven Central Repository.
Answer 0: (score: 3)
https://cloud.google.com/bigquery/docs/reference/v2/tables/insert https://cloud.google.com/bigquery/docs/reference/v2/tables#resource
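Sample Java code:
// Assumes `bigquery` is an already authorized Bigquery client.
String projectId = "";
String datasetId = "";
String tableId = "";
Table content = new Table();
// tables.insert needs a tableReference naming the table to create.
content.setTableReference(new TableReference()
    .setProjectId(projectId)
    .setDatasetId(datasetId)
    .setTableId(tableId));
TimePartitioning timePartitioning = new TimePartitioning();
timePartitioning.setType("DAY");
// Optional partition lifetime in milliseconds; 1L is only a placeholder.
timePartitioning.setExpirationMs(1L);
content.setTimePartitioning(timePartitioning);
Bigquery.Tables.Insert request = bigquery.tables().insert(projectId, datasetId, content);
Table response = request.execute();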
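The expirationMs value is a duration in milliseconds, so a realistic partition lifetime is a large number that is easy to get wrong by hand. A minimal sketch of the conversion, using only the JDK; the class and method names are hypothetical and not part of the BigQuery client library, and the 90-day retention is just an illustrative value:

```java
import java.time.Duration;

// Hypothetical helper for illustration; not part of any BigQuery library.
final class PartitionExpiration {
    private PartitionExpiration() {}

    /** Converts a retention period in whole days to milliseconds. */
    static long daysToMillis(long days) {
        if (days <= 0) {
            throw new IllegalArgumentException("days must be positive: " + days);
        }
        return Duration.ofDays(days).toMillis();
    }

    public static void main(String[] args) {
        // The result could be passed to timePartitioning.setExpirationMs(...).
        System.out.println(daysToMillis(90));
    }
}
```

If partitions should never expire, the expiration can presumably just be left unset.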
Answer 1: (score: 3)
Let me share a more up-to-date way to create a partitioned table (works with Java API 0.32):
Schema schema = Schema.of(newFields);
TimePartitioning timePartitioning = TimePartitioning.of(TimePartitioning.Type.DAY);
TableDefinition tableDefinition = StandardTableDefinition.newBuilder()
    .setSchema(schema)
    .setTimePartitioning(timePartitioning)
    .build();
TableId tableId = TableId.of(projectName, datasetName, tableName);
TableInfo tableInfo = TableInfo.newBuilder(tableId, tableDefinition).build();
bigQuery.create(tableInfo);
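Update 19/03/2018:
To load data into a specific partition (or to insert the result of a SELECT into a specific partition), you only need to append that partition's date (using the suffix $yyyymmdd) to the table name when constructing the TableId object. Here is an example: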
private void runJob(JobConfiguration jobConf) {
    BIG_QUERY.create(JobInfo.of(jobConf));
}

private TableId getTableToOverwrite(String tableToOverwrite, String partition) {
    return TableId.of(PROJECT, DATASET, tableToOverwrite + "$" + partition);
}

void loadInDayPartition(String dayUrl, String dayPartition) {
    LoadJobConfiguration loadConf = LoadJobConfiguration
        .newBuilder(getTableToOverwrite(TABLE_LEGACY, dayPartition), dayUrl, FormatOptions.avro())
        .build();
    runJob(loadConf);
}
I don't have an example of streaming data into a partitioned table, but I guess it is similar.
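The $yyyymmdd suffix described above can be built from a date instead of being assembled by hand. A minimal sketch using only the JDK; the class name, method name and the table name "table1" are illustrative assumptions, and the resulting string is what would be passed as the table name to TableId.of:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of any BigQuery library.
final class PartitionDecorator {
    private PartitionDecorator() {}

    /** Returns the table name followed by "$" and the day as yyyymmdd. */
    static String decorate(String tableName, LocalDate day) {
        // BASIC_ISO_DATE formats a LocalDate without separators, e.g. 20180319.
        return tableName + "$" + day.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    public static void main(String[] args) {
        System.out.println(decorate("table1", LocalDate.of(2018, 3, 19)));
    }
}
```

Formatting from a LocalDate keeps the zero padding of month and day correct, which is easy to miss with string concatenation.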
Answer 2: (score: 0)
If you want to partition by a field, the code looks like this.
Schema schema = Schema.of(fields);
TimePartitioning timePartitioning = TimePartitioning.newBuilder(TimePartitioning.Type.DAY)
    .setField("partition_column")
    .build();
TableDefinition tableDefinition = StandardTableDefinition.newBuilder()
    .setSchema(schema)
    .setTimePartitioning(timePartitioning)
    .build();
TableId tableId = TableId.of(projectName, datasetName, tableName);
TableInfo tableInfo = TableInfo.newBuilder(tableId, tableDefinition).build();
bigQuery.create(tableInfo);
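With field-based partitioning, each row lands in the daily partition that matches the value of the partitioning column. Assuming partition_column is a TIMESTAMP column, and assuming (as I understand the BigQuery documentation) that day boundaries are taken in UTC, the partition day for a given instant can be predicted as sketched below. The class and method names are hypothetical and use only the JDK:

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical helper for illustration; not part of any BigQuery library.
final class PartitionDay {
    private PartitionDay() {}

    /** Returns the calendar day of the instant in UTC (assumed partition boundary). */
    static LocalDate utcDay(Instant timestamp) {
        return timestamp.atOffset(ZoneOffset.UTC).toLocalDate();
    }

    public static void main(String[] args) {
        // 23:30 at UTC-05:00 is already the next day in UTC.
        Instant ts = OffsetDateTime.parse("2018-03-19T23:30:00-05:00").toInstant();
        System.out.println(utcDay(ts));
    }
}
```

This matters mainly when local timestamps near midnight are written: the row may appear in a different day's partition than the local date suggests.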