Hi, I want to convert some transformation logic into a Dataflow job for analysis purposes. Below is the query I want to convert.
select max(trans.TRANSACTIONS_ID) TRANSACTION_ID
, trans.ACCOUNT_ID ACCOUNT_ID
, max(dim.ACCOUNT_NAME) ACCOUNT_NAME
, max(trans.DATE_ID) DATE_ID
, max(trans.CR_DR_INDICATOR) CR_DR_INDICATOR
, max(trans.TRANS_CODE) TRANS_CODE
, SUM(trans.AMOUNT) AMOUNT
, max(trans.BALANCE) BALANCE
, max(trans.TRANSACTION_TYPE) TRANSACTION_TYPE
, max(trans.BANK) BANK
, max(trans.ACCOUNT) ACCOUNT
from `xxxxxxx.costing_uscase.TRANSACTIONS_MASTER_DATAFLOW_TEST` trans, `xxxxxxxxxxx.costing_uscase.ACCOUNTS_MASTER_DATAFLOW_TEST_2` dim
where dim.ACCOUNT_ID = trans.ACCOUNT_ID
group by trans.ACCOUNT_ID;
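I have read the data from both tables using BigQueryTableIO.read and applied CoGroupBy, but I am unsure how to perform the aggregations over the resulting iterables of table rows.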
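To make the question concrete, here is a minimal sketch of the per-key aggregation I think is needed, written in plain Java so it runs without the Beam SDK. It relies on several assumptions: rows are modelled as `Map<String, Object>` (BigQuery's `TableRow` is itself a `Map<String, Object>`), all values in a column share one comparable type, and the class and method names (`AccountAggregation`, `aggregate`, `maxOf`, `sumOf`) are hypothetical, not part of any library. In a real pipeline the two iterables would presumably come from `CoGbkResult.getAll(tag)` inside a `DoFn`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: aggregates the grouped rows for ONE ACCOUNT_ID.
final class AccountAggregation {

    // SQL max(): largest non-null value of a column. Assumes every value in
    // the column has the same Comparable type. Note that if numbers arrive
    // as Strings, this compares them lexicographically, not numerically.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static Object maxOf(Iterable<? extends Map<String, Object>> rows, String column) {
        Comparable best = null;
        for (Map<String, Object> row : rows) {
            Object value = row.get(column);
            if (value == null) {
                continue; // max() ignores NULLs
            }
            Comparable candidate = (Comparable) value;
            if (best == null || candidate.compareTo(best) > 0) {
                best = candidate;
            }
        }
        return best;
    }

    // SQL SUM(): accepts Number values or numeric Strings, skips NULLs.
    // Unlike SQL, this returns 0.0 (not NULL) when every value is NULL.
    static double sumOf(Iterable<? extends Map<String, Object>> rows, String column) {
        double total = 0.0;
        for (Map<String, Object> row : rows) {
            Object value = row.get(column);
            if (value == null) {
                continue;
            }
            total += (value instanceof Number)
                    ? ((Number) value).doubleValue()
                    : Double.parseDouble(value.toString());
        }
        return total;
    }

    // Returns one output row for the account, or null when either side is
    // empty (the query is an inner join, so such keys produce no row).
    static Map<String, Object> aggregate(
            String accountId,
            Iterable<? extends Map<String, Object>> transRows,
            Iterable<? extends Map<String, Object>> dimRows) {
        int dimCount = 0;
        for (Map<String, Object> ignored : dimRows) {
            dimCount++;
        }
        if (dimCount == 0 || !transRows.iterator().hasNext()) {
            return null;
        }
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("TRANSACTION_ID", maxOf(transRows, "TRANSACTIONS_ID"));
        out.put("ACCOUNT_ID", accountId);
        out.put("ACCOUNT_NAME", maxOf(dimRows, "ACCOUNT_NAME"));
        out.put("DATE_ID", maxOf(transRows, "DATE_ID"));
        out.put("CR_DR_INDICATOR", maxOf(transRows, "CR_DR_INDICATOR"));
        out.put("TRANS_CODE", maxOf(transRows, "TRANS_CODE"));
        // The join repeats each transaction once per matching dim row,
        // so the SQL SUM is scaled by the number of dim rows for this key.
        out.put("AMOUNT", sumOf(transRows, "AMOUNT") * dimCount);
        out.put("BALANCE", maxOf(transRows, "BALANCE"));
        out.put("TRANSACTION_TYPE", maxOf(transRows, "TRANSACTION_TYPE"));
        out.put("BANK", maxOf(transRows, "BANK"));
        out.put("ACCOUNT", maxOf(transRows, "ACCOUNT"));
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> trans = new ArrayList<>();
        Map<String, Object> t = new LinkedHashMap<>();
        t.put("TRANSACTIONS_ID", 1L);
        t.put("AMOUNT", 10.5);
        trans.add(t);
        List<Map<String, Object>> dim = new ArrayList<>();
        Map<String, Object> d = new LinkedHashMap<>();
        d.put("ACCOUNT_NAME", "Savings");
        dim.add(d);
        System.out.println(aggregate("ACC-1", trans, dim));
    }
}
```

If this is the right idea, I assume the body of `processElement` would call `aggregate` with the key and the two iterables from the `CoGbkResult`, skip keys where it returns null, and copy the resulting map into a `TableRow` for output. Is that a reasonable way to do it, or is there a more idiomatic approach?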