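I have the following Hive script, which I run on a 16-node cluster. It takes four hours to execute; is there any way to reduce that time? Here is the query: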
INSERT INTO ${DestDBName}.${COUNTRY}SA_PRODUCT
SELECT '${COUNTRY}', PRDYR.YEAR, PRD.PERIOD_TYPE, RTS.PRODUCT_ID, RTS.SALES_UNITS
FROM ${DBNAME}.RETAIL_TREND_SALES RTS
JOIN ${DBNAME}.PERIOD PRD
ON RTS.PERIOD_ID = PRD.PERIOD_ID
JOIN ${DBNAME}.PERIOD_DETAIL PRDYR
ON RTS.PERIOD_ID = PRDYR.PERIOD_ID;
INSERT INTO ${DestDBName}.${COUNTRY}SA_STORE
SELECT '${COUNTRY}', PRDYR.YEAR, PRD.PERIOD_TYPE, MS.STORE_ID
FROM ${DBNAME}.MARKET_STORE MS
JOIN ${DBNAME}.PERIOD PRD
ON MS.PERIOD_ID = PRD.PERIOD_ID
JOIN ${DBNAME}.PERIOD_DETAIL PRDYR
ON MS.PERIOD_ID = PRDYR.PERIOD_ID; >> ${DataLog}
Note: on the backend, this queries several terabytes of data.
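As a possible starting point, here is a sketch of session settings that could be placed at the top of the script. It assumes that PERIOD and PERIOD_DETAIL are small lookup tables that fit in memory, which is not stated above; if they are large, the map-join settings will not help. The size threshold is an illustrative value, not a recommendation.

```sql
-- Assumption: PERIOD and PERIOD_DETAIL are small dimension tables.
-- Let Hive convert the joins to map-side joins so the large fact table
-- (RETAIL_TREND_SALES or MARKET_STORE) is not shuffled for each join.
SET hive.auto.convert.join=true;
SET hive.auto.convert.join.noconditionaltask=true;
-- Illustrative threshold in bytes for the combined size of the small tables.
SET hive.auto.convert.join.noconditionaltask.size=100000000;
-- Compress intermediate data passed between stages.
SET hive.exec.compress.intermediate=true;
```

Running EXPLAIN on each INSERT ... SELECT before and after applying these settings should show whether the joins are actually planned as map joins.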