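What is the best way to combine multiple MySQL tables in R? For example, I need to rbind 14 large MySQL tables (each more than 100k rows by 100 columns). I tried the approach below, but it used up most of my memory and the MySQL connection timed out. Is there an alternative? I don't need to fetch the full tables; I only need to group the combined data by a few variables and compute some metrics.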
station_tbl_t <- dbSendQuery(my_db, "select * from tbl_r3_300ft
union all
select * from tbl_r4_350ft
union all
select * from tbl_r5_400ft
union all
select * from tbl_r6_500ft
union all
select * from tbl_r7_600ft
union all
select * from tbl_r8_700ft
union all
select * from tbl_r9_800ft
union all
select * from tbl_r10_900ft
union all
select * from tbl_r11_1000ft
union all
select * from tbl_r12_1200ft
union all
select * from tbl_r13_1400ft
union all
select * from tbl_r14_1600ft
union all
select * from tbl_r15_1800ft
union all
select * from tbl_r16_2000ft
")
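Answer 0 (score: 2)
Consider importing the MySQL tables one at a time and then row-binding them in R. Be sure to select only the columns you need, to cut down on overhead: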
SELECT i.id, i.item_id, v.item_to_map_id,
COALESCE( SUM(CAST(CAST(v.score AS char) AS SIGNED)), 0 ) AS score
FROM item_to_map i LEFT JOIN
vote_item v
ON i.id = v.item_to_map_id
GROUP BY i.id, i.item_id, v.item_to_map_id;
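Applied to the tables in the question, the idea might look like the sketch below. It assumes the DBI package and the open connection `my_db` from the question; the grouping column `station_id`, the metric column `value`, and the helper names `build_query` and `fetch_grouped` are hypothetical placeholders, not names from the original post. Each table is aggregated inside MySQL, so only the small grouped results travel to R, where they are row-bound and aggregated once more.

```r
# Sketch only: station_id and value are hypothetical column names;
# the table names and my_db come from the question.
tables <- c("tbl_r3_300ft", "tbl_r4_350ft", "tbl_r5_400ft")  # extend to all 14

# Build one aggregate query per table so MySQL returns only grouped rows.
build_query <- function(tbl, group_cols, metric_col) {
  groups <- paste(group_cols, collapse = ", ")
  sprintf(
    "SELECT %s, SUM(%s) AS total, COUNT(*) AS n FROM %s GROUP BY %s",
    groups, metric_col, tbl, groups
  )
}

# Run the queries one table at a time, then row-bind the small results.
fetch_grouped <- function(con, tables, group_cols, metric_col) {
  parts <- lapply(tables, function(tbl) {
    DBI::dbGetQuery(con, build_query(tbl, group_cols, metric_col))
  })
  combined <- do.call(rbind, parts)
  # Some drivers return COUNT(*) as a 64-bit integer; convert to plain numeric.
  combined$total <- as.numeric(combined$total)
  combined$n <- as.numeric(combined$n)
  # Sums and counts can be added across tables; the mean is derived afterwards.
  # Note: aggregate() drops rows whose grouping key is NA.
  out <- aggregate(combined[c("total", "n")], by = combined[group_cols], FUN = sum)
  out$mean <- out$total / out$n
  out
}

# Usage (requires an open DBI connection):
# station_summary <- fetch_grouped(my_db, tables, "station_id", "value")
```

Carrying sums and counts, and not per-table averages, is deliberate: averages from separate tables cannot simply be averaged again, while sums and counts can be added and the mean recomputed at the end.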