Best way to combine multiple MySQL tables in R

Time: 2017-07-04 23:21:10

Tags: mysql r dplyr rmysql

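What is the best way to combine multiple MySQL tables in R? For example, I need to rbind 14 large MySQL tables (each > 100k rows by 100 columns). I tried the approach below, which consumed most of my memory and timed out on the MySQL side. Is there an alternative solution? I don't need to fetch the entire tables; I only need to group the combined data by a few variables and calculate some metrics.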

station_tbl_t <- dbSendQuery(my_db, "select * from tbl_r3_300ft
                  union all
                  select * from tbl_r4_350ft
                  union all
                  select * from tbl_r5_400ft
                  union all
                  select * from tbl_r6_500ft
                  union all
                  select * from tbl_r7_600ft
                  union all
                  select * from tbl_r8_700ft
                  union all
                  select * from tbl_r9_800ft
                  union all
                  select * from tbl_r10_900ft
                  union all
                  select * from tbl_r11_1000ft
                  union all
                  select * from tbl_r12_1200ft
                  union all
                  select * from tbl_r13_1400ft
                  union all
                  select * from tbl_r14_1600ft
                  union all
                  select * from tbl_r15_1800ft
                  union all
                  select * from tbl_r16_2000ft
                  ")

1 Answer:

Answer 0: (score: 2)

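Consider importing the MySQL table data iteratively and then row-binding the pieces in R. Also be sure to select only the columns you need, to save overhead: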

SELECT i.id, i.item_id, v.item_to_map_id,
       COALESCE( SUM(CAST(CAST(v.score AS char) AS SIGNED)), 0 ) AS score
FROM item_to_map i LEFT JOIN
     vote_item v
     ON i.id = v.item_to_map_id
GROUP BY i.id, i.item_id, v.item_to_map_id;
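The iterative import described in this answer could be sketched in R roughly as follows. This is only a minimal sketch under assumptions: the table names are taken from the question, `my_db` is the connection object from the question, `group_var` and `metric_var` are hypothetical placeholder column names, and the helpers `build_query` and `import_tables` (including the `fetch` argument) are invented here for illustration.

```r
# Table names as listed in the question.
tables <- c("tbl_r3_300ft", "tbl_r4_350ft", "tbl_r5_400ft", "tbl_r6_500ft",
            "tbl_r7_600ft", "tbl_r8_700ft", "tbl_r9_800ft", "tbl_r10_900ft",
            "tbl_r11_1000ft", "tbl_r12_1200ft", "tbl_r13_1400ft",
            "tbl_r14_1600ft", "tbl_r15_1800ft", "tbl_r16_2000ft")

# Hypothetical helper: build a SELECT that names only the needed columns
# instead of using `select *`.
build_query <- function(tbl, cols) {
  sprintf("SELECT %s FROM %s", paste(cols, collapse = ", "), tbl)
}

# Hypothetical helper: fetch each table separately, tag the rows with the
# table they came from, then row-bind all pieces into one data frame.
# `fetch` defaults to DBI::dbGetQuery (requires the DBI package plus a
# MySQL driver such as RMySQL); it is a parameter only so that the
# function can be tried without a live database.
import_tables <- function(con, tables, cols, fetch = DBI::dbGetQuery) {
  pieces <- lapply(tables, function(tbl) {
    df <- fetch(con, build_query(tbl, cols))
    df$source_table <- tbl
    df
  })
  do.call(rbind, pieces)
}
```

With a live connection, usage might look like `station_tbl <- import_tables(my_db, tables, c("group_var", "metric_var"))`, after which the grouping could be done in R, for example with `aggregate(metric_var ~ group_var, data = station_tbl, FUN = mean)`. Fetching one table at a time keeps each query small, which may help with the timeout, although the combined data frame still has to fit in memory.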