Running Map() in parallel on multiple cores?

Date: 2018-05-09 08:21:54

Tags: r parallel-processing

Given a data frame with two columns, each holding a JSON list as a string, like this:

col1
"[34454,31843,33239,34885,34352,42294,26603,55449,34710,49235,37558,82942,38028,58170,42772,27700,28464,28917,31902,41082,37089,67366,48616,46706,31464,30308,33832,30485,50838,38225,59758,47311,33747,32972,34903,32729,52274,38895,45899,32511,31189,33756,32684,47994,38019,31623,31751,32144,32398,44909,36542,160950,59383,53569,39797,36029,15031,15031,113913,33540,106166,62518,36407,44354,36237,47610]"

...

col2
"[14,7,5,3,4,0,1,7,2,3,1,18,13,4,23,7,8,8,11,18,15,6,2,10,2,4,8,5,11,5,1,5,2,4,3,1,6,8,5,5,3,1,1,4,5,2,9,3,4,11,11,14,3,12,2,6,0,0,15,1,18,5,3,6,6,6]"

I have shown a single row as an example, but in reality I have millions of rows of col1 and col2.

I want to apply the following formula: (col1 / number1) / (col2 / number2), where number1 and number2 are just numbers such as 1, 3, 4, ...
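To make the grouping explicit, here is a minimal sketch of the formula on the first few values of the sample row. The values chosen for number1 and number2 are arbitrary and only for illustration. Note that the sample col2 contains zeros, so some results will be Inf.

```r
# First three values of the sample row shown above
col1 <- c(34454, 31843, 33239)
col2 <- c(14, 7, 5)

# Illustrative values only; the real number1 / number2 are not given here
number1 <- 3
number2 <- 4

# Element-wise: (col1 / number1) / (col2 / number2)
result <- (col1 / number1) / (col2 / number2)

# The sixth pair in the sample row is 42294 and 0; dividing by zero gives Inf in R
zero_case <- (42294 / number1) / (0 / number2)
```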

Below is what I have done so far. I need your help to make this run faster using multi-core parallel computation, for example with parallel::parSapply.

Map(`/`, lapply(df$col1, function(i) {fromJSON(i) / number1}), 
              lapply(df$col2, function(i) {fromJSON(i) / number2}))

Please advise how to run Map, or any equivalent, in parallel across my cores.
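One possible direction, as a sketch rather than a tested solution: split the rows into one chunk per worker and process each chunk with parallel::parLapply, so each worker receives its data in a single transfer instead of one row at a time. This assumes fromJSON comes from the jsonlite package (the question does not say which package is used) and that jsonlite is installed for the worker processes. The small df, the values of number1 and number2, and the worker count are hypothetical stand-ins.

```r
library(parallel)

# Hypothetical stand-in for the real data frame
df <- data.frame(
  col1 = c("[10,20,30]", "[4,8]"),
  col2 = c("[1,2,5]", "[2,4]"),
  stringsAsFactors = FALSE
)
number1 <- 2  # illustrative values only
number2 <- 1

n_workers <- 2  # in practice, something like detectCores() - 1
cl <- makeCluster(n_workers)

# One chunk of rows per worker
idx <- splitIndices(nrow(df), n_workers)
chunks <- lapply(idx, function(i) df[i, , drop = FALSE])

# Each worker parses its chunk and applies the formula row by row
res_chunks <- parLapply(cl, chunks, function(d, n1, n2) {
  mapply(
    function(a, b) (jsonlite::fromJSON(a) / n1) / (jsonlite::fromJSON(b) / n2),
    d$col1, d$col2,
    SIMPLIFY = FALSE, USE.NAMES = FALSE
  )
}, n1 = number1, n2 = number2)

stopCluster(cl)

# Flatten back to one list with one numeric vector per row, in the original order
res <- unlist(res_chunks, recursive = FALSE, use.names = FALSE)
```

On Linux or macOS, parallel::mclapply over the same chunks would avoid creating a cluster, since it forks the current process; forking is not available on Windows. Whether either variant is actually faster depends on how much of the time goes into JSON parsing versus moving data to and from the workers, so it is worth timing on a subset first.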

0 answers:

No answers yet