I have the following pandas DataFrames, described below along with the steps I am running on them.
[python 3.5.2, pandas 0.24.1, numpy 1.16.1, scipy 1.2.0]
data_pd
nrows: 1,032,749,584
cols: ['mem_id':np.uint32, 'offset':np.uint16 , 'ctype':string, 'code':string]
obsmap_pd
nrows: 10,887,542
cols: ['mem_id':np.uint32, 'obs_id':np.uint32]
(obs_id has consecutive integers between 0 and obsmap_pd nrows)
varmap_pd
nrows: 4,596
cols: ['ctype':string, 'code': string, 'var_id':np.uint16]
(var_id has consecutive integers between 0 and varmap_pd nrows)
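To make the setup reproducible at a small scale, here is a minimal sketch that builds toy versions of the three DataFrames. Only the column names and dtypes come from the description above; the row values (such as 'dx' and 'A1') are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the real frames. The values are made up; only the
# column names and dtypes follow the description above.
data_pd = pd.DataFrame({
    "mem_id": np.array([10, 10, 10, 20, 20], dtype=np.uint32),
    "offset": np.array([1, 1, 2, 5, 6], dtype=np.uint16),
    "ctype": ["dx", "dx", "dx", "rx", "rx"],
    "code": ["A1", "A1", "A1", "B2", "B2"],
})

# One row per member, mapping mem_id to a consecutive obs_id.
obsmap_pd = pd.DataFrame({
    "mem_id": np.array([10, 20], dtype=np.uint32),
    "obs_id": np.array([0, 1], dtype=np.uint32),
})

# One row per (ctype, code) pair, mapping it to a consecutive var_id.
varmap_pd = pd.DataFrame({
    "ctype": ["dx", "rx"],
    "code": ["A1", "B2"],
    "var_id": np.array([0, 1], dtype=np.uint16),
})
```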
The purpose of the code below is to prepare the input for a scipy csc_matrix, which is created in the next step.
***
sparse_pd = data_pd.groupby(['mem_id','ctype','code'])['offset'].nunique().reset_index(name='value')
sparse_pd['value'] = sparse_pd['value'].astype(np.uint16)
sparse_pd = pd.merge(pd.merge(sparse_pd, obsmap_pd, on='mem_id', sort=False),
                     varmap_pd, on=['ctype','code'], sort=False)[['obs_id','var_id','value']]
***
Creating the csc_matrix is very fast, but the three lines of pandas code (between the *** markers) take 25.7 minutes. Any ideas on how to speed this up?
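For context, the csc_matrix step that follows looks roughly like the sketch below. It assumes the pandas step has produced a frame with the columns obs_id, var_id and value; the toy values and the names n_obs, n_var and mat are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csc_matrix

# Hypothetical output of the pandas step: one row per (obs_id, var_id) pair.
sparse_pd = pd.DataFrame({
    "obs_id": np.array([0, 1, 1], dtype=np.uint32),
    "var_id": np.array([0, 0, 1], dtype=np.uint16),
    "value": np.array([2, 1, 3], dtype=np.uint16),
})

# In the real data these would be the row counts of obsmap_pd and varmap_pd.
n_obs, n_var = 2, 2

# Build the matrix from (value, (row index, column index)) triples.
mat = csc_matrix(
    (sparse_pd["value"].values,
     (sparse_pd["obs_id"].values, sparse_pd["var_id"].values)),
    shape=(n_obs, n_var),
)
```

Because obs_id and var_id are consecutive integers starting at 0, they can be used directly as row and column indices.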