I have a program that I want to parallelize. It has a huge dataframe, and I would like to wrap three of its columns (ints/longs) in multiprocessing.sharedctypes.
import pandas as pd
from io import StringIO
s = """Chromosome Start End Name Score Strand
0 chr1 9916 10115 HWI-ST216_313:3:1203:10227:6568 1 -
1 chr1 9939 10138 HWI-ST216_313:3:2301:15791:16298 1 -
2 chr1 9951 10150 HWI-ST216_313:3:2205:20086:33508 1 -
3 chr1 9953 10152 HWI-ST216_313:3:1305:6975:102491 1 -
4 chr1 9978 10177 HWI-ST216_313:3:1204:5599:113305 1 -
5 chr1 10001 10200 HWI-ST216_313:3:1102:14019:151362 1 -
6 chr1 10024 10223 HWI-ST216_313:3:2201:5209:155139 1 -
7 chr1 10127 10326 HWI-ST216_313:3:2207:7406:122346 1 -
8 chr1 10241 10440 HWI-ST216_313:3:1302:4516:156396 1 -
9 chr1 10246 10445 HWI-ST216_313:3:1207:4315:142177 1 -
10 chr1 10362 10561 HWI-ST216_313:3:2105:9987:89676 1 -
11 chr1 14742 14941 HWI-ST216_313:3:1305:10541:157562 1 -
12 chr1 14778 14977 HWI-ST216_313:3:1301:12144:2578 1 -
13 chr1 15050 15249 HWI-ST216_313:3:2305:12846:175438 1 -
14 chr1 17310 17509 HWI-ST216_313:3:1102:12936:74441 1 -
15 chr1 19073 19272 HWI-ST216_313:3:1106:16557:31054 1 -
16 chr1 19959 20158 HWI-ST216_313:3:1205:9775:191873 1 -
17 chr1 56119 56318 HWI-ST216_313:3:1103:16077:79427 1 -
18 chr1 63481 63680 HWI-ST216_313:3:2204:10232:155351 1 -
19 chr1 69386 69585 HWI-ST216_313:3:1106:2845:84540 1 -"""
df = pd.read_table(StringIO(s), index_col=0, sep=r"\s+", header=0)  # raw string avoids an invalid-escape warning
Where do I go from here? The columns I want to wrap are df.index.values and df[["Start", "End"]].values, which together have the shape (3, 20).
Ultimately, I would like to be able to process these in parallel without copying any data.
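For reference, here is a minimal sketch of the kind of setup I have in mind, not a tested solution. It assumes the three columns are stacked into one int64 array, copied once into a RawArray, and then viewed through np.frombuffer so that workers read the same memory. The helper names (to_shared, as_numpy, init_worker, interval_length, lengths_in_parallel) are my own placeholders, and the small sample array stands in for the real columns.

```python
import ctypes
from multiprocessing import Pool
from multiprocessing.sharedctypes import RawArray

import numpy as np

_shared = {}  # per-worker globals, filled in by the pool initializer


def to_shared(arr):
    """Copy a 2-D integer array into a RawArray once; return (buffer, shape)."""
    arr = np.ascontiguousarray(arr, dtype=np.int64)
    raw = RawArray(ctypes.c_int64, arr.size)
    # Write through a NumPy view so the values land in the shared buffer.
    np.frombuffer(raw, dtype=np.int64).reshape(arr.shape)[:] = arr
    return raw, arr.shape


def as_numpy(raw, shape):
    """View the shared buffer as a NumPy array without copying it."""
    return np.frombuffer(raw, dtype=np.int64).reshape(shape)


def init_worker(raw, shape):
    """Runs once in each worker: attach a NumPy view to the shared buffer."""
    _shared["data"] = as_numpy(raw, shape)


def interval_length(i):
    """Example task: End - Start for column i of the shared (3, n) array."""
    index, start, end = _shared["data"][:, i]
    return int(end - start)


def lengths_in_parallel(raw, shape, processes=2):
    """Map the example task over all columns using a worker pool."""
    with Pool(processes, initializer=init_worker, initargs=(raw, shape)) as pool:
        return pool.map(interval_length, range(shape[1]))


# Stand-in for np.vstack([df.index.values, df["Start"].values, df["End"].values]),
# using the first three rows of the sample data above.
columns = np.array([[0, 1, 2],
                    [9916, 9939, 9951],
                    [10115, 10138, 10150]])
raw, shape = to_shared(columns)
view = as_numpy(raw, shape)
```

With the real dataframe, the stacked array would have shape (3, 20), and lengths_in_parallel(raw, shape) would be called under an if __name__ == "__main__": guard. The one copy into the RawArray seems unavoidable to me; after that, each worker should only hold a view of the buffer.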