Converting binary strings to individual bits in a Python DataFrame

Time: 2017-10-16 17:07:31

Tags: python python-3.x pandas bit-manipulation bit

I have a pandas DataFrame that looks like this and continues for about 10k rows:

     Lbl #             Value                          Time
16    160   0-00-000-0000-0000-0000-0000-00    000:00:00:00.206948   
17    270   0-00-000-0000-0001-1010-0110-00    000:00:00:00.212948   
18    271   1-00-000-0000-0000-0110-1110-00    000:00:00:00.215828   
19    272   0-00-001-1000-0111-1111-1000-00    000:00:00:00.218708   
20    273   1-00-000-0000-0000-0111-1110-00    000:00:00:00.221588   
21    274   0-00-000-0000-0000-1001-0110-00    000:00:00:00.224468   
22    275   0-00-001-1111-0000-0000-0000-00    000:00:00:00.227348   
23    276   1-00-000-0000-0000-0000-0000-00    000:00:00:00.233428   
24    277   0-00-000-0000-0000-0000-0000-00    000:00:00:00.236308   
29    334   0-11-000-0000-0000-0000-0000-00    000:00:00:00.253900   
63    160   0-00-000-0000-0000-0000-0000-00    000:00:00:00.458692  

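How can I go into each 'Value' entry and break it down into its 24 corresponding bits? The end goal is to be able to plot label 160, bit 19 over the course of the data file, along with some other analysis.

Thanks.

Edit: MaxU's answer worked. For future visitors, the final code I ended up with is: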

import numpy as np
import pandas as pd
df_bits = df_binary.Value.str.replace('-','').str.extractall(r'(\d)').unstack().astype(np.int8).add_prefix('b')
df_binary = pd.concat([df_binary, df_bits], axis=1)
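
For the stated goal (label 160, bit 19), here is a minimal sketch of how the bit columns might be used once they exist. It makes several assumptions: the sample frame is made up for illustration, the label column is named `Lbl #`, and "bit 19" means column `b19`, counting from the leftmost character starting at 0. It also selects column `0` of the `extractall` result before unstacking so that the bit columns have a single level; that is a small variation, not the exact code above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample in the same shape as the question's data.
df = pd.DataFrame({
    "Lbl #": [160, 270, 271, 160],
    "Value": [
        "0-00-000-0000-0000-0000-0000-00",
        "0-00-000-0000-0001-1010-0110-00",
        "1-00-000-0000-0000-0110-1110-00",
        "0-00-000-0000-0000-0000-0100-00",
    ],
})

bits = (
    df["Value"]
    .str.replace("-", "", regex=False)  # drop the separators
    .str.extractall(r"(\d)")[0]         # Series indexed by (row, match)
    .unstack()                          # one column per character position
    .astype(np.int8)
    .add_prefix("b")                    # columns b0 .. b23
)
bits.columns.name = None

out = pd.concat([df, bits], axis=1)

# Bit 19 for every row carrying label 160 (assumed numbering: b0 = leftmost).
bit19_for_160 = out.loc[out["Lbl #"] == 160, "b19"]
```

`bit19_for_160` keeps the original row index, so it can be lined up against the `Time` column for plotting.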

1 Answer:

Answer 0: (score: 3)

IIUC:

In [43]: df
Out[43]:
    Lbl    #                            Value                 Time
0    16  160  0-00-000-0000-0000-0000-0000-00  000:00:00:00.206948
1    17  270  0-00-000-0000-0001-1010-0110-00  000:00:00:00.212948
2    18  271  1-00-000-0000-0000-0110-1110-00  000:00:00:00.215828
3    19  272  0-00-001-1000-0111-1111-1000-00  000:00:00:00.218708
4    20  273  1-00-000-0000-0000-0111-1110-00  000:00:00:00.221588
5    21  274  0-00-000-0000-0000-1001-0110-00  000:00:00:00.224468
6    22  275  0-00-001-1111-0000-0000-0000-00  000:00:00:00.227348
7    23  276  1-00-000-0000-0000-0000-0000-00  000:00:00:00.233428
8    24  277  0-00-000-0000-0000-0000-0000-00  000:00:00:00.236308
9    29  334  0-11-000-0000-0000-0000-0000-00  000:00:00:00.253900
10   63  160  0-00-000-0000-0000-0000-0000-00  000:00:00:00.458692

In [44]: df.Value.str.replace('-','').str.extractall(r'(\d)').unstack().astype(np.int8).add_prefix('b')
Out[44]:
      b0                            ...
match b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ... b14 b15 b16 b17 b18 b19 b20 b21 b22 b23
0      0  0  0  0  0  0  0  0  0  0 ...   0   0   0   0   0   0   0   0   0   0
1      0  0  0  0  0  0  0  0  0  0 ...   1   0   1   0   0   1   1   0   0   0
2      1  0  0  0  0  0  0  0  0  0 ...   0   1   1   0   1   1   1   0   0   0
3      0  0  0  0  0  1  1  0  0  0 ...   1   1   1   1   1   0   0   0   0   0
4      1  0  0  0  0  0  0  0  0  0 ...   0   1   1   1   1   1   1   0   0   0
5      0  0  0  0  0  0  0  0  0  0 ...   1   0   0   1   0   1   1   0   0   0
6      0  0  0  0  0  1  1  1  1  1 ...   0   0   0   0   0   0   0   0   0   0
7      1  0  0  0  0  0  0  0  0  0 ...   0   0   0   0   0   0   0   0   0   0
8      0  0  0  0  0  0  0  0  0  0 ...   0   0   0   0   0   0   0   0   0   0
9      0  1  1  0  0  0  0  0  0  0 ...   0   0   0   0   0   0   0   0   0   0
10     0  0  0  0  0  0  0  0  0  0 ...   0   0   0   0   0   0   0   0   0   0

[11 rows x 24 columns]
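
If only a few bits are needed, another option is to parse each string as an integer and pull out bits with a shift and mask, rather than building all 24 columns. This is a sketch under assumptions, not part of the answer above: `bit_from_left` is a hypothetical helper, the two sample strings are copied from rows 1 and 2 of the frame, and bit 0 is taken to be the leftmost character, matching the `b0` naming.

```python
import numpy as np
import pandas as pd

values = pd.Series([
    "0-00-000-0000-0001-1010-0110-00",
    "1-00-000-0000-0000-0110-1110-00",
])

N_BITS = 24  # number of 0/1 characters per Value in the question's data

# Parse each dash-free string as a base-2 integer.
words = (
    values.str.replace("-", "", regex=False)
    .map(lambda s: int(s, 2))
    .astype(np.int64)
    .to_numpy()
)

def bit_from_left(words, i, n_bits=N_BITS):
    """Return bit i of each word, where i = 0 is the leftmost character."""
    return (words >> (n_bits - 1 - i)) & 1

b19 = bit_from_left(words, 19)
b0 = bit_from_left(words, 0)
```

The integer form stores one 64-bit value per row instead of 24 small columns, at the cost of recomputing a bit each time it is needed.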