Convert a pandas Series / DataFrame to a numpy matrix, unpacking coordinates from the index

Time: 2015-06-16 22:40:09

Tags: numpy matrix pandas type-conversion

I have a pandas Series:

A   1
B   2
C   3
AB  4
AC  5
BA  4
BC  8
CA  5
CB  8

I am looking for simple code to convert it to this matrix:

1 4 5
4 2 8
5 8 3

Ideally something fairly dynamic and built-in, rather than a lot of loops written just for this 3x3 case.

2 answers:

Answer 0: (score: 2)

You can do it like this.

import pandas as pd

# your raw data
raw_index = 'A B C AB AC BA BC CA CB'.split()
values = [1, 2, 3, 4, 5, 4, 8, 5, 8]

# reformat index
index = [(a[0], a[-1]) for a in raw_index]
multi_index = pd.MultiIndex.from_tuples(index)

df = pd.DataFrame(values, columns=['values'], index=multi_index)
df.unstack()
Out[47]: 
  values      
       A  B  C
A      1  4  5
B      4  2  8
C      5  8  3
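
Since the question starts from a Series and asks for a numpy matrix, the same idea can be applied to the Series directly. The following is only a sketch: the variable names `s` and `matrix` are made up for illustration, and `to_numpy()` assumes pandas 0.24 or newer (on older versions `.values` plays the same role).

```python
import pandas as pd

# The Series from the question: single letters are diagonal entries,
# two-letter labels are (row, column) pairs.
s = pd.Series([1, 2, 3, 4, 5, 4, 8, 5, 8],
              index='A B C AB AC BA BC CA CB'.split())

# Split each label into a (row, column) pair; 'A' becomes ('A', 'A').
s.index = pd.MultiIndex.from_tuples([(k[0], k[-1]) for k in s.index])

# unstack() pivots the inner index level into columns;
# to_numpy() then drops the labels and returns a plain 2-D array.
matrix = s.unstack().to_numpy()
print(matrix)
```

Because `unstack()` sorts the labels, rows and columns come out in the order A, B, C regardless of the order in the original Series.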

Answer 1: (score: 0)

For a pd.DataFrame, use the .values member or the .to_records(...) method.

For a pd.Series, use the .unstack() method, as Jianxun Li said.

import numpy as np
import pandas as pd

d = pd.DataFrame(data = {
    'var':['A','B','C','AB','AC','BA','BC','CA','CB'],
    'val':[1,2,3,4,5,4,8,5,8] })

# Here are some options for converting to np.matrix ...
np.matrix( d.to_records(index=False) )
# matrix([[(1, 'A'), (2, 'B'), (3, 'C'), (4, 'AB'), (5, 'AC'), (4, 'BA'),
#         (8, 'BC'), (5, 'CA'), (8, 'CB')]], 
#       dtype=[('val', '<i8'), ('var', 'O')])

# Here you can add code to rearrange it, e.g.
[(val, idx[0], idx[-1]) for val,idx in d.to_records(index=False) ]
# [(1, 'A', 'A'), (2, 'B', 'B'), (3, 'C', 'C'), (4, 'A', 'B'), (5, 'A', 'C'), (4, 'B', 'A'), (8, 'B', 'C'), (5, 'C', 'A'), (8, 'C', 'B')]

# and if you need numeric row- and col-indices:
[ (val, 'ABCDEF...'.index(idx[0]), 'ABCDEF...'.index(idx[-1]) ) for val,idx in d.to_records(index=False) ]
# [(1, 0, 0), (2, 1, 1), (3, 2, 2), (4, 0, 1), (5, 0, 2), (4, 1, 0), (8, 1, 2), (5, 2, 0), (8, 2, 1)]

# you can sort by them (row index first, then column index):
sorted([ (val, 'ABCDEF...'.index(idx[0]), 'ABCDEF...'.index(idx[-1]) ) for val,idx in d.to_records(index=False) ], key=lambda x: x[1:3] )
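
The (value, row, column) triples above stop one step short of the matrix the question asks for. One possible way to finish, sketched here with numpy index arrays, is shown below. The names `labels`, `position` and `result` are hypothetical, and the sketch assumes alphabetical axis order and that any missing pair should be left as 0.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({
    'var': ['A', 'B', 'C', 'AB', 'AC', 'BA', 'BC', 'CA', 'CB'],
    'val': [1, 2, 3, 4, 5, 4, 8, 5, 8],
})

# Row label is the first character of 'var', column label is the last one.
row_labels = d['var'].str[0]
col_labels = d['var'].str[-1]

# Sorted unique labels define the axis order (assumption: alphabetical).
labels = sorted(set(row_labels) | set(col_labels))
position = {label: i for i, label in enumerate(labels)}

rows = row_labels.map(position).to_numpy()
cols = col_labels.map(position).to_numpy()

# Fill the matrix in one vectorized assignment; pairs that never occur stay 0.
result = np.zeros((len(labels), len(labels)), dtype=d['val'].dtype)
result[rows, cols] = d['val'].to_numpy()
print(result)
```

Deriving the label order from the data avoids hard-coding an alphabet string, so the same code would work for other single-character labels.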