numpy array to pandas DataFrame conversion - ValueError

Time: 2018-02-22 18:30:43

Tags: python arrays pandas numpy dataframe

I have the following numpy array, named 'data':

array([['ksr-usconeng101', 'C', '632.3', '1'],
       ['ksr-usconeng101', 'D', '242.9', '2'],
       ['ksr-usconeng158', 'C', '1044.5', '3'],
       ['ksr-usconeng158', 'D', '2771.2', '4'],
       ['ksr-usconeng158', 'G', '7.3', '5'],
       ['ksr-usconeng163', 'C', '1597.0', '6'],
       ['ksr-usconeng163', 'D', '1676.3', '7'],
       ['server', 'drive', 'size', '']],
      dtype='<U15')

I am trying to convert it to a DataFrame:

pd.DataFrame(data=data[0:-1,0:3],
                   index = data[0:-1,-1],
                   columns = data[-1:, 0:-1])

Data:

data[0:-1,0:3]
Out[145]: 
array([['ksr-usconeng101', 'C', '632.3'],
       ['ksr-usconeng101', 'D', '242.9'],
       ['ksr-usconeng158', 'C', '1044.5'],
       ['ksr-usconeng158', 'D', '2771.2'],
       ['ksr-usconeng158', 'G', '7.3'],
       ['ksr-usconeng163', 'C', '1597.0'],
       ['ksr-usconeng163', 'D', '1676.3']],
      dtype='<U15')

Index:

data[0:-1,-1]
Out[146]: 
array(['1', '2', '3', '4', '5', '6', '7'],
      dtype='<U15')

Columns:

data[-1:, 0:-1]
Out[147]: 
array([['server', 'drive', 'size']],
      dtype='<U15')

However, Python disagrees and responds with:

ValueError: Shape of passed values is (3, 7), indices imply (1, 7)

Please suggest what I am missing.

3 answers:

Answer 0 (score: 1)

The columns argument must be one-dimensional:

df = pd.DataFrame(data=data[:-1,:3],
                  index=data[:-1,-1],
                  columns=data[-1, :-1])
print(df)

Output:

         server drive    size
1  ksr-usconeng101     C   632.3
2  ksr-usconeng101     D   242.9
3  ksr-usconeng158     C  1044.5
4  ksr-usconeng158     D  2771.2
5  ksr-usconeng158     G     7.3
6  ksr-usconeng163     C  1597.0
7  ksr-usconeng163     D  1676.3

You have:

>>> data[-1:, 0:-1].shape
(1, 3)

But you need:

>>> data[-1, :-1].shape
(3,)
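
The shape difference is easy to check on a smaller array. The following is a minimal sketch, assuming a shortened copy of the array from the question (two data rows plus the header row); the names cols_2d and cols_1d are made up for illustration.

```python
import numpy as np
import pandas as pd

# Shortened copy of the question's array: data rows first, header row last.
data = np.array([['ksr-usconeng101', 'C', '632.3', '1'],
                 ['ksr-usconeng101', 'D', '242.9', '2'],
                 ['server', 'drive', 'size', '']])

# Slicing the row axis with -1: keeps that axis, so the result is 2D.
cols_2d = data[-1:, :-1]
# Indexing the row axis with the integer -1 drops it, so the result is 1D.
cols_1d = data[-1, :-1]

print(cols_2d.shape)  # (1, 3)
print(cols_1d.shape)  # (3,)

# With 1D column labels, the three labels line up with the three data columns.
df = pd.DataFrame(data=data[:-1, :3],
                  index=data[:-1, -1],
                  columns=cols_1d)
print(df)
```

With the 2D slice, pandas sees a single column label for three columns of values, which is consistent with the mismatch reported in the ValueError.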

Answer 1 (score: 0)

Try this:

pd.DataFrame(data=data[0:-1,0:3],
                   index = data[0:-1,-1],
                   columns = data[-1:, 0:-1].tolist())
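
Note that .tolist() on the 2D slice returns a nested list, [['server', 'drive', 'size']]. As far as I can tell, pandas reads a nested list as the levels of a MultiIndex rather than as plain column labels, so the call runs but the columns may not be what you expect. Below is a sketch of a variant that yields flat labels, assuming a shortened copy of the question's array; the names nested and flat are only for illustration.

```python
import numpy as np
import pandas as pd

# Shortened copy of the question's array: data rows first, header row last.
data = np.array([['ksr-usconeng101', 'C', '632.3', '1'],
                 ['ksr-usconeng101', 'D', '242.9', '2'],
                 ['server', 'drive', 'size', '']])

# The 2D slice converts to a list containing one inner list.
nested = data[-1:, :-1].tolist()
# Indexing the last row with an integer gives a flat list of labels.
flat = data[-1, :-1].tolist()

print(nested)  # [['server', 'drive', 'size']]
print(flat)    # ['server', 'drive', 'size']

df = pd.DataFrame(data=data[:-1, :3],
                  index=data[:-1, -1],
                  columns=flat)
print(df.columns.tolist())  # ['server', 'drive', 'size']
```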

Answer 2 (score: 0)

import numpy as np
import pandas as pd

df = pd.DataFrame(data[0:7, 0:3].flatten().reshape(7,3),
       columns = ["a", "b", "c"])

            a           b     c
0   ksr-usconeng101     C   632.3
1   ksr-usconeng101     D   242.9
2   ksr-usconeng158     C   1044.5
3   ksr-usconeng158     D   2771.2
4   ksr-usconeng158     G   7.3
5   ksr-usconeng163     C   1597.0
6   ksr-usconeng163     D   1676.3