Pandas MultiIndex from array => TypeError: unhashable type: 'dict'

Asked: 2015-10-16 11:45:07

Tags: python pandas dataframe multi-index

I'm trying to create a DataFrame from an array (nested lists of dicts) using this code:

def create_from_arr():
    baby_array=pd.MultiIndex.from_tuples(df, names=['sessions', 'behaves'])
    return baby_array

I get the following error, which I can't understand:

TypeError: unhashable type: 'dict'

The output I want looks like this:

list 
                   date_time      name value
 1    0 2015-05-22 05:37:59       Tom   129
      1 2015-05-22 05:37:59      Kate     0
      2 2015-05-22 05:37:59  GroupeId     0
 2    3 2015-05-26 05:56:59     Hence   129
      4 2015-05-26 05:56:59      Kate     0
      5 2015-05-26 05:56:59     Julie     0
 3    ......................    ......  ......

2 Answers:

Answer 0 (score: 2)

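I'm still not sure what you want to do with the MultiIndex, but here is one way to "flatten" the dicts in your nested array and load the data into a DataFrame properly:

Updated to use "list" and "index" as the MultiIndex levels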

In [99]: import pandas as pd; from pandas import Timestamp

In [100]: data = [[{'date_time': Timestamp('2015-05-22 05:37:59'),
   .....:         'name': 'Tom',
   .....:         'value': '129'},
   .....:        {'date_time': Timestamp('2015-05-22 05:37:59'),
   .....:         'name': 'Kate',
   .....:         'value': '0'},
   .....:        {'date_time': Timestamp('2015-05-22 05:37:59'),
   .....:         'name': 'GroupeId',
   .....:         'value': '0'}], [{'date_time': Timestamp('2015-05-22 05:37:59'),
   .....:         'name': 'Tom',
   .....:         'value': '129'},
   .....:        {'date_time': Timestamp('2015-05-22 05:37:59'),
   .....:         'name': 'Kate',
   .....:         'value': '0'},
   .....:        {'date_time': Timestamp('2015-05-22 05:37:59'),
   .....:         'name': 'GroupeId',
   .....:         'value': '0'}]]

In [101]: df = pd.DataFrame(columns=['list', 'date_time', 'name', 'value'])

In [102]: for idx, each in enumerate(data, 1):
   .....:     temp = pd.DataFrame(each)
   .....:     temp['list'] = idx  # manually assign "list" index
   .....:     df = df.append(temp, ignore_index=True)
   .....:     
In [103]: df = df.reset_index()

In [104]: df.set_index(['list', 'index'])
Out[104]: 
                     date_time      name value
list index                                    
1    0     2015-05-22 05:37:59       Tom   129
     1     2015-05-22 05:37:59      Kate     0
     2     2015-05-22 05:37:59  GroupeId     0
2    3     2015-05-22 05:37:59       Tom   129
     4     2015-05-22 05:37:59      Kate     0
     5     2015-05-22 05:37:59  GroupeId     0
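
A note for anyone running this on a recent pandas: DataFrame.append was deprecated in 1.4 and removed in 2.0, so the loop above fails there. Below is a minimal sketch of the same idea using pd.concat instead. The sample data is hypothetical (shortened from the question's desired output), and the name result is just for illustration.

```python
import pandas as pd

# Hypothetical sample mirroring the nested list-of-dicts structure.
data = [
    [
        {"date_time": pd.Timestamp("2015-05-22 05:37:59"), "name": "Tom", "value": "129"},
        {"date_time": pd.Timestamp("2015-05-22 05:37:59"), "name": "Kate", "value": "0"},
    ],
    [
        {"date_time": pd.Timestamp("2015-05-26 05:56:59"), "name": "Hence", "value": "129"},
        {"date_time": pd.Timestamp("2015-05-26 05:56:59"), "name": "Kate", "value": "0"},
    ],
]

# Build one frame per inner list and tag it with its 1-based "list" number.
frames = []
for idx, rows in enumerate(data, 1):
    temp = pd.DataFrame(rows)
    temp["list"] = idx
    frames.append(temp)

# Concatenate once with a fresh running row number (0, 1, 2, ...).
df = pd.concat(frames, ignore_index=True)

# reset_index() turns the running row number into an "index" column,
# which then becomes the second level of the MultiIndex.
result = df.reset_index().set_index(["list", "index"])
print(result)
```

Collecting the frames in a list and concatenating once also avoids copying the growing DataFrame on every iteration.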

Answer 1 (score: 0)

IIUC, let d be an excerpt of your array:

d = [[{'date_time': '2015-05-22 05:37:59',
   'name': 'Tom',
   'value': '129'},
  {'date_time': '2015-05-22 05:37:59',
   'name': 'Kate',
   'value': '0'}]]

I would extract the DataFrame with the following:

df = pd.DataFrame.from_dict(d[0])

Which returns:

             date_time  name value
0  2015-05-22 05:37:59   Tom   129
1  2015-05-22 05:37:59  Kate     0
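
As for the error itself: MultiIndex.from_tuples expects a list of tuples made of hashable labels, so handing it the dicts themselves is most likely what raises "unhashable type: 'dict'". A rough sketch of one way around it, assuming the first level should be the position of the inner list and the second a running row number (the second inner list in d is hypothetical, added only to show two groups; the level names come from the question):

```python
import pandas as pd

# Excerpt of the array, extended with a hypothetical second inner list.
d = [
    [
        {"date_time": "2015-05-22 05:37:59", "name": "Tom", "value": "129"},
        {"date_time": "2015-05-22 05:37:59", "name": "Kate", "value": "0"},
    ],
    [
        {"date_time": "2015-05-26 05:56:59", "name": "Hence", "value": "129"},
    ],
]

# Flatten the nested lists into one list of row dicts.
rows = [row for session in d for row in session]

# Build one hashable (session number, running row number) tuple per row.
tuples = []
for session_no, session in enumerate(d, 1):
    for _ in session:
        tuples.append((session_no, len(tuples)))

# The tuples are the index labels; the dicts are only the row data.
index = pd.MultiIndex.from_tuples(tuples, names=["sessions", "behaves"])
df = pd.DataFrame(rows, index=index)
print(df)
```

The point is that the dicts go into the DataFrame body, while only plain integers end up in the index.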

Hope that helps.