How to convert a 3D array to a DataFrame in Python

Time: 2016-02-20 15:08:52

Tags: python dataframe

I have a 3D array as follows:

import numpy as np
ThreeD_Arrays = np.random.randint(0, 1000, (5, 4, 3))

array([[[715, 226, 632],
        [305,  97, 534],
        [ 88, 592, 902],
        [172, 932, 263]],

       [[895, 837, 431],
        [649, 717,  39],
        [363, 121, 274],
        [334, 359, 816]],

       [[520, 692, 230],
        [452, 816, 887],
        [688, 509, 770],
        [290, 856, 584]],

       [[286, 358, 462],
        [831,  26, 332],
        [424, 178, 642],
        [955,  42, 938]], 

       [[ 44, 119, 757],
        [908, 937, 728],
        [809,  28, 442],
        [832, 220, 348]]])

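Now I want to put it into a DataFrame like this:

(the expected result was shown in an image)

That is, with a Date column added as indicated, and with the columns named A, B and C.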

How can I do this conversion? Thanks!

3 answers:

Answer 0 (score: 3)

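You can convert the 3D array to a Pandas Panel and then flatten it to a 2D DataFrame using .to_frame(). Note that pd.Panel was deprecated in pandas 0.20 and removed in 0.25, so this version only runs on older pandas releases: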

import numpy as np
import pandas as pd
np.random.seed(2016)

arr = np.random.randint(0, 1000, (5, 4, 3))
pan = pd.Panel(arr)
df = pan.swapaxes(0, 2).to_frame()
df.index = df.index.droplevel('minor')
df.index.name = 'Date'
df.index = df.index+1
df.columns = list('ABC')

which yields

        A    B    C
Date               
1     875  702  266
1     940  180  971
1     254  649  353
1     824  677  745
...
4     675  488  939
4     382  238  225
4     923  926  633
4     664  639  616
4     770  274  378

Alternatively, you can reshape the array to shape (20, 3), build the DataFrame as usual, and then fix up the index:

import numpy as np
import pandas as pd
np.random.seed(2016)

arr = np.random.randint(0, 1000, (5, 4, 3))
df = pd.DataFrame(arr.reshape(-1, 3), columns=list('ABC'))
df.index = np.repeat(np.arange(arr.shape[0]), arr.shape[1]) + 1
df.index.name = 'Date'
print(df)

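This produces a DataFrame with the same columns and a Date index. Note, however, that the rows are grouped differently: here Date follows the first array axis (1 to 5, four rows each), whereas the Panel output above follows the second axis (1 to 4, five rows each).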
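As a minimal sketch, the reshape approach can also group rows by the second array axis (so that Date runs from 1 to 4 with five rows each, as in the Panel output shown earlier) by transposing the first two axes before reshaping. This does not need pd.Panel. The names `swapped` and `df_by_axis1` are illustrative only.

```python
import numpy as np
import pandas as pd

np.random.seed(2016)
arr = np.random.randint(0, 1000, (5, 4, 3))

# Move axis 1 to the front so rows are grouped by the second axis,
# then flatten the first two axes into 20 rows of 3 columns.
swapped = arr.transpose(1, 0, 2)  # shape (4, 5, 3)
df_by_axis1 = pd.DataFrame(swapped.reshape(-1, 3), columns=list('ABC'))

# Label each block of 5 rows with a 1-based Date.
df_by_axis1.index = np.repeat(np.arange(swapped.shape[0]), swapped.shape[1]) + 1
df_by_axis1.index.name = 'Date'
print(df_by_axis1)
```

Which grouping is wanted depends on which array axis represents the date, so it is worth checking a few rows against the original array before relying on either layout.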

Answer 1 (score: 2)

Following the answers to this question, we can use a MultiIndex. First, create the MultiIndex and a flattened DataFrame.

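import numpy as np
import pandas as pd

A = np.random.randint(0, 1000, (5, 4, 3))

names = ['x', 'y', 'z']
index = pd.MultiIndex.from_product([range(s) for s in A.shape], names=names)
# Selecting column 'A' leaves a Series indexed by (x, y, z)
df = pd.DataFrame({'A': A.flatten()}, index=index)['A']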

Now we can reshape it however we like:

df = df.unstack(level='z').swaplevel().sort_index()
df.columns = ['A', 'B', 'C']
df.index.names = ['DATE', 'i']

The result looks like this:

          A    B    C
DATE i           
0    0  715  226  632
     1  895  837  431
     2  520  692  230
     3  286  358  462
     4   44  119  757
1    0  305   97  534
     1  649  717   39
     2  452  816  887
     3  831   26  332
     4  908  937  728
2    0   88  592  902
     1  363  121  274
     2  688  509  770
     3  424  178  642
     4  809   28  442
3    0  172  932  263
     1  334  359  816
     2  290  856  584
     3  955   42  938
     4  832  220  348
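To make the mapping between this table and the array explicit, here is a small self-contained sketch. It uses a deterministic array built with np.arange instead of random data (an assumption made for reproducibility) and unstacks the last axis, z, into the three columns. The names `s`, `wide` and `row` are illustrative.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random array, same shape (5, 4, 3).
A = np.arange(5 * 4 * 3).reshape(5, 4, 3)

names = ['x', 'y', 'z']
index = pd.MultiIndex.from_product([range(n) for n in A.shape], names=names)
s = pd.Series(A.flatten(), index=index)

# Move the last axis into columns, then order the rows by (y, x).
wide = s.unstack(level='z').swaplevel().sort_index()
wide.columns = ['A', 'B', 'C']
wide.index.names = ['DATE', 'i']

# Row (DATE=2, i=3) should hold the slice A[3, 2, :].
row = wide.loc[(2, 3)]
print(row.tolist())
print(A[3, 2].tolist())

# All rows belonging to a single DATE value.
print(wide.xs(1, level='DATE'))
```

In other words, DATE corresponds to the second array axis and i to the first, which is why each DATE block has five rows.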

Answer 2 (score: 0)

import numpy as np
import pandas as pd

ThreeD_Arrays = np.random.randint(0, 1000, (5, 4, 3))
df = pd.DataFrame([list(l) for l in ThreeD_Arrays]).stack().apply(pd.Series).reset_index(1, drop=True)
df.index.name = 'Date'
df.columns = list('ABC')
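The one-liner above packs several steps together. As a hedged alternative that avoids apply(pd.Series), one could build one small DataFrame per 2D slice and concatenate them. This is a different technique from the answer's, and `frames` is an illustrative name.

```python
import numpy as np
import pandas as pd

ThreeD_Arrays = np.random.randint(0, 1000, (5, 4, 3))

# One DataFrame per 2D slice along the first axis.
frames = [pd.DataFrame(block, columns=list('ABC')) for block in ThreeD_Arrays]

# keys= adds an outer index level that identifies the slice (1-based here).
df = pd.concat(frames, keys=list(range(1, len(frames) + 1)))

# Drop the inner per-slice row counter and name the remaining level.
df = df.reset_index(level=1, drop=True)
df.index.name = 'Date'
print(df)
```

Here Date follows the first array axis, so each of the five Date values labels four rows.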