How do I convert a Series object into a single DataFrame?

Time: 2018-04-04 05:54:21

Tags: python pandas

I have a Series object named 'master'.

Sample data:

   print (master.head(2).to_dict())
{4: 0, 5: array([['ANDAMAN & NICOBAR ISLANDS', 4, 0, 0.0, 0,
        datetime.date(2016, 11, 1)],
       ['ANDHRA PRADESH', 13161, 107, 0.008130081300813009, 0,
        datetime.date(2016, 11, 1)],
       ['ARUNACHAL PRADESH', 8, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['ASSAM', 1317, 6, 0.004555808656036446, 0,
        datetime.date(2016, 11, 1)],
       ['Army Postal Service', 10, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['BIHAR', 440, 4, 0.00909090909090909, 0, datetime.date(2016, 11, 1)],
       ['CHANDIGARH', 416, 9, 0.021634615384615384, 1,
        datetime.date(2016, 11, 1)],
       ['CHATTISGARH', 629, 5, 0.00794912559618442, 0,
        datetime.date(2016, 11, 1)],
       ['DAMAN & DIU', 19, 1, 0.05263157894736842, 1,
        datetime.date(2016, 11, 1)],
       ['DELHI', 6777, 60, 0.008853474988933156, 0,
        datetime.date(2016, 11, 1)],
       ['Delhi', 1, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['GOA', 1546, 17, 0.010996119016817595, 1,
        datetime.date(2016, 11, 1)],
       ['GUJARAT', 8428, 102, 0.01210251542477456, 1,
        datetime.date(2016, 11, 1)],
       ['Gujarat', 190, 1, 0.005263157894736842, 0,
        datetime.date(2016, 11, 1)],
       ['HARYANA', 3741, 42, 0.011226944667201283, 1,
        datetime.date(2016, 11, 1)],
       ['HIMACHAL PRADESH', 801, 7, 0.008739076154806492, 0,
        datetime.date(2016, 11, 1)],
       ['JAMMU & KASHMIR', 852, 11, 0.012910798122065728, 1,
        datetime.date(2016, 11, 1)],
       ['JHARKHAND', 457, 2, 0.00437636761487965, 0,
        datetime.date(2016, 11, 1)],
       ['KARNATAKA', 22947, 210, 0.009151523074911754, 0,
        datetime.date(2016, 11, 1)],
       ['KERALA', 5868, 77, 0.013122017723244717, 1,
        datetime.date(2016, 11, 1)],
       ['Karnataka', 49, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['LAKSHADWEEP', 13, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['MADHYA PRADESH', 3031, 29, 0.009567799406136588, 0,
        datetime.date(2016, 11, 1)],
       ['MAHARASHTRA', 15027, 153, 0.010181672988620483, 1,
        datetime.date(2016, 11, 1)],
       ['MANIPUR', 16, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['MEGHALAYA', 12, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['MIZORAM', 4, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['NAGALAND', 13, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['ODISHA', 2674, 39, 0.014584891548242334, 1,
        datetime.date(2016, 11, 1)],
       ['PONDICHERRY', 245, 6, 0.024489795918367346, 1,
        datetime.date(2016, 11, 1)],
       ['PUNJAB', 3690, 37, 0.01002710027100271, 1,
        datetime.date(2016, 11, 1)],
       ['RAJASTHAN', 4544, 48, 0.01056338028169014, 1,
        datetime.date(2016, 11, 1)],
       ['SIKKIM', 17, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['TAMIL NADU', 17912, 199, 0.011109870477891916, 1,
        datetime.date(2016, 11, 1)],
       ['TELANGANA', 9925, 90, 0.00906801007556675, 0,
        datetime.date(2016, 11, 1)],
       ['TRIPURA', 3, 0, 0.0, 0, datetime.date(2016, 11, 1)],
       ['UTTAR PRADESH', 7699, 78, 0.010131185868294583, 1,
        datetime.date(2016, 11, 1)],
       ['UTTARAKHAND', 2429, 18, 0.007410456978180321, 0,
        datetime.date(2016, 11, 1)],
       ['Uttar Pradesh', 174, 4, 0.022988505747126436, 1,
        datetime.date(2016, 11, 1)],
       ['WEST BENGAL', 3639, 17, 0.004671613080516625, 0,
        datetime.date(2016, 11, 1)]], dtype=object)}

I tried converting it to a list and then building a single DataFrame from it with the code below, but it did not work:

df1=pd.DataFrame()
master1=list(master)

After converting it to a list, I loop over the items:

for i in master1:
    i=pd.DataFrame(i)
    df1=df1.append(i)

Running the code above raises this error:

ValueError: DataFrame constructor not properly called!

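I know the error occurs because one of the values in the list is 0. I don't need that 0 and have tried to remove it, but I could not.

Please help. I am using Spyder 3.2.3 (Python 3.6).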

1 answer:

Answer 0: (score: 1)

You can use a list comprehension that keeps only the NumPy arrays, then convert the columns to the correct types if necessary:

import numpy as np
import pandas as pd
# keep only the NumPy arrays; scalar entries such as 0 are skipped
L = [pd.DataFrame(x) for x in master if isinstance(x, np.ndarray)]
df = pd.concat(L, ignore_index=True)
df.columns = ['A','B','C','D','E','DATE']

df[['B','C','E']] = df[['B','C','E']].astype(int)
df['D'] = df['D'].astype(float)
df['DATE'] = pd.to_datetime(df['DATE'])

print (df.head())
                           A      B    C         D  E       DATE
0  ANDAMAN & NICOBAR ISLANDS      4    0  0.000000  0 2016-11-01
1             ANDHRA PRADESH  13161  107  0.008130  0 2016-11-01
2          ARUNACHAL PRADESH      8    0  0.000000  0 2016-11-01
3                      ASSAM   1317    6  0.004556  0 2016-11-01
4        Army Postal Service     10    0  0.000000  0 2016-11-01

print (df.dtypes)
A               object
B                int32
C                int32
D              float64
E                int32
DATE    datetime64[ns]
dtype: object
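
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The small `master` Series below is a hypothetical stand-in built by hand (one scalar 0 and two 2-D object arrays with shortened values), not the asker's real data. It first shows that passing a scalar to `pd.DataFrame` raises a `ValueError`, then applies the filter-and-concat approach. It uses `pd.concat` rather than `DataFrame.append`, since `append` was removed in pandas 2.0.

```python
import datetime

import numpy as np
import pandas as pd

day = datetime.date(2016, 11, 1)

# Hypothetical miniature of `master`: an object array holding a scalar 0
# and two 2-D arrays of rows (values shortened for illustration).
values = np.empty(3, dtype=object)
values[0] = 0
values[1] = np.array([["ASSAM", 1317, 6, 0.0046, 0, day],
                      ["BIHAR", 440, 4, 0.0091, 0, day]], dtype=object)
values[2] = np.array([["GOA", 1546, 17, 0.0110, 1, day]], dtype=object)
master = pd.Series(values, index=[4, 5, 6])

# A bare scalar is not valid input for the DataFrame constructor.
try:
    pd.DataFrame(0)
    scalar_fails = False
except ValueError:
    scalar_fails = True

# Keep only the array entries, build one frame per array, then stack them.
frames = [pd.DataFrame(x) for x in master if isinstance(x, np.ndarray)]
df = pd.concat(frames, ignore_index=True)
df.columns = ["A", "B", "C", "D", "E", "DATE"]

# Columns built from an object array start out as object dtype; cast them.
df = df.astype({"B": int, "C": int, "D": float, "E": int})
df["DATE"] = pd.to_datetime(df["DATE"])

print(df)
print(df.dtypes)
```

Building the object array element by element is a deliberate choice here: it avoids relying on how NumPy or pandas would interpret a nested list that mixes a scalar with arrays of different shapes.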