拆分熊猫列python

时间:2017-05-29 09:38:42

标签: python pandas split

我正在使用python 3.4并且我有一个包含以下内容的pandas dataframe列:

0    [0.3785766661167145, -0.449486643075943, -0.15...]
1    [0.204025000333786, -0.3685399889945984, 0.231...]
2    [0.684576690196991, -0.5823000073432922, 0.269...]
3    [-0.02300500124692917, -0.22056499123573303, 0...]
Name: comments, dtype: object

我希望将其拆分并将其转换为多行列:

   column1               column2              ...columnx
0  0.3785766661167145    -0.449486643075943     last element in the first list
1  0.204025000333786     -0.3685399889945984    last element in the 2nd list
2  0.684576690196991     -0.5823000073432922    last element in the 3rd list
3  -0.02300500124692917  -0.22056499123573303   last element in the 4th list
你能帮帮我吗? 提前致谢

4 个答案:

答案 0 :(得分:1)

如果数据为lists需要DataFrame构造函数,values + tolist https://developers.google.com/games/services/android/realtimeMultiplayer#invite_players_optioncomments列转换为numpy array

print (type(df.loc[0, 'comments']))
<class 'list'>

df1 = pd.DataFrame(df['comments'].values.tolist())
#rename columns if necessary
df1 = df1.rename(columns = lambda x: 'column' + str(x + 1))
print (df1)
    column1   column2  column3
0  0.378577 -0.449487   -0.150
1  0.204025 -0.368540    0.231
2  0.684577 -0.582300    0.269
3 -0.023005 -0.220565    0.000

答案 1 :(得分:0)

测试用例:

import pandas as pd
df = pd.DataFrame({
               'var1':['20, -20, -50','30, 20, -50','40','30'],
               'var2':['10','50','60','70']
              })
print(df)

    var1           var2
0   20, -20, -50    10
1   30, 20, -50     50
2   40              60
3   30              70

pd.concat([df[['var2']], df['var1'].str.split(',', expand=True)], axis=1)

enter image description here

答案 2 :(得分:0)

拥有DataFrame

df = pd.Series(
    {'comments': [list(np.random.randn(3).round()) for i in range(4)]
    }
)

其中df=

comments
0    [1.0, -2.0, 0.0]
1   [1.0, -3.0, -0.0]
2  [-0.0, -0.0, -1.0]
3  [-2.0, -2.0, -2.0]

调用

df2 = DataFrame(list(df['comments']))

你获得

     0    1    2
0  1.0 -2.0  0.0
1  1.0 -3.0 -0.0
2 -0.0 -0.0 -1.0
3 -2.0 -2.0 -2.0

答案 3 :(得分:0)

使用@ dDanny的示例Dataframe,

df = pd.DataFrame(
    {'comments': [list(np.random.randn(3).round()) for i in range(4)]
    })

您可以使用apply将包含列表的列转换为Dataframe。

 df.comments.apply(pd.Series)
Out[127]: 
     0    1    2
0 -2.0 -3.0 -1.0
1  1.0  0.0  1.0
2 -1.0 -1.0 -0.0
3  1.0  1.0  0.0