How do I format an array into a specific DataFrame layout in pandas?

Time: 2018-04-17 15:42:26

Tags: python pandas

I have a dictionary of arrays (stored in a variable named array) that looks like this:

{'loc.1': array([  1,2,3,4,7,5,6]),'loc.2': array([  3,4,3,7,7,8,6]),'loc.3': array([  1,4,3,1,7,8,6]).....} 

After df = pd.DataFrame(array), it looks like this:

loc.1    loc.2  loc.3
1        3      1
2        4      4
3        3      3
4        7      1
7        7      7
5        8      8
6        6      6

This is what I want:

Col1.    Col.2 
loc.1    1,2,3,4,7,5,6
loc.2    3,4,3,7,7,8,6
loc.3    1,4,3,1,7,8,6 

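I need this specific format because I want to join the result with another table afterwards. pandas would be my preferred solution.

Thanks, and apologies if this is a silly question.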

4 answers:

Answer 0: (score: 3)

First, join the values of each array into a string inside a dictionary comprehension.

Then pass the result to Series:

a = pd.Series({k:','.join(v.astype(str)) for k, v in array.items()})
print (a)
loc.1    1,2,3,4,7,5,6
loc.2    3,4,3,7,7,8,6
loc.3    1,4,3,1,7,8,6
dtype: object

For a DataFrame:

d = {k:','.join(v.astype(str)) for k, v in array.items()}
a = pd.DataFrame({'a': list(d.keys()), 'b': list(d.values())})

An alternative solution is to create a list of tuples:

L = [(k, ','.join(v.astype(str))) for k, v in array.items()]
a = pd.DataFrame(L, columns=['a','b'])
print (a)
       a              b
0  loc.1  1,2,3,4,7,5,6
1  loc.2  3,4,3,7,7,8,6
2  loc.3  1,4,3,1,7,8,6

If you need the arrays themselves in the column, remove the join and the conversion to strings:

L = [(k, v) for k, v in array.items()]
a = pd.DataFrame(L, columns=['a','b'])
print (a)
       a                      b
0  loc.1  [1, 2, 3, 4, 7, 5, 6]
1  loc.2  [3, 4, 3, 7, 7, 8, 6]
2  loc.3  [1, 4, 3, 1, 7, 8, 6]
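
Because the question mentions joining the result with another table, here is a minimal sketch of that follow-up step. The second table `other` and its `region` column are hypothetical and exist only for illustration; the column names `a` and `b` follow the code above.

```python
import numpy as np
import pandas as pd

# Input dictionary from the question (the elided entries are left out).
array = {'loc.1': np.array([1, 2, 3, 4, 7, 5, 6]),
         'loc.2': np.array([3, 4, 3, 7, 7, 8, 6]),
         'loc.3': np.array([1, 4, 3, 1, 7, 8, 6])}

# One row per key, with the values joined into a comma-separated string.
L = [(k, ','.join(v.astype(str))) for k, v in array.items()]
a = pd.DataFrame(L, columns=['a', 'b'])

# Hypothetical second table, keyed on the same location labels.
other = pd.DataFrame({'a': ['loc.1', 'loc.2', 'loc.3'],
                      'region': ['north', 'south', 'east']})

# Left join on the shared key column 'a'.
joined = a.merge(other, on='a', how='left')
print(joined)
```

Keeping the keys in a regular column rather than in the index makes `merge(on=...)` straightforward; with the Series variant, the index would have to be reset first.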

Answer 1: (score: 1)

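import pandas as pd
a = {'loc.1': [1,2,3,4,7,5,6],'loc.2': [3,4,3,7,7,8,6],'loc.3': [1,4,3,1,7,8,6]}
# Transpose so that each key becomes a row.
df = pd.DataFrame(a).transpose()
# Collect the seven value columns of each row into a single list.
df['lists'] = df[[0,1,2,3,4,5,6]].values.tolist()
# Keep only the new column (a Series indexed by the keys).
df = df['lists']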

Output:

loc.1    [1, 2, 3, 4, 7, 5, 6]
loc.2    [3, 4, 3, 7, 7, 8, 6]
loc.3    [1, 4, 3, 1, 7, 8, 6]
Name: lists, dtype: object
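
The column labels 0 through 6 are hard-coded in the code above, so it only works when every array has exactly seven elements. As a sketch, assuming only that all arrays share the same length, the same Series can be built without listing the labels (the variable name `lists` is chosen here for illustration):

```python
import pandas as pd

a = {'loc.1': [1, 2, 3, 4, 7, 5, 6],
     'loc.2': [3, 4, 3, 7, 7, 8, 6],
     'loc.3': [1, 4, 3, 1, 7, 8, 6]}

# Transpose so that each key becomes a row.
df = pd.DataFrame(a).transpose()

# Turn every row into a plain Python list, regardless of how many columns exist.
lists = pd.Series(df.values.tolist(), index=df.index, name='lists')
print(lists)
```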

Answer 2: (score: 1)

Depending on the format you need, there are two options:

import numpy as np
import pandas as pd

d = {'loc.1': np.array([  1,2,3,4,7,5,6]),
     'loc.2': np.array([  3,4,3,7,7,8,6]),
     'loc.3': np.array([  1,4,3,1,7,8,6])}

res1 = pd.DataFrame([[x] for x in d.values()], index=d.keys())

#                            0
# loc.1  [1, 2, 3, 4, 7, 5, 6]
# loc.2  [3, 4, 3, 7, 7, 8, 6]
# loc.3  [1, 4, 3, 1, 7, 8, 6]

res2 = pd.DataFrame([', '.join(list(map(str, x))) for x in d.values()], index=d.keys())

#                          0
# loc.1  1, 2, 3, 4, 7, 5, 6
# loc.2  3, 4, 3, 7, 7, 8, 6
# loc.3  1, 4, 3, 1, 7, 8, 6
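
Both results keep the keys in the index and leave the value column named 0. To get closer to the two-column layout shown in the question, the index can be turned into a regular column. This is only a sketch: the names `Col1` and `Col2` are assumptions based on the question's header, and `res3` is a name introduced here.

```python
import numpy as np
import pandas as pd

d = {'loc.1': np.array([1, 2, 3, 4, 7, 5, 6]),
     'loc.2': np.array([3, 4, 3, 7, 7, 8, 6]),
     'loc.3': np.array([1, 4, 3, 1, 7, 8, 6])}

# Join each array into a comma-separated string, keyed by the dictionary key.
s = pd.Series({k: ','.join(map(str, v)) for k, v in d.items()})

# Name the index, then move it into a column next to the joined values.
res3 = s.rename_axis('Col1').reset_index(name='Col2')
print(res3)
```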

Answer 3: (score: 1)

You can use stack together with groupby (df is the DataFrame from the question):
df.stack().astype(str).groupby(level=1).apply(','.join)
Out[738]: 
loc.1    1,2,3,4,7,5,6
loc.2    3,4,3,7,7,8,6
loc.3    1,4,3,1,7,8,6
dtype: object
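
As an alternative sketch that is not part of the original answer, a column-wise apply yields the same strings without reshaping. It assumes df is built with pd.DataFrame(array) as in the question; the name `out` is introduced here.

```python
import numpy as np
import pandas as pd

array = {'loc.1': np.array([1, 2, 3, 4, 7, 5, 6]),
         'loc.2': np.array([3, 4, 3, 7, 7, 8, 6]),
         'loc.3': np.array([1, 4, 3, 1, 7, 8, 6])}
df = pd.DataFrame(array)

# Convert every value to str, then join each column into one string.
out = df.astype(str).apply(','.join)
print(out)
```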