pandas: aggregate a column of lists into a single list

Time: 2017-10-23 19:32:43

Tags: python-3.x pandas dataframe aggregate

I have the following DataFrame, my_df:

name         numbers
----------------------
A             [4,6]
B             [3,7,1,3]
C             [2,5]
D             [1,2,3]

I want to combine all the numbers into one new list, so the output should be:

 new_numbers
---------------
[4,6,3,7,1,3,2,5,1,2,3]

Here is my code:

def combine_list(my_lists):
    new_list = []
    for x in my_lists:
        new_list.append(x)

    return new_list

new_df = my_df.agg({'numbers': combine_list})

new_df is still the same as the original:

              numbers
----------------------
0             [4,6]
1             [3,7,1,3]
2             [2,5]
3             [1,2,3]

What am I doing wrong? How can I make new_df look like this:

 new_numbers
---------------
[4,6,3,7,1,3,2,5,1,2,3]

Thanks!

4 Answers:

Answer 0 (score: 4)

You need to flatten the values, then create a new DataFrame with the constructor:

flatten = [item for sublist in df['numbers'].values.tolist() for item in sublist]

Or:

import numpy as np

flatten = np.concatenate(df['numbers'].values).tolist()

Or:

from itertools import chain

flatten = list(chain.from_iterable(df['numbers'].values.tolist()))
df1 = pd.DataFrame({'numbers':[flatten]})
print (df1)
                             numbers
0  [4, 6, 3, 7, 1, 3, 2, 5, 1, 2, 3]
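For anyone who wants to run this end to end, here is a minimal sketch of the `chain.from_iterable` variant. The sample frame is rebuilt from the question, and the column name `new_numbers` is taken from the desired output there; treat it as an illustration rather than the only way to do it.

```python
from itertools import chain

import pandas as pd

# Sample frame rebuilt from the question.
my_df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "numbers": [[4, 6], [3, 7, 1, 3], [2, 5], [1, 2, 3]],
})

# Flatten the column of lists into one list, preserving row order.
flatten = list(chain.from_iterable(my_df["numbers"]))

# Wrap the flat list in an outer list so it becomes a single cell
# instead of one row per number.
new_df = pd.DataFrame({"new_numbers": [flatten]})
print(new_df)
```

The outer brackets in `[flatten]` matter: without them the constructor would spread the numbers over eleven rows.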

Answer 1 (score: 1)

You can use df['numbers'].sum(), which returns the combined list, to create a new DataFrame:

new_df = pd.DataFrame({'new_numbers': [df['numbers'].sum()]})

    new_numbers
0   [4, 6, 3, 7, 1, 3, 2, 5, 1, 2, 3]
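A self-contained sketch of the `sum()` approach, with the sample frame rebuilt from the question. Summing an object column adds the lists with `+`, which concatenates them; each addition copies the list built so far, so on a long column the flattening approaches are likely to be cheaper.

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "numbers": [[4, 6], [3, 7, 1, 3], [2, 5], [1, 2, 3]],
})

# On a column of Python lists, sum() concatenates the lists in row order.
combined = df["numbers"].sum()

# Wrap the result in an outer list so it ends up in one cell.
new_df = pd.DataFrame({"new_numbers": [combined]})
print(new_df)
```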

Answer 2 (score: 0)

This should do it:

newdf = pd.DataFrame({'numbers':[[x for i in mydf['numbers'] for x in i]]})

Answer 3 (score: 0)

Check out this question: pandas groupby and join lists

What you are looking for is:

my_df = my_df.groupby(['name']).agg(sum)
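A rough sketch of what grouping does here, on hypothetical data in which some names repeat (the question's names are all unique, so grouping by `name` there would give back one list per row rather than a single list). As an assumption for illustration, it swaps `agg(sum)` for `apply` with `chain.from_iterable`, so the example does not rely on how pandas handles the builtin `sum` on a column of lists.

```python
from itertools import chain

import pandas as pd

# Hypothetical data: unlike the question, the name "A" appears twice.
df = pd.DataFrame({
    "name": ["A", "A", "B"],
    "numbers": [[4, 6], [3, 7], [2, 5]],
})

# Join the lists within each name; the result is a Series indexed by name.
per_name = df.groupby("name")["numbers"].apply(
    lambda s: list(chain.from_iterable(s))
)
print(per_name)
```

This yields one combined list per name. To end up with a single list for the whole column, the per-name lists would still need to be flattened as in the other answers.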