Repeated combinations of DataFrames

Time: 2017-03-28 11:56:46

Tags: python pandas

I have two DataFrames, df1 and df2:

df1 = pd.DataFrame(['A1','A2']) 
    0
0  A1
1  A2
df2 = pd.DataFrame(pd.date_range('2016-01-01',periods = 2, freq = '1D'))
           0
0 2016-01-01
1 2016-01-02

How can I get this DataFrame, with every row of df1 paired with every row of df2?

    0    1
0  A1  2016-01-01
1  A1  2016-01-02
2  A2  2016-01-01
3  A2  2016-01-02

2 answers:

Answer 0 (score: 4)

You can use itertools:

import itertools as it

pd.DataFrame(list(it.product(df1[0], df2[0])))
    0          1
0  A1 2016-01-01
1  A1 2016-01-02
2  A2 2016-01-01
3  A2 2016-01-02

it.product returns a lazy iterator, so you need to convert it to a list before passing it to the DataFrame constructor.

it.product yields every combination of one element from each of the iterables it is given (their Cartesian product), for example:

["".join(i) for i in it.product("ABC", "ABC")]
['AA', 'AB', 'AC', 'BA', 'BB', 'BC', 'CA', 'CB', 'CC']
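
As a minimal sketch of the two points above, the snippet below shows that the iterator returned by it.product can only be consumed once, and that the resulting pairs can be given column names. The names `label` and `date` are illustrative assumptions and are not part of the question.

```python
import itertools as it

import pandas as pd

df1 = pd.DataFrame(['A1', 'A2'])
df2 = pd.DataFrame(pd.date_range('2016-01-01', periods=2, freq='D'))

# it.product is lazy: nothing is computed until the iterator is consumed.
pairs = it.product(df1[0], df2[0])

# Materialize all len(df1) * len(df2) pairs as a list of tuples.
rows = list(pairs)

# The iterator is exhausted now, so a second pass yields nothing.
leftover = list(pairs)

# Hypothetical column names, chosen only for readability.
result = pd.DataFrame(rows, columns=['label', 'date'])
print(result)
```

Because the left-hand iterable varies slowest, each value of df1 is paired with all dates from df2 before the next value of df1 appears.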

Answer 1 (score: 2)

You can expand both DataFrames to the same length and then join them side by side with pandas.concat.

import pandas as pd
# test data
df1 = pd.DataFrame(['A1','A2']) 
df2 = pd.DataFrame(pd.date_range('2016-01-01',periods = 2, freq = '1D'))

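# expand dataframes to cover all variants and get the same lengths:
# repeat each row of df1 len(df2) times (A1, A1, A2, A2) ...
df3 = df1.loc[df1.index.repeat(len(df2))].reset_index(drop=True)
# ... and tile df2 as a whole len(df1) times (01-01, 01-02, 01-01, 01-02)
df4 = pd.concat([df2]*len(df1), ignore_index=True)

# final concat to merge dataframes column-wise
print(pd.concat([df3, df4], axis=1, ignore_index=True))

Output:

    0          1
0  A1 2016-01-01
1  A1 2016-01-02
2  A2 2016-01-01
3  A2 2016-01-02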
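
Neither answer mentions it, but the same pairing can also be sketched as a cross join. This assumes pandas 1.2 or later, where `DataFrame.merge` accepts `how='cross'`. The column names `label` and `date` are hypothetical and are introduced only to avoid two columns both named 0.

```python
import pandas as pd

df1 = pd.DataFrame(['A1', 'A2'])
df2 = pd.DataFrame(pd.date_range('2016-01-01', periods=2, freq='D'))

# Rename the default column 0 on each side (illustrative names).
left = df1.rename(columns={0: 'label'})
right = df2.rename(columns={0: 'date'})

# how='cross' pairs every row of `left` with every row of `right`.
# Assumption: pandas >= 1.2; older versions do not support this option.
out = left.merge(right, how='cross')
print(out)
```

The result has len(df1) * len(df2) rows, with the rows of the left frame varying slowest.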