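I have a dataset of, say, 3 items, for example [1,2,3].
I want to compute its Cartesian product with repeat=3 and then split the result into 3 datasets like the ones below (they should really be vertical, i.e. columns):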
[1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3]
[1,1,1,2,2,2,3,3,3,1,1,1,2,2,2,3,3,3,1,1,1,2,2,2,3,3,3]
[1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3]
I noticed that in Python I can use itertools.product to compute the product:
data_prod=itertools.product(data,repeat=3)
My question now is: how do I convert each column of the result (whose type is itertools.product) into 3 new datasets, as in the example above?
Answer 0: (score: 2)
Use zip(*...) to turn the columns into rows:
dataset1, dataset2, dataset3 = zip(*itertools.product(data,repeat=3))
Demo:
>>> list(zip(*itertools.product(data,repeat=3)))
[(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3), (1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3), (1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3)]
>>> dataset1, dataset2, dataset3 = zip(*itertools.product(data,repeat=3))
>>> dataset1
(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3)
>>> dataset2
(1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3)
>>> dataset3
(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3)
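In Python 3, zip returns a lazy iterator rather than a list, so to inspect the whole result you would wrap it in list(); tuple unpacking works either way. A minimal self-contained sketch of the same idea, assuming `data` is the [1, 2, 3] list from the question (the name `columns` is only chosen for illustration):

```python
import itertools

data = [1, 2, 3]  # sample input from the question

# product yields one 3-tuple per row; zip(*rows) regroups the values by column
columns = list(zip(*itertools.product(data, repeat=3)))
dataset1, dataset2, dataset3 = columns

print(len(columns))   # number of columns
print(len(dataset1))  # number of rows in the product
print(dataset3)
```

Each resulting dataset is a tuple; call list(dataset1) if a mutable list is needed.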
Answer 1: (score: 0)
Another approach, for display purposes, still using itertools.product:
import itertools
import pandas as pd

cols = ['series1', 'series2', 'series3']
originDataset = [1, 2, 3]

# Build every 3-element tuple of the product as a list of rows
data_prod = lambda x: list(itertools.product(x, repeat=3))

df1 = pd.DataFrame(originDataset, columns=['OriginalDataSet'])
df2 = pd.DataFrame(data_prod(originDataset), columns=cols)

print(df1)
print('-' * 80)
print(df2)
print('-' * 80)

# Transpose so that each column of df2 becomes one array
series1, series2, series3 = df2.T.values
print(series1)
print(series2)
print(series3)
Output:
   OriginalDataSet
0                1
1                2
2                3
--------------------------------------------------------------------------------
    series1  series2  series3
0         1        1        1
1         1        1        2
2         1        1        3
3         1        2        1
4         1        2        2
5         1        2        3
6         1        3        1
7         1        3        2
8         1        3        3
9         2        1        1
10        2        1        2
11        2        1        3
12        2        2        1
13        2        2        2
14        2        2        3
15        2        3        1
16        2        3        2
17        2        3        3
18        3        1        1
19        3        1        2
20        3        1        3
21        3        2        1
22        3        2        2
23        3        2        3
24        3        3        1
25        3        3        2
26        3        3        3
--------------------------------------------------------------------------------
[1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3]
[1 1 1 2 2 2 3 3 3 1 1 1 2 2 2 3 3 3 1 1 1 2 2 2 3 3 3]
[1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3]
I would also like to know how to do this with Pandas.