      col1     col2     col3
0  banana1  banana2  banana2
1   apple1   apple2   apple3
2  monkey1  monkey2  monkey3
3  iphone1  iphone2  iphone3
4  runner1  runner2  runner3
5     pig1     pig2     pig3
6    wifi1    wifi2    wifi3
7    girl1    girl2    girl3
8     boy1     boy2     boy3
9  couple1  couple2  couple3
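How can I randomly select 1 of the 3 elements in each row and append it to a new DataFrame, repeating this N times, then move on to a new row and again append 1 of the 3 elements N times?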
import pandas as pd
data = {'col1': ['banana1', 'apple1', 'monkey1', 'iphone1', 'runner1', 'pig1', 'wifi1', 'girl1', 'boy1', 'couple1'],
        'col2': ['banana2', 'apple2', 'monkey2', 'iphone2', 'runner2', 'pig2', 'wifi2', 'girl2', 'boy2', 'couple2'],
        'col3': ['banana2', 'apple3', 'monkey3', 'iphone3', 'runner3', 'pig3', 'wifi3', 'girl3', 'boy3', 'couple3']}
df = pd.DataFrame(data, columns=['col1', 'col2', 'col3'])
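So what I want to do is, for each row, randomly select item1, item2 or item3 and append it to a new row in a new DataFrame. Once the 10th item has been selected, I want it to start over N times, then move to a new row in the new DataFrame and loop N times again. I want to end up with something like this (random):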
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
banana3 apple2 monkey1 iphone2 runner2 pig1 wifi2 girl3 boy1 couple1 banana1 apple2 monkey2 iphone3 runner3 pig3 wifi2 girl1 boy1 couple3
...........................................................................................................................................
...........................................................................................................................................
...........................................................................................................................................
banana1 apple2 monkey2 iphone3 runner1 pig2 wifi3 girl1 boy3 couple2 banana2 apple1 monkey2 iphone2 runner2 pig1 wifi2 girl3 boy1 couple2
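In this output, the pass that picks 1 of 3 from each row is repeated 2 times per output row, for N rows in the new DataFrame.
I would like to do this with a function that gives me the desired output based on n and N.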
new_df = []
def rand_element_selection(n, N):
    # Incomplete attempt: n and N are not used yet
    for row in df.iterrows():
        element_holder = df.sample(1)
        new_df.append(element_holder)
n and N are not used above because I am struggling to move forward.
Answer 0 (score: 1)
IIUC, you can call sample with axis=1 and transpose:
In [172]:
n=3
N=2
df_list=[]
for i in range(n):
    df_list.append(pd.concat([df.sample(1, axis=1).T.reset_index(drop=True) for j in range(N)], axis=1, ignore_index=True))
pd.concat(df_list, ignore_index=True)
Out[172]:
0 1 2 3 4 5 6 7 8 \
0 banana2 apple3 monkey3 iphone3 runner3 pig3 wifi3 girl3 boy3
1 banana2 apple2 monkey2 iphone2 runner2 pig2 wifi2 girl2 boy2
2 banana2 apple2 monkey2 iphone2 runner2 pig2 wifi2 girl2 boy2
9 10 11 12 13 14 15 16 17 \
0 couple3 banana2 apple3 monkey3 iphone3 runner3 pig3 wifi3 girl3
1 couple2 banana1 apple1 monkey1 iphone1 runner1 pig1 wifi1 girl1
2 couple2 banana2 apple3 monkey3 iphone3 runner3 pig3 wifi3 girl3
18 19
0 boy3 couple3
1 boy1 couple1
2 boy3 couple3
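One caveat worth checking before relying on this: `df.sample(1, axis=1)` draws one whole column, so within each block of ten values every item comes from the same column (visible in the output above, e.g. apple3, monkey3, iphone3, ... in one block). A minimal sketch that verifies this on a cut-down frame; `small_df` and `random_state=0` are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

# Hypothetical cut-down version of the question's frame.
small_df = pd.DataFrame({
    'col1': ['apple1', 'monkey1', 'iphone1'],
    'col2': ['apple2', 'monkey2', 'iphone2'],
    'col3': ['apple3', 'monkey3', 'iphone3'],
})

# sample(1, axis=1) returns a one-column DataFrame: a single column is drawn
# and all of its rows are kept.
picked = small_df.sample(1, axis=1, random_state=0)
picked_label = picked.columns[0]

# Every value therefore comes from the same source column.
same_column = picked[picked_label].equals(small_df[picked_label])
print(picked_label, same_column)
```

If each row should get its own independent draw, the selection has to happen per row rather than per column.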
Answer 1 (score: 0)
The concatenation is mostly taken from EdChum's answer:
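import numpy as np
n=3
N=2
df_list=[]
for i in range(n):
    # One random element per row, repeated N times and joined end to end
    df_list.append(pd.concat([df.apply(np.random.choice, axis=1) for j in range(N)], ignore_index=True))
# Each Series becomes one row of the result
new_df = pd.concat(df_list, axis=1, ignore_index=True).T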
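Since the question asks for a function of n and N, the same per-row idea could be wrapped up roughly as follows. This is only a sketch and not code from either answer: it swaps `df.apply` for NumPy index arrays, and the `seed` parameter and `example_df` are illustrative additions.

```python
import numpy as np
import pandas as pd

def rand_element_selection(df, n, N, seed=None):
    """Return n rows; each row is N passes over df, one random column per source row."""
    rng = np.random.default_rng(seed)
    values = df.to_numpy()
    n_rows, n_cols = values.shape
    # Source-row index for each output column: 0..n_rows-1, repeated N times.
    row_idx = np.tile(np.arange(n_rows), N)
    # Independent random column index for every output cell.
    col_idx = rng.integers(0, n_cols, size=(n, N * n_rows))
    # row_idx (length N*n_rows) broadcasts against col_idx (n x N*n_rows).
    return pd.DataFrame(values[row_idx, col_idx])

# Hypothetical two-row frame to keep the example small.
example_df = pd.DataFrame({
    'col1': ['apple1', 'pig1'],
    'col2': ['apple2', 'pig2'],
    'col3': ['apple3', 'pig3'],
})
result = rand_element_selection(example_df, n=3, N=2, seed=0)
print(result.shape)  # (3, 4)
```

With the question's 10-row df, `rand_element_selection(df, n=3, N=2)` would give a 3 x 20 frame, matching the shape of the outputs shown earlier.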