假设我有以下数据框
from pandas import DataFrame
Cars = { 'value': [10, 31, 661, 1, 51, 61, 551],
'action1': [1, 1, 1, 1, 1, 1, 1],
'price1': [ 12,0, 15,3, 0, 12,0],
'action2': [2, 2, 2, 2, 2, 2, 2],
'price2': [ 0, 16, 19, 0, 1, 10,0],
'action3': [3, 3, 3, 3, 3, 3, 3],
'price3': [ 14, 36, 9, 0, 0, 0,0]
}
df = DataFrame(Cars,columns= ['value', 'action1', 'price1', 'action2', 'price2', 'action3', 'price3'])
print (df)
如何在3列中随机选择值(操作和价格)?结果,我想要一个看起来像这样的数据框?
RandCars = {'value': [10, 31, 661, 1, 51, 61, 551],
'action': [1, 3, 1, 3, 1, 2, 2],
'price': [ 12, 36, 15, 0, 3, 10, 0]
}
df2 = DataFrame(RandCars, columns = ['value','action', 'price'])
print(df2)
答案 0 :(得分:2)
使用:
#get columns names not starting by action or price
cols = df.columns[~df.columns.str.startswith(('action','price'))]
print (cols)
Index(['value'], dtype='object')
#convert filtered columns to 2 numpy arrays
arr1 = df.filter(regex='^action').values
arr2 = df.filter(regex='^price').values
#pandas 0.24+
#arr1 = df.filter(regex='^action').to_numpy()
#arr2 = df.filter(regex='^price').to_numpy()
i, c = arr1.shape
#create random positions of both DataFrames to new df
idx = np.random.choice(np.arange(c), i)
df3 = pd.DataFrame({'action': arr1[np.arange(len(df)), idx],
'price': arr2[np.arange(len(df)), idx]},
index=df.index)
print (df3)
action price
0 2 0
1 3 36
2 3 9
3 1 3
4 3 0
5 1 12
6 1 0
#add all another columns by join
df4 = df[cols].join(df3)
print (df4)
value action price
0 10 2 0
1 31 3 36
2 661 3 9
3 1 1 3
4 51 3 0
5 61 1 12
6 551 1 0