Randomly select a value from different columns?

Time: 2019-06-16 13:19:48

Tags: python pandas dataframe

Suppose I have the following DataFrame:

from pandas import DataFrame

Cars = {'value':   [10, 31, 661, 1, 51, 61, 551],
        'action1': [1, 1, 1, 1, 1, 1, 1],
        'price1':  [12, 0, 15, 3, 0, 12, 0],
        'action2': [2, 2, 2, 2, 2, 2, 2],
        'price2':  [0, 16, 19, 0, 1, 10, 0],
        'action3': [3, 3, 3, 3, 3, 3, 3],
        'price3':  [14, 36, 9, 0, 0, 0, 0]
        }
df = DataFrame(Cars, columns=['value', 'action1', 'price1', 'action2', 'price2', 'action3', 'price3'])
print(df)

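How can I randomly pick one value pair (action and price) per row from among the 3 action/price column pairs? As a result, I would like a DataFrame that looks something like this: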

RandCars = {'value':  [10, 31, 661, 1, 51, 61, 551],
            'action': [1, 3, 1, 3, 1, 2, 2],
            'price':  [12, 36, 15, 0, 3, 10, 0]
            }

df2 = DataFrame(RandCars, columns=['value', 'action', 'price'])
print(df2)

1 Answer:

Answer 0: (score: 2)

Use:

import numpy as np
import pandas as pd

# get the column names that do not start with 'action' or 'price'
cols = df.columns[~df.columns.str.startswith(('action', 'price'))]
print(cols)
# Index(['value'], dtype='object')

# convert the filtered columns to two 2-D NumPy arrays (one column per action/price pair)
arr1 = df.filter(regex='^action').values
arr2 = df.filter(regex='^price').values
# pandas 0.24+
# arr1 = df.filter(regex='^action').to_numpy()
# arr2 = df.filter(regex='^price').to_numpy()
# i = number of rows, c = number of action/price pairs
i, c = arr1.shape

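# pick one random column position per row and use it to index both arrays,
# so that each action stays paired with its own price
idx = np.random.choice(np.arange(c), i)
df3 = pd.DataFrame({'action': arr1[np.arange(len(df)), idx],
                    'price':  arr2[np.arange(len(df)), idx]},
                   index=df.index)
print(df3)
# example output (random, so it varies between runs):
#    action  price
# 0       2      0
# 1       3     36
# 2       3      9
# 3       1      3
# 4       3      0
# 5       1     12
# 6       1      0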

# add all the other columns back with join
df4 = df[cols].join(df3)
print(df4)
# example output (matches the random draw above):
#    value  action  price
# 0     10       2      0
# 1     31       3     36
# 2    661       3      9
# 3      1       1      3
# 4     51       3      0
# 5     61       1     12
# 6    551       1      0
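
The steps above could also be wrapped into a reusable function. The following is only a sketch under a few assumptions: the name `pick_random_pair` and its `prefixes` and `seed` parameters are hypothetical (not part of pandas), it swaps `np.random.choice` for NumPy's `default_rng` generator so that a draw can be reproduced with a seed, and it assumes the columns for each prefix appear in the same order (`action1, action2, ...` alongside `price1, price2, ...`).

```python
import numpy as np
import pandas as pd


def pick_random_pair(df, prefixes=("action", "price"), seed=None):
    """Hypothetical helper: pick one column position per row at random
    and take the value at that position from every prefix group."""
    rng = np.random.default_rng(seed)
    prefixes = tuple(prefixes)
    # Columns that belong to none of the prefix groups (here: 'value').
    other = [c for c in df.columns if not c.startswith(prefixes)]
    # One 2-D array per prefix; column k of each array is assumed to
    # belong to the same pair (action1/price1, action2/price2, ...).
    blocks = {p: df.filter(regex=f"^{p}").to_numpy() for p in prefixes}
    n_rows, n_pairs = blocks[prefixes[0]].shape
    # One random pair index per row, shared by all prefixes so that
    # each action stays aligned with its price.
    idx = rng.integers(n_pairs, size=n_rows)
    rows = np.arange(n_rows)
    picked = pd.DataFrame({p: arr[rows, idx] for p, arr in blocks.items()},
                          index=df.index)
    return df[other].join(picked)


df = pd.DataFrame({
    "value":   [10, 31, 661, 1, 51, 61, 551],
    "action1": [1, 1, 1, 1, 1, 1, 1],
    "price1":  [12, 0, 15, 3, 0, 12, 0],
    "action2": [2, 2, 2, 2, 2, 2, 2],
    "price2":  [0, 16, 19, 0, 1, 10, 0],
    "action3": [3, 3, 3, 3, 3, 3, 3],
    "price3":  [14, 36, 9, 0, 0, 0, 0],
})

# Passing a seed makes the random draw repeatable; omit it for a fresh draw.
result = pick_random_pair(df, seed=0)
print(result)
```

Sharing a single `idx` array across both groups is what keeps each selected price on the same row and in the same pair as the selected action.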