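Below is my code, which currently uses two nested for loops to process an input df: the outer loop defines the number of iterations, and the inner loop compares values from the df against random numbers that it generates.
The current approach gives me the correct output, but I suspect it can be done in a better way, especially when the outer loop runs for more than a few million iterations and the number of columns in the df approaches one hundred.
I was wondering whether there is a trick or two I could try to implement.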
import numpy as np
import pandas as pd

# Input df - index is same length as num iterations for inner loop defined below
# 'cumulative' column value is used for comparison against random number inside inner loop
# 'units_A' is useful data captured from each iteration of inner loop that is aggregated after exiting inner loop
df_reference = pd.DataFrame(index=np.arange(1,11,1),data={'cumulative':np.arange(0.1,1.1,0.1),'units_A':np.arange(10,101,10)})
# Variable that determines num rows in output df
num_iterations_outer = 20
# Variable that determines number of iterations for inner loop operation
num_iterations_inner = 10
# Create an empty output df that will be updated at end
df_out = pd.DataFrame(columns=['cumulative','units_A'])
# Using np array for comparison inside loop instead of comparing against column which takes much longer
compare_against_arr = df_reference['cumulative'].values
# Create a list to store df's that will become rows of output df. This is done to store to list and concat once vs. concat each df at a time within loop
output_df_rows_list = []
for outer_iteration_num in np.arange(num_iterations_outer):
    #current_cumulative_val = 1
    # Rotation num is reset to 1 at the start of every outer iteration
    current_rotation_num = 1
    # Create an empty list to store every rotation_num generated by the inner loop
    rotations_list = []
    for inner_iteration_num in np.arange(1, num_iterations_inner + 1):
        # Get a random number in the half-open interval [0.0, 1.0)
        comparator = np.random.random()
        # Add the current rotation num to the list created before entering inner loop. Use the rotations list to get corresponding units_A after exiting inner loop
        rotations_list.append(current_rotation_num)
        # Compare random num 'comparator' to cumulative value corresponding to current rotation
        # (rotation numbers start at 1, array positions start at 0, hence the - 1)
        if comparator < compare_against_arr[current_rotation_num - 1]:
            # Reset rotation_num back to 1
            current_rotation_num = 1
        else:
            # Increment rotation_num
            current_rotation_num += 1
    # Look up the reference row for every rotation visited, then sum them into a single output row
    df_units_A_by_rotation = df_reference.reindex(rotations_list)
    df_units_A_agg_outer_iter = pd.DataFrame(data=df_units_A_by_rotation.sum()).transpose()
    output_df_rows_list.append(df_units_A_agg_outer_iter)
# Output df is created by concatenating all df stored in list that was updated in outer loop above
df_out = pd.concat(output_df_rows_list)
# Reset index so that it matches num_outer_iterations
df_out.index = np.arange(num_iterations_outer)
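
One direction that might be worth trying, sketched here under assumptions rather than as a definitive answer: since the inner loop is short, keep it as the only Python loop and advance every outer iteration at once with NumPy arrays, accumulating the column sums directly instead of building one DataFrame per outer iteration. The function name `simulate_rotations` and its `rng` parameter are hypothetical, introduced only for illustration. The sketch compares each draw against the cumulative value of the current rotation, as the comments above describe, and assumes the last cumulative value is 1 so that the final rotation always resets.

```python
import numpy as np
import pandas as pd


def simulate_rotations(df_reference, num_iterations_outer, num_iterations_inner, rng=None):
    """Hypothetical helper: run all outer iterations in parallel with NumPy arrays."""
    rng = np.random.default_rng() if rng is None else rng
    # Row i of these arrays holds the data for rotation number i + 1
    values = df_reference.to_numpy(dtype=float)
    cumulative = df_reference['cumulative'].to_numpy()
    last_pos = len(cumulative) - 1

    # 0-based rotation position of every outer iteration (all start at rotation 1)
    pos = np.zeros(num_iterations_outer, dtype=np.intp)
    # Running column sums, one row per outer iteration
    totals = np.zeros((num_iterations_outer, values.shape[1]))

    for _ in range(num_iterations_inner):
        # Add the reference row of the current rotation to each running sum
        totals += values[pos]
        # One random draw in [0.0, 1.0) per outer iteration
        draws = rng.random(num_iterations_outer)
        # Reset when the draw is below the current cumulative value.
        # Assumption: the final rotation always resets, so pos stays in range.
        reset = (draws < cumulative[pos]) | (pos == last_pos)
        pos = np.where(reset, 0, pos + 1)

    return pd.DataFrame(totals, columns=df_reference.columns)


# Usage with the same reference data as above
df_reference = pd.DataFrame(
    index=np.arange(1, 11, 1),
    data={'cumulative': np.arange(0.1, 1.1, 0.1), 'units_A': np.arange(10, 101, 10)},
)
df_out_fast = simulate_rotations(df_reference, num_iterations_outer=20, num_iterations_inner=10)
```

The temporary `values[pos]` has one row per outer iteration, so with several million iterations and around a hundred columns it may be necessary to run this in chunks and concatenate the results.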
Thank you for your time and attention!