I am trying to dynamically append single values generated in a loop to a DataFrame.
import numpy as np
import pandas as pd

global results_df
results_df = pd.DataFrame()
avg = 109
std_dev = 12
# Loop through many simulations
for i in range(1000):
    # Choose random inputs
    rev_sim = np.random.normal(avg, std_dev, 1).round(0)  # Rounding to 0 decimals
    # Build the dataframe based on the inputs
    df_res = pd.DataFrame(data={'REV_SIM': rev_sim})
    results_df.append(df_res)
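But my results_df is empty.
Answer 0: (score: 1)
You are not assigning the result back. DataFrame.append returns a new DataFrame rather than modifying results_df in place: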
for i in range(1000):
    # Choose random inputs
    rev_sim = np.random.normal(avg, std_dev, 1).round(0)  # Rounding to 0 decimals
    # Build the dataframe based on the inputs
    df_res = pd.DataFrame(data={'REV_SIM': rev_sim})
    results_df = results_df.append(df_res)  # assign it back
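One caveat for newer installations: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above only runs on older versions. A minimal sketch of the same idea using pd.concat, assuming the per-iteration frames are collected in a list first (the name frames is introduced here for illustration and is not from the original answer):

```python
import numpy as np
import pandas as pd

avg = 109
std_dev = 12

# Collect the one-row frames in a plain Python list (hypothetical name: frames)
frames = []
for i in range(1000):
    # Choose random inputs, rounded to 0 decimals
    rev_sim = np.random.normal(avg, std_dev, 1).round(0)
    # Build the one-row dataframe based on the inputs
    frames.append(pd.DataFrame(data={'REV_SIM': rev_sim}))

# Concatenate once at the end instead of growing a DataFrame per iteration
results_df = pd.concat(frames, ignore_index=True)
```

Concatenating once also avoids copying the growing DataFrame on every iteration.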
Answer 1: (score: 0)
Why not generate all the values in one call instead of looping?
import pandas as pd
import numpy as np
avg = 109
std_dev = 12
N = 1000
rev_sim = np.random.normal(avg, std_dev, N).round(0)
df = pd.DataFrame({'REV_SIM':rev_sim})
Update:
Timings
Wen-Ben's solution
%%timeit -n10
global results_df
results_df = pd.DataFrame()
for i in range(1000):
    # Choose random inputs
    rev_sim = np.random.normal(avg, std_dev, 1).round(0)  # Rounding to 0 decimals
    # Build the dataframe based on the inputs
    df_res = pd.DataFrame(data={'REV_SIM': rev_sim})
    results_df = results_df.append(df_res)  # assign it back
1.08 s ± 36.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
My solution
%%timeit -n10
N = 1000
rev_sim = np.random.normal(avg, std_dev, N).round(0)
result_df = pd.DataFrame({'REV_SIM':rev_sim})
748 µs ± 153 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
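If you really do need to generate the entries in a loop, it is better to collect them in a list or array first and then build the DataFrame from it in a single step: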
%%timeit -n10
rev_sim = [np.random.normal(avg, std_dev, 1).round(0) for i in range(1000)]
result_df = pd.DataFrame({'REV_SIM':rev_sim})
6.55 ms ± 888 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
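For cases where each value genuinely has to be produced inside an explicit loop, the collect-then-build pattern can be sketched as follows. This is an illustrative variant, not the timed code above: it assumes a plain scalar per iteration is wanted (rather than a one-element array), and the seeded generator rng and the list name values are introduced here only for the example.

```python
import numpy as np
import pandas as pd

avg = 109
std_dev = 12

# Seeded generator so the sketch is reproducible (assumption, not in the original)
rng = np.random.default_rng(0)

values = []
for i in range(1000):
    # Draw one scalar per iteration and round it to 0 decimals
    values.append(np.round(rng.normal(avg, std_dev), 0))

# Build the DataFrame once, after the loop has finished
result_df = pd.DataFrame({'REV_SIM': values})
```

The DataFrame is constructed exactly once, so the cost of the loop is only the cost of appending to a Python list.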
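The last version is about 8.8 times slower than the one I suggested (6.55 ms vs. 748 µs), while Wen-Ben's solution is about 1444 times slower (1.08 s vs. 748 µs).
pandas can become very slow when used inside loops.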