Pandas DataFrame: appending values in a loop

Asked: 2019-04-01 16:48:20

Tags: pandas

I am trying to dynamically append single values generated in a loop to a DataFrame.

import numpy as np
import pandas as pd

global results_df
results_df=pd.DataFrame()

avg =109

std_dev = 12

# Loop through many simulations
for i in range(1000):    
    # Choose random inputs 
    rev_sim = np.random.normal(avg, std_dev, 1).round(0)#Rounding to 0 decimals

    # Build the dataframe based on the inputs
    df_res = pd.DataFrame(data={'REV_SIM': rev_sim})
    results_df.append(df_res)

But my results_df ends up empty.

2 answers:

Answer 0 (score: 1)

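You are not assigning the result back. DataFrame.append returns a new DataFrame and leaves the original unchanged, so results_df stays empty unless you rebind it: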

for i in range(1000):    
    # Choose random inputs 
    rev_sim = np.random.normal(avg, std_dev, 1).round(0)#Rounding to 0 decimals

    # Build the dataframe based on the inputs
    df_res = pd.DataFrame(data={'REV_SIM': rev_sim})
    results_df=results_df.append(df_res)# assign it back 
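Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above raises `AttributeError` on current versions. A minimal sketch of the same loop using `pd.concat`, assuming pandas 2.x; the list name `frames` is only illustrative and not from the original post:

```python
import numpy as np
import pandas as pd

avg = 109
std_dev = 12

frames = []  # hypothetical name: collects one small DataFrame per simulation
for i in range(1000):
    # Choose random inputs, rounded to 0 decimals
    rev_sim = np.random.normal(avg, std_dev, 1).round(0)
    # list.append mutates the list in place, unlike DataFrame.append
    frames.append(pd.DataFrame(data={'REV_SIM': rev_sim}))

# Concatenate once at the end instead of growing a DataFrame on every iteration
results_df = pd.concat(frames, ignore_index=True)
```

Concatenating once avoids copying the accumulated frame on every pass, though it still builds 1000 tiny DataFrames.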

Answer 1 (score: 0)

Why not draw all the samples in a single call and skip the loop entirely?

import pandas as pd
import numpy as np

avg = 109
std_dev = 12

N  = 1000
rev_sim = np.random.normal(avg, std_dev, N).round(0)
df = pd.DataFrame({'REV_SIM':rev_sim})
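A quick sanity check on the vectorized version can look like the sketch below. It uses `np.random.default_rng` with a fixed seed purely so the run is reproducible; the seed value and the `rng` name are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

avg = 109
std_dev = 12
N = 1000

rng = np.random.default_rng(0)  # assumption: fixed seed, only for reproducibility
rev_sim = rng.normal(avg, std_dev, N).round(0)  # N draws in one call
df = pd.DataFrame({'REV_SIM': rev_sim})

print(len(df))               # number of simulated rows
print(df['REV_SIM'].mean())  # should land close to avg
print(df['REV_SIM'].std())   # should land close to std_dev
```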

Update:

Timings

Wen-Ben's solution

%%timeit -n10
global results_df
results_df=pd.DataFrame()

for i in range(1000):    
    # Choose random inputs 
    rev_sim = np.random.normal(avg, std_dev, 1).round(0)#Rounding to 0 decimals

    # Build the dataframe based on the inputs
    df_res = pd.DataFrame(data={'REV_SIM': rev_sim})
    results_df=results_df.append(df_res)# assign it back 

1.08 s ± 36.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

My solution

%%timeit -n10
N  = 1000
rev_sim = np.random.normal(avg, std_dev, N).round(0)
result_df = pd.DataFrame({'REV_SIM':rev_sim})

748 µs ± 153 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

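If you really do need to generate the entries in a loop, it is better to collect them in a list first and then build the DataFrame from that list in one step: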

%%timeit -n10
rev_sim = [np.random.normal(avg, std_dev, 1).round(0) for i in range(1000)]
result_df = pd.DataFrame({'REV_SIM':rev_sim})

6.55 ms ± 888 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

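The list-based loop is roughly 8.8 times slower than the vectorized version I suggested (6.55 ms vs. 748 µs), while Wen-Ben's solution is about 1444 times slower (1.08 s vs. 748 µs).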

Pandas can get very slow when a DataFrame is built up row by row inside a loop.
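One detail of the list-based loop is worth knowing: `np.random.normal(avg, std_dev, 1)` returns a one-element array, so the list holds arrays rather than plain numbers, and pandas typically stores those as an object column. A minimal sketch that draws a scalar per iteration instead (calling `np.random.normal` without a size argument returns a single float); the name `values` is illustrative, and this variant was not part of the timings above:

```python
import numpy as np
import pandas as pd

avg = 109
std_dev = 12

values = []  # hypothetical name: plain floats, one per simulation
for i in range(1000):
    # No size argument, so this is a single float rather than a 1-element array
    draw = np.random.normal(avg, std_dev)
    values.append(round(draw, 0))  # rounding to 0 decimals, still a float

# Build the DataFrame once, after the loop has finished
result_df = pd.DataFrame({'REV_SIM': values})
```

Because the list contains floats, the resulting column is numeric and works directly with methods such as `mean()` or `describe()`.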