Vectorizing a loop in pandas

Date: 2018-04-27 16:51:56

Tags: pandas numpy

I have been trying to vectorize the following, without any luck.

Consider two DataFrames. One is built from a list of dates:

import pandas as pd

cols = ['col1', 'col2']
index = pd.date_range('1/1/15', '8/31/18')
df = pd.DataFrame(columns=cols)

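What I am currently doing is looping over the dates and, for each one, counting the matching rows in my main (large) DataFrame, df_main: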

for x in range(len(index)):
    # count the rows of df_main dated on or before this date
    active = len(df_main[df_main.n_date <= index[x]])
    temp_arr = [index[x], active]
    # note: DataFrame.append was removed in pandas 2.0
    df = df.append(pd.Series(temp_arr, index=cols), ignore_index=True)

Is there a way to vectorize the above?

1 Answer:

Answer 0: (score: 0)

Something like the following:

# initializing
mycols = ['col1', 'col2']
myindex = pd.date_range('1/1/15', '8/31/18')
mydf = pd.DataFrame(columns=mycols)

# create df_main (holding each of myindex's dates minus 10 days)
df_main = pd.DataFrame(data=myindex - pd.Timedelta(days=10), columns=['n_date'])

# wrap a DataFrame around a list comprehension
mydf = pd.DataFrame([[x, len(df_main[df_main['n_date'] <= x])] for x in myindex], columns=mycols)
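Note that the list comprehension above still runs a Python-level loop and rescans df_main once per date, so it is more compact than the original loop but not truly vectorized. One possible way to avoid the loop is to sort n_date once and let np.searchsorted count the entries on or before each date. This is only a sketch, not the answerer's method: it assumes n_date contains no missing values (NaT), and the sample df_main is the same made-up data as in the answer.

```python
import numpy as np
import pandas as pd

# Sample data (an assumption, mirroring the answer's df_main)
index = pd.date_range('1/1/15', '8/31/18')
df_main = pd.DataFrame({'n_date': index - pd.Timedelta(days=10)})

# Sort the dates once; searchsorted requires a sorted array
sorted_dates = np.sort(df_main['n_date'].to_numpy())

# side='right' returns, for each date in index, how many
# entries of sorted_dates are <= that date
active = np.searchsorted(sorted_dates, index.to_numpy(), side='right')

df = pd.DataFrame({'col1': index, 'col2': active})
```

The sort costs O(n log n) once, and each lookup is then a binary search, whereas the loop compares every date against the whole of df_main.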