Optimizing a loop in Python

Time: 2018-01-13 16:02:24

Tags: python pandas loops

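I am new to Python. I am trying to run the loop below and would like to know whether I am doing it the right way, or whether there is a better (faster) way to do it. In short, I want to compute a series of conditional means of a variable y, where the conditions are defined on the x variables. For example, df contains y, x1, x2, x3, x4. The first set of conditions is x1 > x2 and x1 < x3, and the remaining sets apply the same pattern to other combinations of the x columns.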

import pandas as pd
import numpy as np
import itertools

dates = pd.date_range('20130101', periods=100)

df = pd.DataFrame(np.random.randn(100, 10), index=dates,
                  columns=list('ABCDEFGHIJ'))
df['y'] = np.random.randn(100)  # 1-D array, one value per row

cols = list(df)
cols.insert(0, cols.pop(cols.index('y')))
df = df.loc[:, cols]

# names of the x columns (everything except the leading 'y')
xlist = np.asarray(list(df.iloc[:, 1:]))
xlist = pd.DataFrame(xlist, columns=['x'])

# every 3-column combination (x1, x2, x3) of the x columns
xcombo = pd.DataFrame(list(itertools.combinations(xlist['x'], 3)),
                      columns=['x1', 'x2', 'x3'])
xcombo['stat'] = np.nan

for i, row in xcombo.iterrows():
    x1 = row['x1']
    x2 = row['x2']
    x3 = row['x3']
    # select the subset of df meeting the condition x1 > x2 and x1 < x3;
    # combine the boolean masks with & (Python's `and` does not work element-wise)
    dfx = df[(df[x1] > df[x2]) & (df[x1] < df[x3])]
    # store the mean value of y in the corresponding row
    xcombo.loc[i, 'stat'] = dfx['y'].mean()

1 Answer:

Answer 0: (score: 0)

You can use the pandas DataFrame itertuples() method. It is much faster than iteritems() or iterrows().
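As a minimal sketch of what this suggestion could look like for the question's loop: the helper name `conditional_means` and the seeded random data are assumptions made for illustration, not part of the original post. The sketch iterates over the combinations with `itertuples()` and combines the two conditions into a single boolean mask with `&`.

```python
import itertools

import numpy as np
import pandas as pd


def conditional_means(df, ycol="y"):
    """Mean of `ycol` over rows where x1 > x2 and x1 < x3, for every
    3-column combination (x1, x2, x3) of the remaining columns.

    Hypothetical helper written for this example.
    """
    xcols = [c for c in df.columns if c != ycol]
    xcombo = pd.DataFrame(list(itertools.combinations(xcols, 3)),
                          columns=["x1", "x2", "x3"])
    stats = []
    # itertuples() yields lightweight namedtuples rather than
    # building a new Series for each row as iterrows() does
    for row in xcombo.itertuples(index=False):
        # element-wise AND of the two conditions
        mask = (df[row.x1] > df[row.x2]) & (df[row.x1] < df[row.x3])
        stats.append(df.loc[mask, ycol].mean())
    xcombo["stat"] = stats
    return xcombo


# Sample data shaped like the question's (seeded so runs are repeatable)
rng = np.random.default_rng(0)
dates = pd.date_range("20130101", periods=100)
df = pd.DataFrame(rng.standard_normal((100, 10)), index=dates,
                  columns=list("ABCDEFGHIJ"))
df["y"] = rng.standard_normal(100)

result = conditional_means(df)
print(result.head())
```

Collecting the means in a list and assigning the column once avoids writing into the DataFrame cell by cell. In this particular loop most of the time is probably spent on the boolean indexing rather than on the row iteration itself, so the speed-up from itertuples() alone may be modest.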