Faster looping over data (maintaining the order of calculated values in a new vector)

Asked: 2014-12-24 02:17:30

Tags: r

Is there a faster way to do the following calculation without using two loops, while keeping the order of the newly calculated vector?

# apply this function to a data.frame to create a new vector 

std = function(m)
{
    # standardize: subtract the mean and divide by the standard deviation
    (m - mean(m))/sd(m)
}

# create the data frame

x  = seq.Date(from = as.Date("2012-01-01"), to  = as.Date("2014-01-01"), by = "months") - 1
y  =  c("A","B","C","D","E")
z  = rnorm(500)

x1  = sample(x = x, size = 500, replace = TRUE, prob = NULL)
y1  = sample(x = y, size = 500, replace = TRUE, prob = NULL)
z1  = sample(x = z, size = 500, replace = TRUE, prob = NULL)

df = cbind.data.frame(x1,y1,z1)



vec = rep(NA, nrow(df))

# run the calculation

# first loop through the dates in df[,"x1"]
# apply the function std  to each set of values in df[,"y1"] for each date 

for(i in seq_along(x))
{
    idx = df[,"x1"] == x[i]

    for(j in seq_along(y))
    {
        idx2           = df[idx,"y1"] == y[j]
        vec[idx][idx2] = std(df[idx,"z1"][idx2])
    } # end j loop
} # end i loop

I need to keep using the data.frame df and preserve the order of vec so that it can be bound back onto df:
cbind(df,vec)

1 Answer:

Answer 0 (score: 2)

This is exactly what the ave function is for. You can do

df <- cbind.data.frame(x1,y1,z1)
vec <- with(df, ave(z1,x1,y1,FUN=std))

See the ?ave help page for more details.
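
As a quick sanity check, here is a minimal sketch that rebuilds a small version of the data and compares the nested-loop result with the ave result, row by row. The seed and the names vec_loop and vec_ave are assumptions added for illustration and are not part of the original question.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

std <- function(m) (m - mean(m)) / sd(m)

x <- seq.Date(from = as.Date("2012-01-01"), to = as.Date("2014-01-01"), by = "months") - 1
y <- c("A", "B", "C", "D", "E")

df <- data.frame(
  x1 = sample(x, 500, replace = TRUE),
  y1 = sample(y, 500, replace = TRUE),
  z1 = rnorm(500)
)

# nested-loop version, as in the question
vec_loop <- rep(NA_real_, nrow(df))
for (d in seq_along(x)) {
  idx <- df$x1 == x[d]
  for (g in y) {
    idx2 <- df[idx, "y1"] == g
    vec_loop[idx][idx2] <- std(df[idx, "z1"][idx2])
  }
}

# ave version: std is applied within each (x1, y1) group and the
# results are written back to the original row positions
vec_ave <- with(df, ave(z1, x1, y1, FUN = std))

# compare the two vectors; NA positions must coincide as well
all.equal(vec_loop, vec_ave)
```

Groups that contain a single row come out as NA in both versions, because sd of one value is NA. The result of ave has one element per row of df, so cbind(df, vec_ave) lines up with the original rows.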