在遗传算法的适应度函数中使用嵌套的for循环会使速度太慢

时间:2019-07-14 18:00:50

标签: r genetic-algorithm

我正在尝试通过“ GA”程序包使用遗传算法,但在创建适应度函数时遇到了问题,即使用GA来模拟我的数据并获得模型中最适合常数的值。

我的数据来自对汽车速度和其他参数的观察,所以假设我有一辆汽车,它进行了2次旅行,我想为其建立模型。 每次旅行都有多列(速度,对面汽车的速度差,两辆汽车之间的距离),因此我必须将每次旅行的第一行传递给适应度函数中的方程,然后该方程将生成速度,增量速度和范围的新结果,然后我必须使用新值并生成其他结果,然后将模拟距离与我在数据中观察到的旧范围进行比较,得出的结果是最小的由GA。

首先:这是我的数据。 https://drive.google.com/open?id=1923Jl6pDnQa_tGAluANUfIWCcyf85YVq

第二:这是我的健身功能和GA

Fitness_Function <- function(data, M_Acc, D_Speed, Beta, Com_Acc, Gap_J, D_Time){

        Trips_IDs <- sort(unique(data$FileName))
        # Trip=1;ROW=1
        Calibrated_DF <- data.frame()
        for (Trip in 1:2) {

                Trip_Data <- data%>%filter(FileName==Trips_IDs[Trip])
                attach(Trip_Data, warn.conflicts=F)

                for (ROW in 1:(nrow(Trip_Data)-1)) {
                        if (ROW==1) {
                                speed <- Filling_Speed[1]
                                Delta_V <-  Filling_DeltaVelocity[1]
                                Dist <- Filling_Range[1]
                                # M_Acc = 0.8418 ;D_Speed =29.2 ;Beta = 3.52
                                # Com_Acc = 0.8150 ;Gap_J = 1.554 ;D_Time = 0.878

                                Distance <- speed*D_Time - (speed*Delta_V)/(2*sqrt(M_Acc*Com_Acc))
                                if (Distance < 0 ) {
                                        Distance <- 0
                                }
                                D_Gap <- Gap_J + Distance
                                Acceleration <- M_Acc*(1-(speed/D_Speed)^Beta-(D_Gap/Dist)^2)
                        }else{
                                speed <- speed_C
                                Delta_V <- Delta_V_C
                                Dist <- Dist_c
                                Distance <- speed*D_Time - (speed*Delta_V)/(2*sqrt(M_Acc*Com_Acc))
                                if (is.na(Distance)) {

                                }
                                Distance = 0
                                if (Distance < 0 ) {
                                        Distance <- 0
                                }
                                D_Gap <- Gap_J + Distance
                                Acceleration <- M_Acc*(1-(speed/D_Speed)^Beta-(D_Gap/Dist)^2)

                        }
                        Lead_Veh_Speed_F <- Filling_Speed[ROW+1]+Filling_DeltaVelocity[ROW+1]
                        speed_C <- speed + Acceleration*0.1 
                        Delta_V_C <- Lead_Veh_Speed_F-speed_C
                        Dist_c <- Dist+(Delta_V_C+Delta_V)/2*0.1
                        Calibrated_DF <- rbind(Calibrated_DF,c(Dist_c,ROW+1,Trips_IDs[Trip],Trip_Data$Filling_Range[ROW+1]))
                }
                detach(Trip_Data)
        }
        colnames(Calibrated_DF) <- c("C_Distance","row","Trip","Actual_Distance")
        Calibrated_DF$Dif <- (Calibrated_DF$C_Distance-Calibrated_DF$Actual_Distance)^2

        RMSPE <- sqrt(sum(Calibrated_DF$Dif)/sum(Calibrated_DF$Actual_Distance^2))


        return(RMSPE)
        # return(Calibrated_DF)
}
GA_Test <- ga(type='real-valued', lower=c( 0.1 , 1 , 1 , 0.1 , 0.1 , 0.1 ),  
              upper=c( 5 , 40 , 40 , 5 , 10 , 5 ), popSize=300, maxiter=300,run = 100,
              keepBest=T, names = c("M_Acc", "D_Speed", "Beta", "Com_Acc", "Gap_J", "D_Time"),
              fitness = function(b) -Fitness_Function(data, b[1],b[2], b[3],b[4],b[5],b[6]))

我的问题是:代码非常大,而且即使进行一次迭代也很慢,我尝试使用dplyr而不是使用for循环,但是用dplyr不可能做到这一点,因为我必须计算距离,然后是加速度,然后是速度,然后为其他行再次计算它们,而我找不到用dplyr做到这一点的方法。 我将在此处发布使用Dplyr的Beta版代码,但由于我无法完成而无法完成。

所以请帮助。

data <- data%>%group_by(Driver,FileName)%>%
        mutate(Distance_Term = ifelse(row_number()==1,Speed_C*D_Time - (Speed_C*Delta_V_C)/(2*sqrt(M_Acc*Com_Acc)),0))
data <- data%>%mutate(Distance_Term = ifelse(Distance_Term < 0 , 0, Distance_Term))%>%
        mutate(D_Gap = Gap_J + Distance_Term,Acceleration_C = M_Acc*(1-(Speed_C/D_Speed)^Beta-(D_Gap/Distance)^2))

注意:旅程ID中的FileName列也符合我的PC的资格,所以问题不在我的PC上

1 个答案:

答案 0 :(得分:0)

我在accumulate2中使用purrr函数更改了for循环,因此它更快,更高效,我从这个问题Calculate variables using equations then use the generated values to generate new one得到了这个答案

Objective_Function <- function(data, M_Acc, D_Speed, Beta, Com_Acc, Gap_J, D_Time){

                myfun <- function(list, lcs,lcs2){
                        ds <- lcs - list[[1]]
                        Distance <- list[[1]]*D_Time - (list[[1]] * ds) / (2*sqrt(M_Acc*Com_Acc))
                        if (Distance < 0|is.na(Distance)) {Distance <- 0}
                        gap <-  Gap_J + Distance
                        acc <- M_Acc * (1 - (list[[1]] / D_Speed)^Beta - (gap / list[[2]])^2)
                        fcs_new <- list[[1]] + acc * 0.1
                        ds_new <- lcs2- fcs_new
                        di_new <- list[[2]]+(ds_new+ds)/2*0.1
                        return(list(Speed = fcs_new,Distance = di_new))

                } 

                Generated_Data <- data %>%group_by(Driver,FileName)%>%
                        mutate(Speed_Distance_Calibrated = accumulate2( .init = list(Filling_Speed[1],
                                                                                     Filling_Range[1]),.x =  Lead_Veh_Speed_F,.y = Lead_Veh_Speed_F2, myfun)[-1])%>%ungroup()
                Generated_Data <- Generated_Data %>% group_by(Driver,FileName)%>% 
                        mutate(Speed_Distance_Calibrated = append(list(list(Speed = Filling_Speed[1],Distance = Filling_Range[1])),Speed_Distance_Calibrated[-length(Speed_Distance_Calibrated)]))%>%ungroup()

                Dif <- map_df(Generated_Data$Speed_Distance_Calibrated, `[`, 2)
                Generated_Data <- Generated_Data %>% mutate(Dif_sq = (Dif$Distance - Generated_Data$Filling_Range)^2)
                RMSPE <- sqrt(sum(Generated_Data$Dif_sq)/sum(Generated_Data$Filling_Range^2))


                return(RMSPE)


        }
            GA_Test <- ga(type='real-valued', lower=c( 0.1 , 1 , 1 , 0.1 , 0.1 , 0.1 ),  
                          upper=c( 5 , 40 , 40 , 5 , 10 , 5 ), popSize=300, maxiter=300,run = 100,
                          keepBest=T, names = c("M_Acc", "D_Speed", "Beta", "Com_Acc", "Gap_J", "D_Time"),
                          fitness = function(b) -Objective_Function(data, b[1],b[2], b[3],b[4],b[5],b[6]),parallel = TRUE)
            Summary <- summary(GA_Test)