R - Optimizing an objective function (lots of matrix operations)

Time: 2010-10-05 16:50:07

Tags: optimization r

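This is the cost function for a genetic optimization (the most important part), so it needs to be very fast. Right now it is slow even for toy problem sizes. I'm fairly sure there are faster ways to do many of the operations in this code, but I'm not good at tuning R code. Does anyone have any suggestions?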

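Fb, Ft, and Fi are scalar constants. tpat is a large, constant two-dimensional matrix. jpat is a smaller matrix that is being optimized. nrow(tpat) == nrow(jpat) and ncol(tpat) %% ncol(jpat) == 0 are invariants. All entries in tpat and jpat are real numbers in [0,1].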

# Toy jamming model for genetic optimization.
#
# A jamming pattern is a vector of real numbers \in [0,1], interpreted
# as a matrix with subcarrier frequency bands in the rows and time
# slots in the columns.  Every time slot, the jammer transmits
# Gaussian noise centered on the subcarrier frequency (todo: should
# that be Gaussian baseband noise *modulated* onto the subcarrier
# frequency?) with intensity equal to the number in the appropriate
# matrix cell (0 = off, 1 = maximum power).
#
# A transmission pattern is similar, but there are many more time
# slots; the jamming pattern is repeated horizontally to cover the
# complete transmission pattern.  (todo: implement jamming duty cycles.)
#
# The transmitter is required to transmit complete packets of some
# fixed length equal to several time slots, and it uses a fixed
# intensity Itr < 1 for each packet (we assume that the jammer is in
# between the transmitter and receiver, so its effective power at the
# receiver is higher).

Itr <- 0.75;
Fb  <- 0.1;
Ft  <- 0.1;
Fi  <- 0.5;

Nb  <- 100;
Sj  <- 30;
St  <- Sj * 20;

# success metric
pkt.matrix <- function(tpat) {
  # Find all the packets in tpat.  A packet is a contiguous sequence
  # of timeslots during which the transmitter was active on at least
  # one frequency band.  Returns a logical matrix with
  # nrow=(number of packets), ncol=(total timeslots), in which each
  # row will select one packet from the original matrix.
  runs <- rle(ifelse(apply(tpat, 2, sum) > 0, TRUE, FALSE));
  pkt  <- matrix(FALSE, nrow=sum(runs$values == TRUE),
                 ncol=sum(runs$lengths));
  i <- 1
  j <- 1
  for (r in 1:length(runs$lengths)) {
    if (runs$values[r]) {
      pkt[i, j:(runs$lengths[r]+j-1)] <- TRUE;
      i <- i + 1;
    }
    j <- j + runs$lengths[r];
  }
  return(pkt);
}

success.metric <- function(jpat, tpat) {
  if (ncol(tpat) %% ncol(jpat)) stop("non-conformable arrays");
  if (ncol(tpat) > ncol(jpat))
    # there must be a better way to do this...
    jpat <- do.call(cbind, rep(alist(jpat), ncol(tpat)/ncol(jpat)));

  pktm <- pkt.matrix(tpat);
  pkts <- nrow(pktm);
  jammed <- 0;
  for (i in 1:pkts) {
    # drop=FALSE keeps a one-slot packet as a matrix rather than a vector
    pkt <- tpat[,pktm[i,],drop=FALSE];
    jam <- jpat[,pktm[i,],drop=FALSE];

    # jamming on a channel not being used by the transmitter at the time
    # is totally ineffective
    jam[pkt==0] <- 0;

    # at least Ft of the time slots used by `pkt` must have had at least
    # one channel jammed
    if (sum(apply(jam, 2, sum) > 0) < Ft * ncol(pkt)) next;

    # at least Fb of the frequency bands used by `pkt` must have been
    # jammed at least once
    if (sum(apply(jam, 1, sum) > 0) < Fb * nrow(pkt)) next;

    # the total intensity produced by the jammer must be at least Fi the
    # total intensity produced by the source
    if (sum(jam) < Fi * sum(pkt)) next;

    jammed <- jammed + 1;
  }
  return((pkts - jammed) * 100 / pkts);
}

# some `tpat` examples; `jpat` is generated by genoud()
## saturation transmission: on for 19, off for 1
sat.base <- c(rep(Itr, 19), 0);
### single constant subcarrier
sat.scs <- matrix(0, nrow=Nb, ncol=St);
sat.scs[Nb/2,] <- sat.base;

### FHSS with an incredibly foolish hopping pattern
sat.fhss <- matrix(0, nrow=Nb, ncol=St);
# razzum frazzum 1-based arrays
sat.fhss[((col(sat.fhss) - 1) %% nrow(sat.fhss)) + 1 == 
         row(sat.fhss)] <- sat.base;
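
On the "there must be a better way to do this" comment in success.metric: one possible alternative, sketched here under the stated invariant that ncol(tpat) is a multiple of ncol(jpat), is to let matrix() recycle jpat's values column by column. The toy sizes and the name n.slots are hypothetical and only stand in for the real data.

```r
# Hypothetical toy sizes, chosen only for illustration.
jpat    <- matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), nrow = 2)  # 2 bands x 3 slots
n.slots <- 9                                  # stands in for ncol(tpat)

# Approach from the question: cbind() repeated copies of jpat.
tiled.cbind <- do.call(cbind, rep(list(jpat), n.slots / ncol(jpat)))

# Alternative: matrix() fills column by column and recycles its data,
# so the columns of jpat repeat from left to right.
tiled.recycle <- matrix(jpat, nrow = nrow(jpat), ncol = n.slots)

# Both approaches should produce the same tiled pattern.
stopifnot(identical(tiled.cbind, tiled.recycle))
```

Whether this is measurably faster than the do.call(cbind, ...) version would need to be timed on the real problem sizes.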

3 Answers:

Answer 0 (score: 2)

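Lots of loops, lots of scanning over arrays, very few statistical functions... I would rewrite it in C.

Keep the slow R version around for checking, and rewrite it in C. Make sure your R and C versions give the same values on a test data set.

Oh, but profile everything first to make sure that this is the part that is slow. It certainly looks like a prime candidate.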
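
As a rough sketch of that workflow (profile first, then check a rewrite against the slow reference), the following uses base R's Rprof() and summaryRprof(). The functions slow.total and fast.total are hypothetical stand-ins: fast.total plays the role that a C rewrite would play, and neither appears in the question.

```r
# Hypothetical slow reference: explicit double loop over the matrix.
slow.total <- function(m) {
  s <- 0
  for (i in seq_len(nrow(m)))
    for (j in seq_len(ncol(m)))
      s <- s + m[i, j]
  s
}
# Hypothetical fast rewrite (stands in for a C implementation).
fast.total <- function(m) sum(m)

set.seed(1)
m <- matrix(runif(100 * 600), nrow = 100)

# 1. Profile the slow version for about one second.
prof.file <- tempfile()
Rprof(prof.file)
t0 <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - t0 < 1) slow.total(m)
Rprof(NULL)
prof <- summaryRprof(prof.file)
print(head(prof$by.self))   # functions ranked by time spent in them

# 2. Check that the rewrite agrees with the reference on test data.
stopifnot(isTRUE(all.equal(slow.total(m), fast.total(m))))
```

all.equal() is used rather than == because the two versions may add the numbers in a different order and so differ by floating-point rounding.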

Answer 1 (score: 2)

One thing that would help is to replace:

runs <- rle(ifelse(apply(tpat, 2, sum) > 0, TRUE, FALSE))  # replace this
runs <- rle(colSums(tpat) > 0)  # with this

And in general, replace apply(foo, 2, sum) with colSums(foo), and apply(foo, 1, sum) with rowSums(foo).
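
A small check of that equivalence, on made-up data (the matrix foo below is random and only illustrative), might look like this. It also shows the shape the two threshold tests in success.metric would take after the substitution.

```r
set.seed(42)
foo <- matrix(runif(100 * 600), nrow = 100)   # illustrative data only
foo[, 1:10] <- 0                              # ten completely idle time slots

# apply(..., sum) and colSums()/rowSums() should agree.
stopifnot(isTRUE(all.equal(apply(foo, 2, sum), colSums(foo))))
stopifnot(isTRUE(all.equal(apply(foo, 1, sum), rowSums(foo))))

# The comparison already yields a logical vector, so no ifelse() is needed.
slots.hit <- sum(colSums(foo) > 0)   # time slots with any activity
bands.hit <- sum(rowSums(foo) > 0)   # frequency bands with any activity
```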

Edit: here is an updated version of pkt.matrix. Nothing fancy, but it is considerably faster.

pkt.matrix <- function(tpat) {
  runs <- rle(colSums(tpat) > 0);
  pkt  <- matrix(FALSE, nrow=sum(runs$values),
                 ncol=sum(runs$lengths));

  endpts <- cumsum(runs$lengths)[runs$values]
  begpts <- endpts-runs$lengths[runs$values]+1

  for(i in seq_len(NROW(pkt))) {  # seq_len() is safe when there are no packets
    #pkt[i,seq(begpts[i],endpts[i])] <- TRUE
    pkt[i,begpts[i]:endpts[i]] <- TRUE  # eyjo's suggestion
  }

  return(pkt);
}

> # Times on my machine:
> # Original
> system.time( for(i in 1:1e4) pktm <- pkt.matrix(sat.fhss) )
   user  system elapsed 
  68.21    0.23   68.50
> # Updated
> system.time( for(i in 1:1e4) pktm <- pkt.matrix(sat.fhss) )
   user  system elapsed 
   4.28    0.00    4.28
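
If the remaining for loop still matters, one further option is to build the whole logical matrix with two outer() comparisons. This is only a sketch: the name pkt.matrix.outer and the toy matrix are hypothetical, and it has not been timed against the versions above.

```r
pkt.matrix.outer <- function(tpat) {
  runs   <- rle(colSums(tpat) > 0)
  endpts <- cumsum(runs$lengths)[runs$values]          # last slot of each packet
  begpts <- endpts - runs$lengths[runs$values] + 1     # first slot of each packet
  slots  <- seq_len(ncol(tpat))
  # Cell [p, s] is TRUE when slot s lies within packet p's [begin, end] range.
  outer(begpts, slots, "<=") & outer(endpts, slots, ">=")
}

# Toy pattern: 2 bands x 6 slots, with slot 3 idle on both bands.
toy <- matrix(0, nrow = 2, ncol = 6)
toy[1, c(1, 2, 4)] <- 0.75
toy[2, c(5, 6)]    <- 0.75
pktm <- pkt.matrix.outer(toy)
print(pktm)
```

Note that outer() allocates two intermediate matrices of size (packets x slots), so for very long transmission patterns the loop version may use less memory.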

Answer 2 (score: 1)

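Maybe post a question about your pkt.matrix function on its own (it looks like R code that could be improved). That is probably something you can provide toy sample data and a simple description for. In fact, as far as I can tell, you would be much better off if this were a list. Do you really want every row to have the same length? If the packets are irregular, just create a list of packets. It is easier and it will run faster.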
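
A minimal sketch of the list idea, assuming a packet is represented simply by the vector of its column indices (the name pkt.list and the toy matrix are hypothetical):

```r
pkt.list <- function(tpat) {
  runs   <- rle(colSums(tpat) > 0)
  endpts <- cumsum(runs$lengths)[runs$values]          # last slot of each packet
  begpts <- endpts - runs$lengths[runs$values] + 1     # first slot of each packet
  # One integer vector of column indices per packet.
  Map(seq, begpts, endpts)
}

# Toy pattern: 2 bands x 6 slots, with slot 3 idle on both bands.
toy <- matrix(0, nrow = 2, ncol = 6)
toy[1, c(1, 2, 4)] <- 0.75
toy[2, c(5, 6)]    <- 0.75

pkts <- pkt.list(toy)
# Total transmitted intensity per packet; drop = FALSE keeps matrix shape.
intensity <- vapply(pkts, function(cols) sum(toy[, cols, drop = FALSE]),
                    numeric(1))
```

Indexing by a short integer vector avoids scanning a logical row of length ncol(tpat) for every packet, which is where the list form could save time.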

Is jam just a vector? If so, then "sum(apply(jam,2,sum) > 0)" is nonsense. It should be sum(jam).
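
Whether jam is a vector depends on the packet length: R drops the matrix dimensions when a single column is selected. The sketch below, on a made-up 2 x 3 matrix, shows the difference and how drop = FALSE keeps the matrix shape so that colSums() and rowSums() still apply.

```r
jam <- matrix(c(0, 0.5, 0.2, 0, 0, 0.9), nrow = 2)   # 2 bands x 3 slots, toy data

# Selecting one time slot drops the dimensions: the result is a plain vector.
one.slot <- jam[, 2]
stopifnot(is.null(dim(one.slot)))

# apply() needs a dim attribute, so it fails on that vector.
res <- try(apply(one.slot, 2, sum), silent = TRUE)
stopifnot(inherits(res, "try-error"))

# With drop = FALSE the selection stays a 2 x 1 matrix.
kept <- jam[, 2, drop = FALSE]
slots.jammed <- sum(colSums(kept) > 0)
bands.jammed <- sum(rowSums(kept) > 0)
```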