How to get the second-highest value from loop output in R

Time: 2015-04-05 08:17:17

Tags: r list for-loop nested-function

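Beginner R programmer here. I want to find the person in a data frame who is most compatible with a given person. Compatibility is based on an algorithm that assigns points for certain values in the data frame. I have a data frame called kewl.d00dz that looks like this: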

       name  dream.name birth.state birth.month birth.date major
1   stephen       butch          CO         oct         11  ELEC
2     clark     richard          VA         jan         19  BUAD
3   anthony          bo          NJ         mar         26  BUAD
4      jack     kordell          VA         jul         27  BUAD
5      eric      adrian          ND         jun         17  GEOG
6     tyler     anthony          VA         apr         12  CPSC
7    olivia    isabella          VA         may         29  MATH
8      brad      harvey          HI         aug         21  BUAD
9    hannah     charlie          VA         aug         28  PSYC
10     will      ronald          VA         may         11  BUAD
11     noor         ani          CA         apr         14  BUAD
12 victoria   elizabeth          VA         jan         11  MATH
13 morgan c      lauren          FL         jun         15  BUAD
14 morgan w   elizabeth          VA         feb         21  ARTS
15   helena      helena          VA         apr         26  BIOL
16    amber amber leigh          VA         dec          6  PSCI
17     ekta        kate          VA         apr         14  ARTH
18 caroline     georgia          DC         jun         20  BUAD
19     anna        abby          VA         sep         21  BUAD
20     nate       julio          VA         sep          5  ECON
21  jessica    jeanette          VA         oct          7  BUAD
22   shaina      skylar          VA         sep          2  BUAD
23     ruth        lucy          VA         jan          4  CPSC
24   sohyun    caroline      Seoul          nov         16  PSYC
25    aaron         don          VA         sep          1  ECON
26     alex        axel          VA         sep          6  BIOL
       cell num.bills num.states
1      none         5         41
2     apple         8         14
3     apple         4         14
4     apple        19         10
5     apple         6         19
6   samsung         1         10
7     apple         3          8
8     apple         1         18
9     apple         2         16
10    apple         5         20
11    apple         3         19
12    apple         5         17
13    apple         3         15
14    apple         4         24
15  android         0         18
16    apple         1         12
17    apple         1         19
18    apple         0         22
19    apple         0         27
20  samsung         4         32
21  samsung         5         11
22    apple         0         15
23    apple         7         30
24    apple        10         10
25 motorola         8         18
26      htc         3         20

I need to find the person who is most compatible with whichever person I pass to the function:

source("compatibility.R")
find.most.compatible<-function(x){
  a<-which(kewl.d00dz$name==x)
  x<-as.list(kewl.d00dz[a,])
  pts<-list()
  namez<-list()
  for (i in 1:nrow(kewl.d00dz)){
    y<-as.list(kewl.d00dz[i,])
    pts[i]<-compatibility(x,y)
    namez[i]<-kewl.d00dz[i,"name"]
    names(pts)<-namez
  }
  n<-length(pts)
  (which(pts == sort(pts,partial=n-1)[n-1]))
}

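I want it to return the second-highest value, because if it returned the highest one, the person would simply be most compatible with themselves. But it gives me this error message: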

> find.most.compatible("stephen")
02727312231332325212224261723292219149302611312321
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic
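
For reference, the error can be reproduced in isolation. `pts` is built with `list()`, and `sort()` only accepts atomic vectors, so the call fails before `which()` is ever reached. The sketch below uses three made-up scores (not the real output of `compatibility()`) to show the failure, and shows that the same `sort(..., partial = n - 1)[n - 1]` idiom works once the list is flattened with `unlist()`:

```r
# Hypothetical scores for illustration only; not the real compatibility() output
pts_list <- list(stephen = 20, clark = 9, anthony = 12)

# sort() on a list stops with the same message as in the question;
# tryCatch() captures the message instead of aborting the script
res <- tryCatch(
  sort(pts_list, partial = 2),
  error = function(e) conditionMessage(e)
)
print(res)

# Flattening the list to a numeric vector makes the partial sort work
pts_vec <- unlist(pts_list)
n <- length(pts_vec)
second <- sort(pts_vec, partial = n - 1)[n - 1]
print(second)
```

Note that the "second-highest" idiom still returns the person themselves if someone else ties with their own score, so excluding the person's own row explicitly is usually the safer route.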

Here is the function that gets called inside the function above. I do not want to change this code:

compatibility<-function(x,y){
  #start point bag
  com.points<-0

  #number of bills compatibility points
  com.points<-com.points +(10-abs(as.integer(x[["num.bills"]] - y[["num.bills"]]))) 

  #different number of states compatibility points
  diff.states<-abs(as.integer(x[["num.states"]]-y[["num.states"]]))
  cat(diff.states)
  if(diff.states<5){
    com.points<-com.points+5
  } else if(diff.states<10){
    com.points<-com.points+3
  } else {
    com.points<-com.points
  }
  #birth month compatibility points 
  if(x[["birth.month"]]== "dec"||x[["birth.month"]]== "jan"||x[["birth.month"]]== "feb"){
    season1<-"winter"
  } else if(x[["birth.month"]]== "mar"|| x[["birth.month"]]== "apr" || x[["birth.month"]]== "may"){
    season1<-"spring"
  } else if(x[["birth.month"]]== "jun"||x[["birth.month"]]== "jul"||x[["birth.month"]]== "aug"){
    season1<-"summer"
  } else {
    season1<-"fall"
  }

  if(y[["birth.month"]]== "dec" || y[["birth.month"]]== "jan" || y[["birth.month"]] == "feb"){
    season2<-"winter"
  } else if(y[["birth.month"]]== "mar"||y[["birth.month"]]== "apr"||y[["birth.month"]]== "may"){
    season2<-"spring"
  } else if(y[["birth.month"]]== "jun"||y[["birth.month"]]== "jul"||y[["birth.month"]]== "aug"){
    season2<-"summer"
  } else {
    season2<-"fall"
  }
   if (x[["birth.month"]] == y[["birth.month"]]){
     com.points<-com.points + 3
   } else if(season1==season2){
     com.points<-com.points + 1
   } else {
     com.points<-com.points
   }
  #birth state compatibility points
  if (x[["birth.state"]]==y[["birth.state"]]){
    com.points<-com.points + 1
    } else {
      com.points<-com.points
    }
  #major compatibility points
  if (x[["major"]]==y[["major"]]){
    com.points<-com.points + 4
  } else {
    com.points<-com.points
  }

  #cellular provider compatibility points
  if(x[["cell"]] == y[["cell"]]){
    com.points<-com.points + 2
  } else {
    com.points<-com.points
  }    
return(com.points)
}

Can someone troubleshoot my code without using any special functions such as apply, subset, etc.?

Only which.max and the like are allowed.

1 answer:

Answer 0: (score: 0)

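I haven't tried your whole code, but I can see that you need to modify your loop into something like this; otherwise your function will return on the first iteration.

I commented out the names(pts) line because it can also go outside your loop, once all the items are in.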

pts <- list() # if you actually want a list. You could also do c() for a vector

for (i in 1:nrow(kewl.d00dz)) {
  y <- as.list(kewl.d00dz[i,])
  pts[i] <- compatibility(x,y)
  # names(pts) <- sprintf(kewl.d00dz[i,"name"],1:length(pts))
}

return(pts)
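
Building on the loop above, here is a minimal sketch of how the whole function might look, sticking to a `for` loop and `which.max`. It rests on a few assumptions: `toy.df` and `toy.compat` are small hypothetical stand-ins for `kewl.d00dz` and `compatibility()`, and the data frame and scoring function are passed in as arguments (`df`, `score.fun`) only so the example runs on its own. The scores go into a numeric vector rather than a list, the names are assigned once after the loop, and the person's own score is set to `-Inf` so that `which.max` cannot pick them:

```r
# Toy stand-ins (assumptions): a 4-row data frame and a simplified scorer.
# Neither is the asker's real kewl.d00dz or compatibility().
toy.df <- data.frame(
  name = c("stephen", "clark", "anthony", "jack"),
  major = c("ELEC", "BUAD", "BUAD", "BUAD"),
  num.bills = c(5, 8, 4, 19),
  stringsAsFactors = FALSE
)

toy.compat <- function(x, y) {
  # closer bill counts score higher; same major earns 4 extra points
  pts <- 10 - abs(x[["num.bills"]] - y[["num.bills"]])
  if (x[["major"]] == y[["major"]]) pts <- pts + 4
  pts
}

find.most.compatible <- function(who, df, score.fun) {
  a <- which(df$name == who)
  x <- as.list(df[a, ])
  pts <- numeric(nrow(df))     # atomic numeric vector, not a list
  for (i in 1:nrow(df)) {
    y <- as.list(df[i, ])
    pts[i] <- score.fun(x, y)
  }
  names(pts) <- df$name        # assign names once, after the loop
  pts[a] <- -Inf               # rule out the person themselves
  names(pts)[which.max(pts)]   # name of the best remaining match
}

result <- find.most.compatible("clark", toy.df, toy.compat)
print(result)
```

With the real data, the call would presumably be `find.most.compatible("stephen", kewl.d00dz, compatibility)`. If several people tie for the top score, `which.max` returns only the first of them.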