Applying a function based on an external condition

Date: 2016-01-29 10:52:45

Tags: r if-statement conditional apply

I have a data frame df with random values:

df <- data.frame(x1=runif(20,1,200),x2=runif(20,1,18),x3=runif(20,1,7),x4=runif(20,1,3),x5=runif(20,1,25),x6=runif(20,1,220),x7=runif(20,1,10),x8=runif(20,1,8),x9=runif(20,1,20),x10=runif(20,1,32))
df

           x1        x2       x3       x4        x5         x6       x7       x8        x9       x10
1   43.942462 14.983885 4.267664 2.591210 19.650770  95.710478 8.830253 7.089017  5.341859  3.574852
2  185.965077  8.099796 3.592361 1.953196  8.837645 111.846707 8.180938 3.355258 13.889081 26.878697
3   83.532083  2.782204 3.160955 1.892041 23.216698  80.521986 3.864614 6.799805 17.493065  9.246177
4   48.416861 17.019713 5.182366 2.501890  8.108828 219.419766 4.687034 6.785789  2.525997  7.145447
5   66.766778 11.716819 1.649946 2.136352  2.957554 126.164722 9.980739 1.919323 16.556541  5.447096
6   78.305312 12.148354 6.408544 2.644811 10.362618  53.112153 1.092853 1.360766  6.693875 17.108564
7   64.995759 13.385556 3.375907 1.923173 19.732286 219.780082 4.074889 4.609356  7.098822 25.412262
8  196.463100 17.491693 2.317492 2.573539 24.350820  36.696244 6.277854 6.247473  5.535765 12.121822
9   48.467431 11.659182 4.324854 1.380067 15.269617 102.453557 2.724937 1.481521 14.916894  3.451188
10 134.913063  8.927522 2.637946 1.526043 17.956797  49.671752 5.014152 4.737910  4.241197 28.916885
11 190.841615  2.639374 5.038702 2.806088 15.127840   8.841983 2.155842 7.589245 13.799412 28.025792
12  46.963826 11.212431 4.944327 2.937039 16.410549  25.048928 6.330826 5.006221  2.986566 17.005088
13  97.258821 17.847892 6.202023 2.228292 19.804482 159.922462 2.587568 4.175234  5.360039 15.812061
14 123.439971 15.415940 5.785273 2.075161 11.496406  12.449913 6.484951 7.911373 11.578242 22.398292
15   4.225315 11.775122 6.908108 2.980960 22.768381 109.853774 2.535843 7.293656 13.290552 29.302949
16  49.927327  4.086780 3.941200 1.129892 18.200466 164.281496 6.881178 6.199219  4.091858 29.963647
17 105.716881 12.421335 6.527660 2.767754 22.055987 208.188895 8.125112 7.702927  3.027778 20.080756
18 195.205248  5.749007 6.204989 1.815563  3.875226 200.608675 1.500572 7.116924  1.608354 13.292293
19  27.564433 16.788191 1.648707 2.360290 22.539064 192.914543 1.327605 6.096303  7.105979 22.650040
20 122.620812 11.475314 5.588179 1.884028  3.692936 200.056348 3.248232 1.562624 18.998767 29.424066

and a vector ind with one value for each column of df. The values in ind are indicators for the normalization process.

ind
x1     x2     x3     x4     x5     x6     x7     x8     x9    x10 
0.800  1.000  0.400  0.010  6.000  0.100  0.180  0.006 10.000  1.000

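Now I need to write code that applies a desired function to every value in a column of df if the corresponding value in ind is equal to or higher than some threshold.

For example, if this threshold is 0.8, the affected columns in df would be x1, x2, x5, x9 and x10.

I tried something like apply(df,2,function(x)... but I am not skilled enough to insert the ifelse that is obviously needed.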

3 answers:

Answer 0: (score: 0)

apply(df[,ind>=threshold],2,function(x) {...

should do it.
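
As a minimal sketch of the one-liner above, the following fills in the elided function body with a min-max rescaling. The rescaling, the small three-column data frame, and the names `sel` and `rescale` are assumptions for illustration only; the question does not say which function is wanted.

```r
# Small stand-in for the data frame and indicator vector from the question
df <- data.frame(x1 = c(10, 20, 30), x2 = c(2, 4, 8), x3 = c(1, 5, 3))
ind <- c(x1 = 0.8, x2 = 1.0, x3 = 0.4)
threshold <- 0.8

# Logical vector with one entry per column: TRUE for x1 and x2, FALSE for x3
sel <- ind >= threshold

# Placeholder function (an assumption): rescale a column to the range [0, 1]
rescale <- function(x) (x - min(x)) / (max(x) - min(x))

# drop = FALSE keeps a data frame even when only one column qualifies
res <- apply(df[, sel, drop = FALSE], 2, rescale)
```

Note that `res` is a matrix holding only the selected columns; the columns below the threshold are not part of the result.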

Answer 1: (score: 0)

Simply subset the data frame to select the columns that meet the threshold condition:

df[,ind >= threshold]

[Edited to avoid using apply over dimension 2]
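
One possible way to build on this subset without apply is to run lapply over the selected columns and assign the result back, which leaves the other columns untouched. This is only a sketch: the use of scale as the function, the small example data, and the names `sel` and `out` are assumptions, not part of the answer.

```r
# Small stand-in for the data frame and indicator vector from the question
df <- data.frame(x1 = c(10, 20, 30), x2 = c(2, 4, 8), x3 = c(1, 5, 3))
ind <- c(x1 = 0.8, x2 = 1.0, x3 = 0.4)
threshold <- 0.8

sel <- ind >= threshold
out <- df

# scale() returns a one-column matrix, so as.numeric() flattens it to a vector;
# only the selected columns are overwritten
out[sel] <- lapply(df[sel], function(x) as.numeric(scale(x)))
```

Here `out` has the same columns as `df`, with x1 and x2 standardized and x3 copied unchanged.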

Answer 2: (score: 0)

I think you could take an approach like this:

df <- data.frame(x1=runif(20,1,200),x2=runif(20,1,18),x3=runif(20,1,7),x4=runif(20,1,3),x5=runif(20,1,25),x6=runif(20,1,220),x7=runif(20,1,10),x8=runif(20,1,8),x9=runif(20,1,20),x10=runif(20,1,32))
ind <- c(0.8,1.0,0.4,0.01,6.0,0.1,0.18, 0.006, 10.0,  1.0)
threshold <- 0.8

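# Logical index of the columns whose indicator meets the threshold
index <- ind >= threshold
# Keep only the affected columns
df2 <- df[, index]
# Apply the function (scale here) to the selected columns only
df3 <- apply(df2, 2, scale)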

The normalization function can be chosen freely.
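
If the chosen function also needs the column's own value from ind, or if the per-column if/else mentioned in the question is preferred, a sketch along the following lines might work. The `normalise` function (dividing by the indicator) is purely hypothetical, as are the small example data and the name `df_new`.

```r
# Small stand-in for the data frame and indicator vector from the question
df <- data.frame(x1 = c(10, 20, 30), x2 = c(1, 2, 3), x3 = c(4, 5, 6))
ind <- c(x1 = 0.8, x2 = 1.0, x3 = 0.4)
threshold <- 0.8

# Hypothetical normalization that uses the column's indicator value
normalise <- function(x, i) x / i

# Map walks over columns and indicators in parallel;
# the if/else decides per column whether to transform it or pass it through
res <- Map(function(x, i) if (i >= threshold) normalise(x, i) else x, df, ind)
df_new <- as.data.frame(res)
```

A plain if/else is used rather than ifelse because the condition is a single value per column, not one value per row.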