How to replace a percentage of the data in a matrix with a new value

Date: 2019-04-19 19:11:34

Tags: r

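I need to replace 5% of the values in a matrix with the number 0.2 (guess) if they are less than 0.2; values greater than 0.2 should be left alone.

Right now my code changes every value below 0.2 to 0.2.

This will eventually sit inside a larger loop that runs many replications, but for now I am only trying to get it to work once.

Explanation: gen.probs.2PLM is a matrix of probabilities. guess is the value I chose to substitute for the others. perc is the percentage of the matrix I want to look at; a value in that portion may be changed if it is below guess.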

gen.probs.2PLM <- icc.2pl(gen.theta,a,b,N,TL)

perc<-0.05*N

guess<-0.2

gen.probs.2PLM[gen.probs.2PLM < guess] <- guess

I want to look at only 5% of the values and change them to 0.2 if they are below 0.2.

gen.probs.2PLM is a 1000 x 45 matrix.

# dput(gen.probs.2PLM[1:20, 1:5])
structure(c(0.940298707380962, 0.848432615784556, 0.927423909103331, 
0.850853479678874, 0.857217846940203, 0.437981231531586, 0.876146933879543, 
0.735970164547576, 0.76296469377238, 0.640645338681073, 0.980212105400924, 
0.45164925578322, 0.890102475061895, 0.593094353657132, 0.837401449711248, 
0.867436194744775, 0.753637051722629, 0.64254277457268, 0.947783594375454, 
0.956791049998361, 0.966059152820211, 0.896715435704569, 0.957247808046098, 
0.898712615329071, 0.903924224222216, 0.474561641407715, 0.919080521405463, 
0.795919510255144, 0.821437921281395, 0.700141602452725, 0.990657455188518, 
0.490423165094245, 0.92990761183835, 0.649494291971471, 0.887513826127176, 
0.912171225584296, 0.812707696992244, 0.702126169775785, 0.971012049724468, 
0.976789027046465, 0.905046450670641, 0.81322870291296, 0.890539069545935, 
0.81539882951241, 0.821148949083641, 0.494459368656066, 0.838675666691869, 
0.719720365120414, 0.741166345529595, 0.646700411799437, 0.9578080044146, 
0.504938867664858, 0.852068230044858, 0.611124165649146, 0.803451686558428, 
0.830526582119632, 0.73370297276145, 0.648126933954648, 0.913887754151632, 
0.925022099584059, 0.875712266966582, 0.762677615526032, 0.857390771477182, 
0.765270669721981, 0.772159371696644, 0.418524844618452, 0.793318641931831, 
0.65437308255825, 0.678633290218262, 0.574232080921638, 0.943851827968259, 
0.428780249640693, 0.809653131485398, 0.536512513508941, 0.751041035436293, 
0.783450103818893, 0.6701523432789, 0.575762279897951, 0.886965071394186, 
0.901230746880145, 0.868181123535613, 0.688344765218149, 0.840795870494126, 
0.69262216320168, 0.703982665712434, 0.215843106547112, 0.738775789107177, 
0.513997187757334, 0.551803060188986, 0.397460216626274, 0.956693337996693, 
0.225901690507801, 0.765409027208693, 0.347791079152411, 0.669156131912199, 
0.72257632593578, 0.538474414984722, 0.399549159711904, 0.884405290470079, 
0.904200878248468), .Dim = c(20L, 5L))

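1 Answer:

Answer 0: (score: 1)

Here is a function you can apply to a numeric matrix to replace 5% of the values that fall below some threshold (e.g. .2) with the threshold itself: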

replace_5pct <- function(d, threshold=.2){
  # get indices of cells below threshold, sample 5% of them 
  cells_below <- which(d < threshold)
  cells_to_modify <- sample(cells_below, size=.05*length(cells_below))
  # then replace values for sampled indices with threshold + return 
  d[cells_to_modify] <- threshold
  return(d)
}

Here is an example of how to use it (where dat corresponds to your matrix):

dat <- matrix(round(runif(1000), 1), ncol=10)
dat_5pct_replaced <- replace_5pct(dat, threshold=.2)

You can inspect the data to confirm the result, or look at summary statistics like these:

mean(dat < .2)                # somewhere between .1 and .2 probably
sum(dat != dat_5pct_replaced) # a count: about 5% of sum(dat < .2)
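
To make that check repeatable, one could fix the random seed and compare the counts directly. The sketch below repeats replace_5pct from above so that it runs on its own; the seed value and the names n_below and n_changed are illustrative choices, not part of the original answer.

```r
# Same function as in the answer, repeated so this block is self-contained
replace_5pct <- function(d, threshold = .2) {
  cells_below <- which(d < threshold)
  cells_to_modify <- sample(cells_below, size = .05 * length(cells_below))
  d[cells_to_modify] <- threshold
  return(d)
}

set.seed(42)  # arbitrary seed, only for reproducibility
dat <- matrix(round(runif(1000), 1), ncol = 10)
dat_5pct_replaced <- replace_5pct(dat, threshold = .2)

n_below   <- sum(dat < .2)                  # cells eligible for replacement
n_changed <- sum(dat != dat_5pct_replaced)  # cells actually replaced

# sample() truncates a fractional size, so the two counts should match
n_changed == floor(.05 * n_below)
# every changed cell now holds the threshold
all(dat_5pct_replaced[dat != dat_5pct_replaced] == .2)
# cells at or above the threshold are left untouched
all(dat_5pct_replaced[dat >= .2] == dat[dat >= .2])
```

All three comparisons are expected to evaluate to TRUE, whatever seed is used.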

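P.S.: If you want to generalize the function, you can also abstract out the 5% proportion, so that you can replace, for example, 10% of the values below some threshold. If you want to get fancy, you can also abstract out the "less than" comparison and pass the comparison function as an argument to the main function.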

replace_func <- function(d, func, threshold, prop){
    cells <- which(func(d, threshold))
    cells_to_modify <- sample(cells, size=prop*length(cells))
    d[cells_to_modify] <- threshold
    return(d)
}
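
As a quick sanity check on this generalization, calling replace_func with the backticked `<` operator, a threshold of .2 and prop = .05 should reproduce replace_5pct when both calls start from the same seed, since they then draw the same sample. A sketch, with both functions repeated so the block is self-contained and with arbitrary seed values:

```r
# Both functions copied from the answer so this block runs on its own
replace_5pct <- function(d, threshold = .2) {
  cells_below <- which(d < threshold)
  cells_to_modify <- sample(cells_below, size = .05 * length(cells_below))
  d[cells_to_modify] <- threshold
  return(d)
}

replace_func <- function(d, func, threshold, prop) {
  cells <- which(func(d, threshold))
  cells_to_modify <- sample(cells, size = prop * length(cells))
  d[cells_to_modify] <- threshold
  return(d)
}

set.seed(1)  # arbitrary seed for the example data
dat <- matrix(round(runif(1000), 1), ncol = 10)

# Reset the seed before each call so both draw the same random sample
set.seed(2)
res_specific <- replace_5pct(dat, threshold = .2)
set.seed(2)
res_general <- replace_func(dat, func = `<`, threshold = .2, prop = .05)

identical(res_specific, res_general)  # expected to be TRUE
```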

Then, for example, to replace 10% of the values above .5 with .5:

# (need to backtick infix functions like <, >, etc.) 
replace_func(dat, func=`>`, threshold=.5, prop=.1)
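
The functions above sample from the cells that are already below the threshold. The question could also be read the other way round: pick 5% of all cells at random, then raise only those picked cells that fall below the threshold. A minimal sketch of that reading follows. Note that replace_sampled_below is a hypothetical name, and the uniform random matrix only stands in for the asker's 1000 x 45 gen.probs.2PLM.

```r
# Hypothetical helper: sample `prop` of ALL cells, then replace only the
# sampled cells whose value is below `threshold`
replace_sampled_below <- function(d, threshold = .2, prop = .05) {
  n_sample <- floor(prop * length(d))
  sampled <- sample.int(length(d), size = n_sample)  # indices into the whole matrix
  to_modify <- sampled[d[sampled] < threshold]       # keep only the low cells
  d[to_modify] <- threshold
  d
}

set.seed(123)  # arbitrary seed, only for reproducibility
probs <- matrix(runif(1000 * 45), nrow = 1000)  # stand-in data, not the real probabilities
probs_new <- replace_sampled_below(probs, threshold = .2, prop = .05)

n_sampled <- floor(.05 * length(probs))  # number of cells that were looked at
n_changed <- sum(probs != probs_new)     # number of cells actually raised to .2
n_changed <= n_sampled
```

Under this reading the number of changed cells varies from run to run, because it depends on how many of the sampled cells happen to be below the threshold.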