Find the highest value, then replace the lower values with zero

Time: 2019-02-18 23:24:05

Tags: r

I have a data frame:

dframe <- structure(list(label = c("col1", "aim"), text1 = c(0, 0.00900990099009901), rwr = c(0, 0), ff = c(0, 0.0120792079207921), ff = c(0, 0.0204950495049505), dfdv = c(0, 0), wef = c(0, 0), cv = c(0, 0.588019801980198), vvf = c(0, 0), dsf = c(0, 0.0134653465346535), dfd = c(0, 0.0134653465346535), dfdsc = c(0, 0.0226732673267327), cxvd = c(0, 0.0226732673267327), icu = c(-0.4290625, 0.361831683168317), vcx = c(-0.0684375, 0.105693069306931), asd = c(0, 0.0864851485148515), dsa = c(-0.480625, 0.676287128712871), sd = c(0, 0), dfde = c(0, 0), dcfvdc = c(0, 0), ccdd = c(0, 0), fvcdc = c(0, 0.0169306930693069), vdf = c(0, 0), vdf = c(0, 0), fdv = c(0, 0), vdfv = c(0, 0), fvvr = c(-0.333125, 1.41455445544554), fev = c(0, 0), fverf = c(0, 0), vfd = c(0, 0.0361881188118812), fev = c(0, 0), wtfpl = c(0, 0), erfe = c(0, 0)), row.names = c(NA, -2L), class = "data.frame")

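I want to compare the two values in each column and keep the one with the larger absolute value, setting the other to zero. For example, given -0.4 and 0.3, keep -0.4. My attempt below does not work in some cases. Why does this happen?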

cbind(dframe[, 1], dframe[, -1] * apply(dframe[, -1], 1, function(x) x == max(x)))

Expected output:

dframe_ex <- structure(list(label = c("col1", "aim"), text1 = c(0, 0.00900990099009901), rwr = c(0, 0), ff = c(0, 0.0120792079207921), ff = c(0, 0.0204950495049505), dfdv = c(0, 0), wef = c(0, 0), cv = c(0, 0.588019801980198), vvf = c(0, 0), dsf = c(0, 0.0134653465346535), dfd = c(0, 0.0134653465346535), dfdsc = c(0, 0.0226732673267327), cxvd = c(0, 0.0226732673267327), icu = c(-0.4290625, 0), vcx = c(0, 0.105693069306931), asd = c(0, 0.0864851485148515), dsa = c(0, 0.676287128712871), sd = c(0, 0), dfde = c(0, 0), dcfvdc = c(0, 0), ccdd = c(0, 0), fvcdc = c(0, 0.0169306930693069), vdf = c(0, 0), vdf = c(0, 0), fdv = c(0, 0), vdfv = c(0, 0), fvvr = c(0, 1.41455445544554), fev = c(0, 0), fverf = c(0, 0), vfd = c(0, 0.0361881188118812), fev = c(0, 0), wtfpl = c(0, 0), erfe = c(0, 0)), row.names = c(NA, -2L), class = "data.frame")
> dframe_ex
  label         text1 rwr           ff           ff dfdv wef          cv vvf          dsf          dfd        dfdsc
1  col1 0.00000000000   0 0.0000000000 0.0000000000    0   0 0.000000000   0 0.0000000000 0.0000000000 0.0000000000
2   aim 0.00900990099   0 0.0120792079 0.0204950495    0   0 0.588019802   0 0.0134653465 0.0134653465 0.0226732673
          cxvd        icu         vcx          asd         dsa sd dfde dcfvdc ccdd        fvcdc vdf vdf fdv vdfv       fvvr
1 0.0000000000 -0.4290625 0.000000000 0.0000000000 0.000000000  0    0      0    0 0.0000000000   0   0   0    0 0.00000000
2 0.0226732673  0.0000000 0.105693069 0.0864851485 0.676287129  0    0      0    0 0.0169306931   0   0   0    0 1.41455446
  fev fverf          vfd fev wtfpl erfe
1   0     0 0.0000000000   0     0    0
2   0     0 0.0361881188   0     0    0

1 answer:

Answer 0 (score: 1)

Problems with your approach:

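  1. You say you want to do the comparison for each column, but apply with MARGIN = 1 is applied to each row. We'll change that to MARGIN = 2 to go over the columns.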

  2. You mention "absolute value" in your text, but you don't use abs() in your code. We'll add that.

  3. With MARGIN = 1, apply returns an n x 2 matrix, which does not line up with the 2 x n data frame. With MARGIN = 2 the result is already 2 x n, so it can be multiplied with dframe[, -1] directly and no transpose is needed.

This brings us to:

cbind(dframe[, 1], dframe[, -1] * apply(dframe[, -1], 2, function(x) abs(x) == max(abs(x))))
#   dframe[, 1]       text1 rwr         ff       ff.1 dfdv wef        cv vvf        dsf        dfd      dfdsc
# 1        col1 0.000000000   0 0.00000000 0.00000000    0   0 0.0000000   0 0.00000000 0.00000000 0.00000000
# 2         aim 0.009009901   0 0.01207921 0.02049505    0   0 0.5880198   0 0.01346535 0.01346535 0.02267327
#         cxvd        icu       vcx        asd       dsa sd dfde dcfvdc ccdd      fvcdc vdf vdf.1 fdv vdfv
# 1 0.00000000 -0.4290625 0.0000000 0.00000000 0.0000000  0    0      0    0 0.00000000   0     0   0    0
# 2 0.02267327  0.0000000 0.1056931 0.08648515 0.6762871  0    0      0    0 0.01693069   0     0   0    0
#       fvvr fev fverf        vfd fev.1 wtfpl erfe
# 1 0.000000   0     0 0.00000000     0     0    0
# 2 1.414554   0     0 0.03618812     0     0    0
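
The shape of what apply returns is easy to get wrong, so it can help to check it on a small example. The following is a minimal sketch on made-up data: toy, its columns x, y, z, and the helper is_abs_max are hypothetical names, not part of the question.

```r
# Hypothetical toy data, not the data frame from the question
toy <- data.frame(x = c(-4, 3), y = c(1, 2), z = c(0, 0))

# TRUE where an element has the largest absolute value in its vector
is_abs_max <- function(v) abs(v) == max(abs(v))

mask_cols <- apply(toy, 2, is_abs_max)  # one call per column
mask_rows <- apply(toy, 1, is_abs_max)  # one call per row

dim(toy)        # 2 rows, 3 columns
dim(mask_cols)  # 2 x 3: same shape as toy
dim(mask_rows)  # 3 x 2: transposed relative to toy

# The column-wise mask lines up with the data, so element-wise
# multiplication zeroes the smaller entry in each column
result <- toy * mask_cols
result
```

When a data frame is multiplied by a matrix of a different shape, R does not raise an error as long as the number of elements matches; the mask values are simply recycled in column-major order, which silently applies them to the wrong cells. Comparing dim() of the mask with dim() of the data is a cheap way to catch this.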

I would probably suggest a different approach: write a small utility function and apply it to each column:

non_max_to_zero = function(x) {
    x[abs(x) != max(abs(x))] = 0
    return(x)
}
dframe[, -1] = lapply(dframe[, -1], non_max_to_zero)

This produces the expected output from the question.
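
To see how the helper treats signs and ties, here is a short sketch. The function body is repeated from the answer so the snippet runs on its own; the toy data frame and its column names are hypothetical.

```r
# Same helper as in the answer, repeated so this snippet is self-contained
non_max_to_zero <- function(x) {
    x[abs(x) != max(abs(x))] <- 0
    x
}

non_max_to_zero(c(-0.4, 0.3))  # the negative value has the larger magnitude and is kept
non_max_to_zero(c(0.2, 0.2))   # ties: both values are kept
non_max_to_zero(c(0, 0))       # an all-zero column is unchanged

# Hypothetical toy data frame with a label column first, as in the question
toy <- data.frame(label = c("r1", "r2"), x = c(-4, 3), y = c(1, 2))
toy[, -1] <- lapply(toy[, -1], non_max_to_zero)
toy
```

Because lapply works on one column at a time and returns a list of columns, there is no mask matrix whose orientation could be mismatched, and the label column is left untouched.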