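R range references with mismatched dimensions do not cause an error

Time: 2015-06-25 00:26:12

Tags: r dataframe

When using range references, I would normally expect an error, or at least a warning, when the index inside '[]' does not match the dimensions of the parent object, but I have just discovered that I get neither. Is there a setting for this, or a way to force an error? For example: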

x = 1:5
y = 10:12
x[y>10]
y[x>2]

The same applies to data frames and other R objects:

dat = data.frame(x=runif(100),y=1:100)
dat[sample(c(TRUE,FALSE),23,replace=TRUE),c(TRUE,FALSE)]

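The silent recycling and truncation of an index to match the parent object's dimensions is unexpected; in years of using R, I had never noticed it before.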
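To make the behaviour concrete: a logical index shorter than the object is recycled to the object's length, and a longer one yields NA for the positions past the end. The sketch below reproduces both cases from the example using base R only; `rep_len()` is used purely to spell out the recycling and is not part of the original post.

```r
x <- 1:5
y <- 10:12

# Shorter logical index: recycled to length(x) before selection
short_idx <- y > 10                         # FALSE TRUE TRUE
recycled  <- rep_len(short_idx, length(x))  # FALSE TRUE TRUE FALSE TRUE
identical(x[short_idx], x[recycled])        # TRUE

# Longer logical index: TRUE positions past the end of y select NA
long_idx <- x > 2                           # FALSE FALSE TRUE TRUE TRUE
y[long_idx]                                 # 12 NA NA
```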

I am using R Console (64-bit) 3.0.1 for Windows (I could update, but I hope that is not the cause).

Edit: Fixed the data.frame example, because data.frame does not allow more column references than there are columns. Thanks, zero323.

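2 answers:

Answer 0: (score: 2)

You can modify the `[.data.frame` function so that it emits a warning when a data frame is indexed with a logical vector whose length does not evenly divide the number of rows: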

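`[.data.frame` <- function(x, i, j, drop) {
  # Only check matrix-style indexing (df[i, ] or df[i, j]), not list-style df[i]
  matrix_style <- (nargs() - !missing(drop)) > 2
  if (matrix_style && !missing(i) && is.logical(i) && nrow(x) %% length(i) != 0) {
    warning("Indexing data frame with logical vector that doesn't evenly divide number of rows")
  }
  # Re-issue the same call against the base method so that empty arguments and the
  # default for `drop` are resolved there (the index expression is evaluated again)
  cl <- sys.call()
  cl[[1]] <- quote(base::`[.data.frame`)
  eval(cl, parent.frame())
}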

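Here is a demonstration on the 150-row iris dataset, passing logical index vectors of length 11 (which should trigger a warning) and 15 (which should not):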

iris[c(rep(FALSE, 10), TRUE),]
#     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 11           5.4         3.7          1.5         0.2     setosa
# 22           5.1         3.7          1.5         0.4     setosa
# 33           5.2         4.1          1.5         0.1     setosa
# 44           5.0         3.5          1.6         0.6     setosa
# 55           6.5         2.8          4.6         1.5 versicolor
# 66           6.7         3.1          4.4         1.4 versicolor
# 77           6.8         2.8          4.8         1.4 versicolor
# 88           6.3         2.3          4.4         1.3 versicolor
# 99           5.1         2.5          3.0         1.1 versicolor
# 110          7.2         3.6          6.1         2.5  virginica
# 121          6.9         3.2          5.7         2.3  virginica
# 132          7.9         3.8          6.4         2.0  virginica
# 143          5.8         2.7          5.1         1.9  virginica
# Warning message:
# In `[.data.frame`(iris, c(rep(FALSE, 10), TRUE), ) :
#   Indexing data frame with logical vector that doesn't evenly divide number of rows

iris[c(rep(FALSE, 14), TRUE),]
#     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 15           5.8         4.0          1.2         0.2     setosa
# 30           4.7         3.2          1.6         0.2     setosa
# 45           5.1         3.8          1.9         0.4     setosa
# 60           5.2         2.7          3.9         1.4 versicolor
# 75           6.4         2.9          4.3         1.3 versicolor
# 90           5.5         2.5          4.0         1.3 versicolor
# 105          6.5         3.0          5.8         2.2  virginica
# 120          6.0         2.2          5.0         1.5  virginica
# 135          6.1         2.6          5.6         1.4  virginica
# 150          5.9         3.0          5.1         1.8  virginica
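
If overriding `[.data.frame` feels too invasive, the same check can live in an ordinary function that raises an error instead of a warning. This is only a sketch: `rows_strict()` is a hypothetical name, not a base R function, and it is stricter than the warning above in that it requires a logical index to have exactly one element per row.

```r
# Hypothetical helper: select rows, refusing logical indices of the wrong length
rows_strict <- function(df, i) {
  stopifnot(is.data.frame(df))
  if (is.logical(i) && length(i) != nrow(df)) {
    stop("logical row index has length ", length(i),
         " but the data frame has ", nrow(df), " rows")
  }
  df[i, , drop = FALSE]
}

# Same length as nrow(iris): rows are selected as usual
ok <- rows_strict(iris, iris$Sepal.Length > 7)

# Length 11 against 150 rows: the call stops; the message is captured here
bad <- tryCatch(rows_strict(iris, c(rep(FALSE, 10), TRUE)),
                error = function(e) conditionMessage(e))
```

Numeric and character indices pass through unchanged, so the helper only affects the logical case discussed here.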

Answer 1: (score: 0)

Extending @josilber's answer, I wrote the following for subsetting atomic vectors and matrices, in case anyone else wants it:

# Atomic vectors: warn when the logical index is longer than x or does not divide its length
`[` <- function(x, i) {
    if (!missing(i) && is.logical(i) && (length(x) %% length(i) != 0 || length(i) > length(x))) {
        warning("Indexing atomic vector with logical vector that doesn't evenly divide row count")
    }
    base::`[`(x, i)
}

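# Matrices: defining this replaces the vector version above, since both are named `[`
`[` <- function(x, i, j, ..., drop = TRUE) {
    if (!missing(i) && is.logical(i) && nrow(x) %% length(i) != 0) {
        warning("Indexing matrix with logical vector that doesn't evenly divide row count")
    }
    # Column indices are checked against ncol(x), not nrow(x)
    if (!missing(j) && is.logical(j) && ncol(x) %% length(j) != 0) {
        warning("Indexing matrix with logical vector that doesn't evenly divide column count")
    }
    # drop must be passed by name, otherwise it is treated as another index
    base::`[`(x, i, j, ..., drop = drop)
}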
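
Since both definitions bind the same name, only the one defined last is in effect. One way to cover vectors and matrices together is a single wrapper that counts the index arguments and forwards the untouched call to the base function. The following is a sketch under assumptions, not the answer author's code: the helper name `warn_uneven()` and the warning text are made up for illustration, and the index expression is evaluated twice (once for the check, once by the forwarded call), so it is unsuitable for indices with side effects or randomness.

```r
# Hypothetical helper: warn when a logical index does not fit the extent it indexes
warn_uneven <- function(index, extent, what) {
  if (is.logical(index) && length(index) > 0 &&
      (length(index) > extent || extent %% length(index) != 0)) {
    warning("Logical index of length ", length(index),
            " does not evenly divide ", what, " (", extent, ")", call. = FALSE)
  }
}

`[` <- function(x, i, j, ..., drop = TRUE) {
  # Number of index slots in the call, counting empty ones such as x[, j]
  n_index <- nargs() - 1L - !missing(drop)
  if (n_index == 1L) {
    if (!missing(i)) warn_uneven(i, length(x), "length")        # x[i]
  } else if (n_index == 2L && length(dim(x)) == 2L) {
    if (!missing(i)) warn_uneven(i, nrow(x), "row count")       # x[i, ]
    if (!missing(j)) warn_uneven(j, ncol(x), "column count")    # x[, j]
  }
  # Forward the original call unchanged so empty arguments and `drop` keep their meaning
  cl <- sys.call()
  cl[[1L]] <- quote(base::`[`)
  eval(cl, parent.frame())
}

x <- 1:5
m <- matrix(1:6, nrow = 2)
x[c(FALSE, TRUE, TRUE)]   # warns: 3 does not divide 5
m[, c(TRUE, FALSE)]       # warns: 2 does not divide 3 columns
m[m > 3]                  # no warning: one index element per cell
```

The mask only affects code that looks up `[` in the environment where it was defined; `rm("[")` removes it again.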

Testing my original examples with this modification now produces warnings, and other operations behave as normal:

> x = 1:5
> y = 10:12
> x[y>10]
[1] 2 3 5
Warning message:
In x[y > 10] :
  Indexing atomic vector with logical vector that doesn't evenly divide row count
> y[x>2]
[1] 12 NA NA
Warning message:
In y[x > 2] :
  Indexing atomic vector with logical vector that doesn't evenly divide row count
> x[x>2]
[1] 3 4 5
> x[1:2]
[1] 1 2
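
The question also asks how to force an error rather than a warning. With a warning-raising wrapper in place, one option is R's standard `warn` option: `options(warn = 2)` makes R turn warnings into errors. The sketch below uses a cut-down stand-in for the vector wrapper so that it runs on its own; the warning text is illustrative only.

```r
# Stand-in for the masking wrapper (vector case only)
`[` <- function(x, i) {
  if (!missing(i) && is.logical(i) && length(x) %% length(i) != 0) {
    warning("uneven logical index")
  }
  base::`[`(x, i)
}

x <- 1:5
old <- options(warn = 2)   # from here on, warnings are raised as errors
outcome <- tryCatch(x[c(FALSE, TRUE, TRUE)], error = function(e) "error")
options(old)               # restore the previous warning level

rm("[")                    # drop the mask; base `[` is used again
after <- x[c(FALSE, TRUE, TRUE)]   # silent recycling is back
```

Note that `warn = 2` is global: while it is set, every warning in the session becomes an error, not only the ones from the wrapper.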