Question

I have a dataset of 150 numbers, from which I sample 100. How can I identify the remaining 50 (and put them into a new matrix)?

X <- runif(150)
Combined <- sample(X, 100)

Answer 1

Create your sample as a separate vector of indices:

using <- sample(1:150, 100)

Entries <- All.Entries[using]
Non.Entries <- All.Entries[-using]
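
A minimal runnable sketch of the index-based split above. `All.Entries` is not defined in the answer, so here it is assumed to be a numeric vector of 150 values like the asker's `X`; the seed is only there to make the run reproducible.

```r
set.seed(1)                          # assumption: fixed seed for reproducibility only
All.Entries <- runif(150)            # hypothetical data standing in for the asker's X

using <- sample(150, 100)            # indices of the 100 sampled elements

Entries     <- All.Entries[using]    # the sample
Non.Entries <- All.Entries[-using]   # everything not sampled

length(Entries)      # 100
length(Non.Entries)  # 50
```

Because the split is done on positions rather than values, it stays correct even if the data contains repeated values.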

Answer 2

Updated based on your comments.

If `Combined` is a subset of `X`, you can find the elements of `X` that are not in `Combined` with:

    X[ !(X %in% Combined) ]

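`X %in% Combined` gives you a logical vector of the same size as `X`, with value `TRUE` where the element is in `Combined` and `FALSE` where it is not.

As a teaching note: that logical vector can be used as an index. `X[ X %in% Combined ]` gives you all elements of `X` that are in `Combined`.

Since you are looking for the opposite, negate the logical vector, `X[ !(X %in% Combined) ]`, to get all elements of `X` that are not in `Combined`.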
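
A small worked example of the logical indexing described here, using made-up values (not the asker's data) so each intermediate result is easy to read:

```r
X <- c(0.5, 1.2, 3.4, 7.8, 9.1)   # made-up values for illustration
Combined <- c(3.4, 0.5)           # a "sample" taken from X

in_sample <- X %in% Combined      # one TRUE/FALSE per element of X
in_sample       # TRUE FALSE  TRUE FALSE FALSE

X[in_sample]    # 0.5 3.4      (elements of X that are in Combined)
X[!in_sample]   # 1.2 7.8 9.1  (elements of X that are not in Combined)
```

Note that the results come back in the order of `X`, not in the order of `Combined`.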

---

If `X` contains duplicates, you can filter by names instead (assuming, of course, that the names are unique):

    X[ !(names(X) %in% names(Combined)) ]
    # or, if sampling by rows
    X[ !(rownames(X) %in% rownames(Combined)) ]

You can easily assign names:

    names(X) <- 1:length(X)
    # or, for multi-dimensional data
    rownames(X) <- 1:nrow(X)
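
To illustrate why names help when values repeat, here is a hedged sketch with made-up data full of duplicates. It relies on `sample()` keeping the names of the elements it draws, and uses `seq_along(X)` as an equivalent of `1:length(X)`.

```r
set.seed(42)                           # assumption: fixed seed for reproducibility only
X <- sample(10, 150, replace = TRUE)   # values 1..10, so many duplicates
names(X) <- seq_along(X)               # unique names "1" .. "150"

Combined <- sample(X, 100)             # the drawn elements carry their names along

# Filter by name: exactly the 50 elements that were not drawn
Leftover <- X[!(names(X) %in% names(Combined))]
length(Leftover)                       # 50

# Filter by value: drops every copy of any value that was sampled,
# so it usually returns far fewer than 50 elements on data like this
length(X[!(X %in% Combined)])
```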

See also the help documentation for:

    ?"%in%"  # note the quotes
    ?which
    ?match

---

Alternatively, you can sample the indices and exclude them with a negative sign, `mat[-indices,]`, as in the following example:

    # Create a sample matrix of 150 rows, 3 columns
    mat <- matrix(rnorm(450), ncol=3)

    # Take a sampling of indices to the rows
    indices <- sample(nrow(mat), 100, replace=FALSE)

    # Splice the matrix
    mat.included <- mat[indices,]
    mat.leftover <- mat[-indices,]

    # Confirm everything is of proper size
    dim(mat)
    # [1] 150   3
    dim(mat.included)
    # [1] 100   3
    dim(mat.leftover)
    # [1] 50  3


Answer 3

All numbers:

x <- sample(10, 150, TRUE) # as an example

Random sample:

Combined <- sample(x,100)

Remaining numbers:

xs <- sort(x) # sort the values of x
tab <- table(match(Combined, xs))
Remaining <- xs[-unlist(mapply(function(x, y) seq(y, length = x),
                               tab, as.numeric(names(tab))))]

Note that this solution also works if `x` has duplicate values.
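
One way to check that claim is to confirm that the sample and the remainder together reproduce the original multiset of values. The sketch below repeats the steps above with a fixed seed (an assumption, for reproducibility only) and spells out `length.out` in place of the abbreviated `length` argument.

```r
set.seed(7)                          # assumption: fixed seed for reproducibility only
x <- sample(10, 150, TRUE)           # example data with many duplicates
Combined <- sample(x, 100)           # random sample of 100 values

xs  <- sort(x)                       # sorted copy of all values
tab <- table(match(Combined, xs))    # names: first position of each value in xs; counts: copies drawn

# For each sampled value, drop as many consecutive positions from xs as copies were drawn
drop_idx  <- unlist(mapply(function(n, start) seq(start, length.out = n),
                           tab, as.numeric(names(tab))))
Remaining <- xs[-drop_idx]

length(Remaining)                             # 50
identical(sort(c(Combined, Remaining)), xs)   # TRUE if nothing was lost or duplicated
```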
