Question

I have the following dummy dataset with 1000 observations:

obs <- 1000

df <- data.frame(
  a=c(1,0,0,0,0,1,0,0,0,0),
  b=c(0,1,0,0,0,0,1,0,0,0),
  c=c(0,0,1,0,0,0,0,1,0,0),
  d=c(0,0,0,1,0,0,0,0,1,0),
  e=c(0,0,0,0,1,0,0,0,0,1),
  f=c(10,2,4,5,2,2,1,2,1,4),
  g=sample(c("yes", "no"), obs, replace = TRUE),
  h=sample(letters[1:15], obs, replace = TRUE),
  i=sample(c("VF","FD", "VD"), obs, replace = TRUE),
  j=sample(1:10, obs, replace = TRUE)
)

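A key feature of this dataset is that, in every row, exactly one of the variables a to e is 1 and the rest are 0. We are certain that only one of these five columns has the value 1 in any given row.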
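The "exactly one 1 per row" property can be checked directly before splitting the data. The following is a minimal sketch, assuming only the five indicator columns are rebuilt with rep() to mimic the recycling in the data.frame() call above; the names ind, onehot, row_totals and counts are illustrative and not part of the original post.

```r
obs <- 1000

# Rebuild only the indicator columns; each 5-value pattern repeats down the rows
ind <- data.frame(
  a = rep(c(1, 0, 0, 0, 0), length.out = obs),
  b = rep(c(0, 1, 0, 0, 0), length.out = obs),
  c = rep(c(0, 0, 1, 0, 0), length.out = obs),
  d = rep(c(0, 0, 0, 1, 0), length.out = obs),
  e = rep(c(0, 0, 0, 0, 1), length.out = obs)
)

onehot <- c("a", "b", "c", "d", "e")

# Every row should sum to exactly 1 across the indicator columns
row_totals <- rowSums(ind[, onehot])
stopifnot(all(row_totals == 1))

# Number of rows that fall into each group
counts <- colSums(ind[, onehot])
print(counts)
```

If the check fails on real data, the per-column subsets would overlap or miss rows, so it is worth running before extracting them.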

I found a way to extract these rows given a condition (the rows containing a 1) and assign them to their own variables:

df.a <- df[df[,"a"] == 1,,drop=FALSE]
df.b <- df[df[,"b"] == 1,,drop=FALSE]
df.c <- df[df[,"c"] == 1,,drop=FALSE]
df.d <- df[df[,"d"] == 1,,drop=FALSE]
df.e <- df[df[,"e"] == 1,,drop=FALSE]

My problem now is limiting the number of rows kept in each of df.a through df.e and then merging them.

Answer 1

To get a subset of the first n rows, a simple data[1:n,] is all you need.

df.a.sub <- df.a[1:10,]
df.b.sub <- df.b[1:10,]
df.c.sub <- df.c[1:10,]
df.d.sub <- df.d[1:10,]
df.e.sub <- df.e[1:10,]
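One caveat with 1:n indexing: if a data frame has fewer than n rows, the result is padded with rows of NA. A minimal sketch on a made-up three-row data frame (small, by_index and by_head are illustrative names) compares it with head(), which stops at the last available row:

```r
# Toy data frame with only 3 rows
small <- data.frame(x = 1:3, y = c("p", "q", "r"))
n <- 5

# Indexing past the end pads the result with NA rows
by_index <- small[1:n, ]

# head() returns at most n rows and never pads
by_head <- head(small, n)

print(nrow(by_index))
print(nrow(by_head))
```

With the 1000-row dataset from the question each subset has well over 10 rows, so both forms give the same result there.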

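Finally, merge them. I spent most of my time looking for a simple way to merge multiple data frames, and all I needed was rbind.fill(df1, df2, ..., dfn), thanks to this question and answer:
```
# rbind.fill comes from the plyr package
library(plyr)
df.merged <- rbind.fill(df.a.sub, df.b.sub, df.c.sub, df.d.sub, df.e.sub)
```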

Answer 2

Here is a shorter way to create df.merged:

# variables of 'df'
vars <- c("a", "b", "c", "d", "e")

# number of rows to extract
n <- 100

df.merged <- do.call(rbind, lapply(vars, function(x) {
  head(df[as.logical(df[[x]]), ], n)
}))

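Here, rbind is sufficient because all the pieces have the same columns. If your data frames differ in their columns, you have to use rbind.fill from the plyr package instead.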
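To illustrate the difference, here is a minimal base R sketch on two made-up data frames whose columns differ. The helper fill_bind is hypothetical and is not plyr's implementation; it only mimics the idea of filling missing columns with NA before binding.

```r
# Two toy data frames that share column 'a' only
df1 <- data.frame(a = 1:2, b = c("x", "y"))
df2 <- data.frame(a = 3:4, c = c(TRUE, FALSE))

# Base rbind() stops with an error when the column names do not match
rbind_result <- tryCatch(rbind(df1, df2), error = function(e) "error")

# Hypothetical helper: add each missing column as NA, align order, then rbind
fill_bind <- function(...) {
  dfs <- list(...)
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    for (col in setdiff(all_cols, names(d))) d[[col]] <- NA
    d[all_cols]
  })
  do.call(rbind, padded)
}

combined <- fill_bind(df1, df2)
print(combined)
```

For the subsets in this question every piece comes from the same df, so the columns always match and plain rbind is the simpler choice.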

Merging rows of a data frame subject to a condition and a row limit
