Question

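I want to iterate over a large data frame: count the values > 0 in the first column and remove the rows that were counted, then move to the second column, count the values > 0 there and remove those rows, and so on.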

The data frame

  taxonomy A B C
1      cat 0 2 0
2      dog 5 1 0
3    horse 3 0 0
4    mouse 0 0 4
5     frog 0 2 4
6     lion 0 0 2

can be generated with

DF1 = structure(list(taxonomy = c("cat", "dog","horse","mouse","frog", "lion"),
                A = c(0L, 5L, 3L, 0L, 0L, 0L), B = c(2L, 1L, 0L, 0L, 2L, 0L), C = c(0L, 0L, 0L, 4L, 4L, 2L)), 
                .Names = c("taxonomy", "A", "B", "C"), 
                row.names = c(NA, -6L), class = "data.frame")

I expect the result to be

A B C
2 2 2

I wrote a loop, but it does not remove the rows, and it gives incorrect results.

Answer 1

You can do the following:

DF = DF1[, -1]
cond = DF != 0
p = max.col(cond, ties="first")
fp = factor(p, levels = seq_along(DF), labels = names(DF))
table(fp)

# A B C 
# 2 2 2

To account for rows that are all zeros, I think this works (apply it before calling table(fp)):

fp[rowSums(cond) == 0] <- NA
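
As a minimal sketch of why the all-zero fix matters: max.col with ties set to "first" returns column 1 for a row with no non-zero entries, so such a row would be counted under A unless it is set to NA. The data below is the question's data frame plus one extra all-zero row ("ant"), which is an assumption added purely for illustration.

```r
# Hypothetical data: DF1 from the question plus an all-zero row ("ant")
DF2 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion", "ant"),
  A = c(0L, 5L, 3L, 0L, 0L, 0L, 0L),
  B = c(2L, 1L, 0L, 0L, 2L, 0L, 0L),
  C = c(0L, 0L, 0L, 4L, 4L, 2L, 0L)
)

vals <- DF2[, -1]
cond <- vals != 0                           # logical matrix: TRUE where non-zero
p <- max.col(cond, ties.method = "first")   # first non-zero column in each row
fp <- factor(p, levels = seq_along(vals), labels = names(vals))

counts_unfixed <- table(fp)                 # "ant" is wrongly counted under A
fp[rowSums(cond) == 0] <- NA                # mark all-zero rows as missing
counts <- table(fp)                         # NA values are excluded by default
print(counts)
```

Note that this approach tests for non-zero values rather than values > 0, so it matches the question only when the data has no negative entries.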

Answer 2

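We can update the dataset on each pass. Create a temporary dataset without the 'taxonomy' column ('tmp'). Initialize a named vector ('n'), loop over the columns of 'tmp', get a logical index for whether the column is greater than 0 ('i1'), take the sum of the TRUE values, store it in 'n' for the corresponding column, and then update 'tmp' by removing those rows, using 'i1' as the row index.

tmp <- DF1[-1]
n <- setNames(numeric(ncol(tmp)), names(tmp))
for(i in seq_len(ncol(tmp))) {
  i1 <- tmp[[i]] > 0
  n[i] <- sum(i1)
  tmp <- tmp[!i1, ]
}
n
# A B C 
# 2 2 2

It can also be done with Reduce:

sapply(Reduce(function(x, y) y[!x] > 0, DF1[3:4], init = DF1[,2] > 0, accumulate = TRUE), sum)
#[1] 2 2 2

Or using accumulate from purrr.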
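
The column-by-column loop described in this answer could be wrapped in a small function, sketched below. The name count_and_drop and the threshold argument are hypothetical and not part of the answer. The sketch adds drop = FALSE so that a single-column input stays a data frame, and it assumes the columns contain no NA values.

```r
# Hypothetical helper: count values above `threshold` column by column,
# removing the counted rows before moving on to the next column.
count_and_drop <- function(df, threshold = 0) {
  n <- setNames(integer(ncol(df)), names(df))
  for (i in seq_along(df)) {
    hit <- df[[i]] > threshold        # rows counted for this column
    n[i] <- sum(hit)
    df <- df[!hit, , drop = FALSE]    # keep only rows not yet counted
  }
  n
}

DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A = c(0L, 5L, 3L, 0L, 0L, 0L),
  B = c(2L, 1L, 0L, 0L, 2L, 0L),
  C = c(0L, 0L, 0L, 4L, 4L, 2L)
)

res <- count_and_drop(DF1[-1])        # drop the 'taxonomy' column first
single <- count_and_drop(DF1["A"])    # also works for a single column
print(res)
```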

Answer 3

This is easy with Reduce and sapply (df here is the data frame from the question, DF1):

> first <- Reduce(function(a,b) b[a==0], df[-1], accumulate=TRUE)
> first
[[1]]
[1] 0 5 3 0 0 0

[[2]]
[1] 2 0 2 0

[[3]]
[1] 0 4 2

> then <- sapply(setNames(first, names(df[-1])), function(x) length(x[x>0]))
> then
A B C 
2 2 2
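
One caveat with the Reduce call above: from the third column on, the logical index a==0 is shorter than the full-length column b, so R recycles it. In the output, [[3]] contains dog's C value (0) even though dog was already removed in the first step. The counts still come out right for this data, but that may not hold in general. A minimal sketch that carries a full-length mask of remaining rows instead is shown below; the names step, keep, and count are hypothetical.

```r
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A = c(0L, 5L, 3L, 0L, 0L, 0L),
  B = c(2L, 1L, 0L, 0L, 2L, 0L),
  C = c(0L, 0L, 0L, 4L, 4L, 2L)
)
vals <- DF1[-1]

# State carried through Reduce:
#   keep  - logical mask of rows not yet removed (always nrow(vals) long)
#   count - number of remaining rows with a value > 0 in the current column
step <- function(state, col) {
  hit <- state$keep & col > 0
  list(keep = state$keep & !hit, count = sum(hit))
}

init <- list(keep = rep(TRUE, nrow(vals)), count = 0L)
states <- Reduce(step, vals, init, accumulate = TRUE)

# states[[1]] is the initial state, so skip it when collecting the counts
counts <- setNames(sapply(states[-1], `[[`, "count"), names(vals))
print(counts)
```

Because the mask always has one entry per original row, no recycling takes place, at the cost of carrying a slightly larger state through Reduce.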

R: loop through a data frame, count values greater than a value, and remove rows

3 answers: