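I want to iterate over a large data frame: count the values > 0 in the first column, remove the rows that were counted, then move to the second column, count the values > 0, remove those rows, and so on.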
The data frame:

  taxonomy A B C
1      cat 0 2 0
2      dog 5 1 0
3    horse 3 0 0
4    mouse 0 0 4
5     frog 0 2 4
6     lion 0 0 2

It can be generated with:

DF1 = structure(list(taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
    A = c(0L, 5L, 3L, 0L, 0L, 0L), B = c(2L, 1L, 0L, 0L, 2L, 0L), C = c(0L, 0L, 0L, 4L, 4L, 2L)),
    .Names = c("taxonomy", "A", "B", "C"),
    row.names = c(NA, -6L), class = "data.frame")

The result I expect is:

A B C
2 2 2

I wrote a loop, but it does not remove the rows and it gives an incorrect result.
Answer 0 (score: 4)
You could do...
DF = DF1[, -1]
cond = DF != 0
p = max.col(cond, ties="first")
fp = factor(p, levels = seq_along(DF), labels = names(DF))
table(fp)
# A B C
# 2 2 2
To account for rows that are all zeros, I think this works:
fp[rowSums(cond) == 0] <- NA
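
Putting the two steps together: max.col with ties "first" returns 1 for a row that has no TRUE at all, so the all-zero rows have to be set to NA before tabulating, otherwise they would be counted under the first column. A minimal sketch, assuming the DF1 from the question (rebuilt here with data.frame so the block runs on its own) and using > 0 as the question asks, rather than != 0:

```r
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A = c(0L, 5L, 3L, 0L, 0L, 0L),
  B = c(2L, 1L, 0L, 0L, 2L, 0L),
  C = c(0L, 0L, 0L, 4L, 4L, 2L)
)

DF <- DF1[, -1]
cond <- DF > 0                              # logical matrix, TRUE where the value is positive
p <- max.col(cond, ties.method = "first")   # column index of the first TRUE in each row
fp <- factor(p, levels = seq_along(DF), labels = names(DF))
fp[rowSums(cond) == 0] <- NA                # rows with no positive value belong to no column
counts <- table(fp)                         # table() leaves out NA by default
counts
```

Each row is assigned to the first column where it has a positive value, which is the column under which the sequential count-and-remove procedure would have counted it.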
Answer 1 (score: 2)
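We can update the dataset on each pass. Create a temporary dataset without the 'taxonomy' column ('tmp'). Initialize a named vector ('n'), loop over the columns of 'tmp', get a logical index for whether each value in the column is greater than 0 ('i1'), take the sum of its TRUE values, update the corresponding element of 'n', and then update 'tmp' by removing those rows, using 'i1' as the row index.

tmp <- DF1[-1]
n <- setNames(numeric(ncol(tmp)), names(tmp))
for(i in seq_len(ncol(tmp))) {
    i1 <- tmp[[i]] > 0
    n[i] <- sum(i1)
    tmp <- tmp[!i1, ]
}
n
# A B C
# 2 2 2

It can also be done with Reduce:

sapply(Reduce(function(x, y) y[!x] > 0, DF1[3:4],
    init = DF1[,2] > 0, accumulate = TRUE ), sum)
#[1] 2 2 2

or with accumulate from purrr.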
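
The loop can be wrapped in a small helper for reuse. This is only a sketch: the function name count_and_drop is hypothetical, and drop = FALSE is an addition beyond the original loop, so that the subset stays a data frame even when a single count column is passed.

```r
# Hypothetical helper: counts values > 0 column by column,
# dropping the counted rows before moving to the next column.
count_and_drop <- function(df) {
  n <- setNames(integer(ncol(df)), names(df))
  for (i in seq_len(ncol(df))) {
    hit <- df[[i]] > 0               # remaining rows with a positive value in column i
    n[i] <- sum(hit)
    df <- df[!hit, , drop = FALSE]   # remove the counted rows
  }
  n
}

DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A = c(0L, 5L, 3L, 0L, 0L, 0L),
  B = c(2L, 1L, 0L, 0L, 2L, 0L),
  C = c(0L, 0L, 0L, 4L, 4L, 2L)
)

n <- count_and_drop(DF1[-1])
n
# A B C
# 2 2 2
```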
Answer 2 (score: 1)
This is easy with Reduce and sapply:

> first <- Reduce(function(a,b) b[a==0], DF1[-1], accumulate=TRUE)
> first
[[1]]
[1] 0 5 3 0 0 0

[[2]]
[1] 2 0 2 0

[[3]]
[1] 0 4 2

> then <- sapply(setNames(first, names(DF1[-1])), function(x) length(x[x>0]))
> then
A B C
2 2 2
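
One caveat with the Reduce-based one-liners above: once rows have been dropped, the mask (a == 0, or !x) is shorter than the next full-length column, so R recycles it. That is why the third element of first is 0 4 2 rather than 4 2. The counts still come out as 2 2 2 on the question's data, but that may not hold in general. The sketch below keeps a full-length mask of the rows already counted instead. DF2 is a made-up variant of the data (dog has C = 7) used only to show the difference.

```r
# DF2 is hypothetical: the question's data, except that dog has C = 7.
DF2 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A = c(0L, 5L, 3L, 0L, 0L, 0L),
  B = c(2L, 1L, 0L, 0L, 2L, 0L),
  C = c(0L, 7L, 0L, 4L, 4L, 2L)
)
cols <- DF2[-1]

# Shrinking-subset version: the length-4 mask is recycled against the length-6 column C.
shrinking <- sapply(Reduce(function(a, b) b[a == 0], cols, accumulate = TRUE),
                    function(x) sum(x > 0))

# Full-length mask: seen[[k + 1]] marks rows with a value > 0 in any of the first k columns.
seen <- Reduce(function(acc, col) acc | col > 0, cols,
               accumulate = TRUE, init = rep(FALSE, nrow(cols)))
counts <- setNames(diff(sapply(seen, sum)), names(cols))

shrinking  # 2 2 3: dog is counted again under C
counts     # A = 2, B = 2, C = 2
```

The difference between consecutive sums of the mask is the number of rows newly counted at each column, which matches the count-then-remove procedure without subsetting anything.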