How to count values in specific columns for specific rows in R

Time: 2017-08-14 11:59:46

Tags: r

My data looks like this:

[image of the data table]

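I want to count the total number of passes for each position, and likewise the total number of failures for each position, in R. How can I do this?

Here is the output of dput(head(data1)). PS: fail == fail lens_bubble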

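2 Answers:

Answer 0: (score: 1)

This kind of analysis is much easier when the data is in a tidy format. You can read more about what that means here.

First, let's create an example dataset.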

library(tidyverse)
set.seed(327)

dta <- tibble(
  tool_number = 200:212,
  Err = sample(c("pass", "fail"), size = 13, replace = TRUE),

  pos01 = sample(c(0, 1), size = 13, replace = TRUE),
  pos02 = sample(c(0, 1), size = 13, replace = TRUE),
  pos03 = sample(c(0, 1), size = 13, replace = TRUE),
  pos04 = sample(c(0, 1), size = 13, replace = TRUE),
  pos05 = sample(c(0, 1), size = 13, replace = TRUE),
  pos06 = sample(c(0, 1), size = 13, replace = TRUE),
  pos07 = sample(c(0, 1), size = 13, replace = TRUE),
  pos08 = sample(c(0, 1), size = 13, replace = TRUE),
  pos09 = sample(c(0, 1), size = 13, replace = TRUE),
  pos10 = sample(c(0, 1), size = 13, replace = TRUE),
  date = sample(seq(
    as.Date("2017-01-01"), as.Date("2017-12-31"), by = "day"
  ), 13)
)

Next, use the gather() function to reshape the data into a tidy layout, with one row per tool and position.

dta <- gather(dta, key = position, value = value, pos01:pos10)
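For readers on a newer tidyr (1.0.0 or later, which is an assumption about your setup), the same reshaping step can be sketched with pivot_longer(), which supersedes gather(). The small tibble dta_small below is a hypothetical stand-in, not the data from the question.

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in for dta: two position columns instead of ten.
dta_small <- tibble(
  tool_number = 200:203,
  Err   = c("pass", "fail", "pass", "fail"),
  pos01 = c(1, 0, 1, 1),
  pos02 = c(0, 1, 1, 0)
)

# One row per (tool, position) pair; the column names move into `position`
# and the 0/1 flags move into `value`.
long <- pivot_longer(
  dta_small,
  cols      = pos01:pos02,
  names_to  = "position",
  values_to = "value"
)

long
```

The result has the same columns as the gather() call above (tool_number, Err, position, value), so the later group_by() and summarise() steps should work unchanged.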

Now you can use the group_by() and summarise() functions to find the number of passes and failures for each position.

dta %>% 
  group_by(Err, position) %>% 
  summarise(count = sum(value))

# # A tibble: 20 x 3
# # Groups:   Err [?]
#     Err position count
#   <chr>    <chr> <dbl>
# 1  fail    pos01     2
# 2  fail    pos02     1
# 3  fail    pos03     3
# 4  fail    pos04     0

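If you want the result to look more like the data you started with, you can spread() it so that fail and pass become separate columns.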

dta %>% 
  group_by(Err, position) %>% 
  summarise(count = sum(value)) %>% 
  spread(key = Err, value = count)

# # A tibble: 10 x 3
#   position  fail  pass
# *    <chr> <dbl> <dbl>
# 1    pos01     2     4
# 2    pos02     1     2
# 3    pos03     3     4
# 4    pos04     0     5
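As a rough sketch of the same summarise-then-widen step on newer package versions (assuming tidyr 1.0.0+ for pivot_wider() and dplyr 1.0.0+ for the .groups argument), something like the following should give one row per position with fail and pass columns. The tibble long is made-up illustration data in the long layout, not the asker's data.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data, shaped like dta after the reshaping step.
long <- tibble(
  Err      = c("pass", "pass", "fail", "fail", "pass", "fail"),
  position = c("pos01", "pos02", "pos01", "pos02", "pos01", "pos01"),
  value    = c(1, 0, 1, 1, 1, 0)
)

wide <- long %>%
  group_by(Err, position) %>%
  # Sum the 0/1 flags within each Err/position group, then drop the grouping.
  summarise(count = sum(value), .groups = "drop") %>%
  # Turn each Err value into its own column.
  pivot_wider(names_from = Err, values_from = count)

wide
```

If some Err/position combination never occurs in the data, pivot_wider() leaves an NA in that cell; passing values_fill = 0 is one way to get a zero instead.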

Answer 1: (score: 0)

I agree with Andrew about tidy data, but a base R solution would be

# Total passes per position
sapply(data1[, 3:12], function(x) sum(x[data1$Err == "pass"]))

# Total failures per position
sapply(data1[, 3:12], function(x) sum(x[data1$Err == "fail"]))
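If you would rather get both totals in one call, base R's rowsum() might be worth a look: it sums each column within the groups given by a grouping vector. This is only a sketch. The data1 below is a hypothetical stand-in with three position columns, and selecting columns by the "pos" name prefix is an assumption about how the real columns are named.

```r
# Hypothetical stand-in for data1: tool number, Err, then 0/1 position flags.
data1 <- data.frame(
  tool_number = 200:205,
  Err   = c("pass", "fail", "pass", "pass", "fail", "pass"),
  pos01 = c(1, 0, 1, 0, 1, 1),
  pos02 = c(0, 0, 1, 1, 1, 0),
  pos03 = c(1, 1, 0, 0, 0, 1)
)

# Pick the position columns by name rather than by index (assumed naming).
pos_cols <- grep("^pos", names(data1), value = TRUE)

# One row per Err value ("fail", "pass"), one column per position.
counts <- rowsum(data1[, pos_cols], group = data1$Err)
counts

# The same pass totals via the sapply() approach, for comparison.
pass_totals <- sapply(data1[, pos_cols], function(x) sum(x[data1$Err == "pass"]))
pass_totals
```

Selecting by name avoids depending on the position columns sitting at indices 3 to 12, which can break silently if a column is added or reordered.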