Create a True/False column in R based on count thresholds in a df

Asked: 2016-03-09 17:33:13

Tags: r if-statement dataframe

I have a data frame like this:

ID <- c("G110","G110","G110","G110","G110","G160","G160","G160",
        "G180","G180","G180","G180","G180","G190","G190","G190")
Measurement <- c("Length","Length","Length","Breadth","Breadth","Length","Breadth","Length",
                 "Length","Length","Length","Length","Length","Breadth","Breadth","Breadth")
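Category <- c("A","A","A","A","A","B","B","B",
              "C","C","C","C","C","C","C","C")
# Combine the three vectors into the data frame used below
df <- data.frame(ID, Measurement, Category)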

I get the counts by ID, Measurement, and Category:

  library(doBy)
  UniqueCategory <- summaryBy(Category~ID+Measurement+Category, data = df,
                                 FUN = function(x) { c(n = length(x)) } )

  UniqueCategory

        ID Measurement Category Category.n
    1 G110     Breadth        A          2
    2 G110      Length        A          3
    3 G160     Breadth        B          1
    4 G160      Length        B          2
    5 G180      Length        C          5
    6 G190     Breadth        C          3

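Now I have thresholds that I want to apply to these counts in order to create a column in df named Output:

if Category is A and its count > 2, then df$Output is True, else False
if Category is B and its count > 1, then df$Output is True, else False
if Category is C and its count > 4, then df$Output is True, else False

The desired output for df looks like this: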

     ID Measurement Category Output
1  G110      Length        A   True
2  G110      Length        A   True
3  G110      Length        A   True
4  G110     Breadth        A  False
5  G110     Breadth        A  False
6  G160      Length        B   True
7  G160     Breadth        B  False
8  G160      Length        B   True
9  G180      Length        C   True
10 G180      Length        C   True
11 G180      Length        C   True
12 G180      Length        C   True
13 G180      Length        C   True
14 G190     Breadth        C  False
15 G190     Breadth        C  False
16 G190     Breadth        C  False

How can I get this to work? I tried using if statements but could not get it right. Please give me some pointers.

1 answer:

Answer 0 (score: 1)

Here is a solution using dplyr:

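library(dplyr)

# Assign the result back so that df keeps the new columns
df <- df %>%
  group_by(ID, Measurement, Category) %>%
  mutate(Category.n = n()) %>%   # number of rows in each ID/Measurement/Category group
  ungroup() %>%
  # The comparison is already logical, so wrapping it in ifelse(..., TRUE, FALSE) is unnecessary
  mutate(Output = (Category == "A" & Category.n > 2) |
                  (Category == "B" & Category.n > 1) |
                  (Category == "C" & Category.n > 4))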
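
If the thresholds are likely to change, it may be easier to keep them in a named lookup vector than to hard-code each category in the condition. The following is only a minimal sketch of that idea in base R, not part of the original answer. The names `thresholds` and `group_n` are hypothetical and introduced here for illustration, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the example data from the question
ID <- c("G110","G110","G110","G110","G110","G160","G160","G160",
        "G180","G180","G180","G180","G180","G190","G190","G190")
Measurement <- c("Length","Length","Length","Breadth","Breadth","Length","Breadth","Length",
                 "Length","Length","Length","Length","Length","Breadth","Breadth","Breadth")
Category <- c("A","A","A","A","A","B","B","B",
              "C","C","C","C","C","C","C","C")
df <- data.frame(ID, Measurement, Category, stringsAsFactors = FALSE)

# Hypothetical lookup: the count each category has to exceed
thresholds <- c(A = 2, B = 1, C = 4)

# Number of rows in each ID/Measurement/Category group, aligned with the rows of df
group_n <- ave(seq_len(nrow(df)), df$ID, df$Measurement, df$Category, FUN = length)

# TRUE where the group count exceeds the threshold for that row's category;
# as.character() guards against Category being stored as a factor
df$Output <- group_n > unname(thresholds[as.character(df$Category)])

print(df)
```

With this layout, adding a category or changing a limit only means editing `thresholds`. A category missing from `thresholds` would give NA in Output, so that case would need separate handling.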