Question

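Sorry for the vague title, but I don't know how else to describe this. I am working with a data frame that has many variables.

Two of them are of interest to me: one is a factor with two levels (e.g. green, red), and the other is a continuous numeric vector (e.g. pesticide concentration).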

   AppleColour PesticidesConcentration
1        green                    1.45
2          red                    3.50
3        green                    1.56
4          red                   54.30
5          red                   53.20
6          red                   53.40
7        green                    2.50
8        green                    6.70
9          red                   32.05
10       green                   34.27

I would like to count the green apples (1) when the pesticide concentration is > 4 but < 50, and (2) when it is > 20.

df1 <- structure(list(AppleColour = structure(c(1L, 2L, 1L, 2L, 2L, 
2L, 1L, 1L, 2L, 1L), .Label = c("green", "red"), class = "factor"), 
    PesticidesConcentration = c(1.45, 3.5, 1.56, 54.3, 53.2, 
    53.4, 2.5, 6.7, 32.05, 34.27)), .Names = c("AppleColour", 
"PesticidesConcentration"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10"))

Answer 1

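We can build a logical vector with ==, &, > and <, and then take the sum of the TRUE values. The !is.na() term makes rows with a missing concentration count as FALSE rather than turning the sum into NA.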

# (1) green apples with a concentration above 4 and below 50
with(df1, sum(AppleColour == "green" &
              PesticidesConcentration > 4 &
              PesticidesConcentration < 50 &
              !is.na(PesticidesConcentration)))

# (2) green apples with a concentration above 20
with(df1, sum(AppleColour == "green" &
              PesticidesConcentration > 20 &
              !is.na(PesticidesConcentration)))
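
If the same kind of count is needed for several colours or thresholds, the logical-vector approach can be wrapped in a small function. The following is only a sketch: `count_in_range` and its `lower`/`upper` arguments are hypothetical names introduced here for illustration, not part of the original answer, and the data frame is rebuilt with `data.frame()` so the block runs on its own.

```r
# Example data from the question, rebuilt so this block is self-contained
df1 <- data.frame(
  AppleColour = factor(c("green", "red", "green", "red", "red",
                         "red", "green", "green", "red", "green")),
  PesticidesConcentration = c(1.45, 3.5, 1.56, 54.3, 53.2,
                              53.4, 2.5, 6.7, 32.05, 34.27)
)

# Hypothetical helper (an assumption, not from the original answer):
# counts rows of a given colour whose concentration lies strictly
# between lower and upper. Both bounds are exclusive.
count_in_range <- function(data, colour, lower = -Inf, upper = Inf) {
  x <- data$PesticidesConcentration
  hits <- data$AppleColour == colour & x > lower & x < upper
  # na.rm = TRUE drops rows where the comparison is NA (missing concentration)
  sum(hits, na.rm = TRUE)
}

count_in_range(df1, "green", lower = 4, upper = 50)  # 2
count_in_range(df1, "green", lower = 20)             # 1
```

Using `sum(..., na.rm = TRUE)` here plays the same role as the `!is.na()` term above: rows with a missing concentration are simply not counted.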

Count the number of rows associated with a factor level, subject to limits on the value of another variable
