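Sorry for the vague title, but I don't know how to describe this. I am working with a data frame that has many variables.
Basically, two of them interest me: one is a factor vector with two levels (e.g. green, red), and the other is a continuous numeric vector (e.g. pesticide concentration).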
   AppleColour PesticidesConcentration
1        green                    1.45
2          red                    3.50
3        green                    1.56
4          red                   54.30
5          red                   53.20
6          red                   53.40
7        green                    2.50
8        green                    6.70
9          red                   32.05
10       green                   34.27
I would like to count the number of green apples 1) when the pesticide concentration is > 4 but < 50, and 2) when it is > 20.
df1 <- structure(list(AppleColour = structure(c(1L, 2L, 1L, 2L, 2L,
2L, 1L, 1L, 2L, 1L), .Label = c("green", "red"), class = "factor"),
PesticidesConcentration = c(1.45, 3.5, 1.56, 54.3, 53.2,
53.4, 2.5, 6.7, 32.05, 34.27)), .Names = c("AppleColour",
"PesticidesConcentration"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
Answer 0 (score: 2)
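We can use ==, &, > and < to create a logical vector, then take the sum of its TRUE values. The !is.na() term makes rows with a missing concentration count as FALSE, so the sum does not turn into NA.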
# Green apples with a concentration strictly between 4 and 50
with(df1, sum(AppleColour == "green" &
    PesticidesConcentration > 4 & PesticidesConcentration < 50 &
    !is.na(PesticidesConcentration)))
# Green apples with a concentration above 20
with(df1, sum(AppleColour == "green" &
    PesticidesConcentration > 20 &
    !is.na(PesticidesConcentration)))
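If the same kind of count is needed for several colours or ranges, the logical-vector idea can be wrapped in a small function. The following is only a sketch: count_colour and its lower/upper arguments are names made up for illustration, not part of the question or of base R, and the data frame is rebuilt with data.frame() so the block runs on its own.

```r
# Rebuild the example data from the question
df1 <- data.frame(
  AppleColour = factor(c("green", "red", "green", "red", "red",
                         "red", "green", "green", "red", "green")),
  PesticidesConcentration = c(1.45, 3.5, 1.56, 54.3, 53.2,
                              53.4, 2.5, 6.7, 32.05, 34.27)
)

# Hypothetical helper: count rows of one colour whose concentration lies
# strictly between `lower` and `upper` (both bounds are exclusive)
count_colour <- function(data, colour, lower = -Inf, upper = Inf) {
  x <- data$PesticidesConcentration
  keep <- data$AppleColour == colour & x > lower & x < upper
  # na.rm = TRUE drops rows where the comparison is NA (missing values)
  sum(keep, na.rm = TRUE)
}

n_green_4_50 <- count_colour(df1, "green", lower = 4, upper = 50)
n_green_gt20 <- count_colour(df1, "green", lower = 20)
```

With the sample data, n_green_4_50 is 2 (rows 8 and 10) and n_green_gt20 is 1 (row 10). Using na.rm = TRUE here plays the same role as the !is.na() term in the answer above.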