How to avoid "operations are possible only for numeric, logical or complex types" when computing the top 3 elements in each group

Asked: 2019-07-08 16:54:23

Tags: r matrix

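I have a matrix with 4 columns (AOD_median). I want to find the 3 largest data elements for each year (ranked by AOD) and identify the months associated with those elements. Here is my data: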

date      month year     AOD
1-Mar-00    3   2000    0.226
1-Apr-00    4   2000    0.454
1-May-00    5   2000    0.328
1-Jun-00    6   2000    0.314
1-Jul-00    7   2000    0.354
1-Aug-00    8   2000    0.282
1-Sep-00    9   2000    0.278
1-Oct-00    10  2000    0.183
1-Nov-00    11  2000    0.173
1-Dec-00    12  2000    0.21
1-Jan-01    1   2001    0.171
1-Feb-01    2   2001    0.281
1-Mar-01    3   2001    0.241
1-Apr-01    4   2001    0.269
1-May-01    5   2001    0.292
1-Jun-01    6   2001    0.222
1-Jul-01    7   2001    0.322
1-Aug-01    8   2001    0.268
1-Sep-01    9   2001    0.276
1-Oct-01    10  2001    0.169
1-Nov-01    11  2001    0.16
1-Dec-01    12  2001    0.15

Here is the dput output:

structure(list(X1 = c("1-Mar-00", "1-Apr-00", "1-May-00", "1-Jun-00", 
"1-Jul-00", "1-Aug-00", "1-Sep-00", "1-Oct-00", "1-Nov-00", "1-Dec-00", 
"1-Jan-01", "1-Feb-01", "1-Mar-01", "1-Apr-01", "1-May-01", "1-Jun-01", 
"1-Jul-01", "1-Aug-01", "1-Sep-01", "1-Oct-01", "1-Nov-01", "1-Dec-01"
), X2 = c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 
7, 8, 9, 10, 11, 12), X3 = c(2000, 2000, 2000, 2000, 2000, 2000, 
2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 
2001, 2001, 2001, 2001, 2001), X4 = c(0.226, 0.454, 0.328, 0.314, 
0.354, 0.282, 0.278, 0.183, 0.173, 0.21, 0.171, 0.281, 0.241, 
0.269, 0.292, 0.222, 0.322, 0.268, 0.276, 0.169, 0.16, 0.15)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -22L))

I tried to do this with the following code:

for(i in 2000:2001) {(d <- as.matrix(AOD_median[which(AOD_median[,3]==i),]))&
                     (order_AOD <- d[order(d[,4], decreasing = TRUE)])&
                      print(order_AOD[1:3,2])}

I expected a result like this:

"4" "7" "5" 
"7" "2" "9"

Instead, I get this error:

Error in (d <- as.matrix(AOD_median[which(AOD_median[, 3] == i), ])) &  : 
  operations are possible only for numeric, logical or complex types

1 Answer:

Answer 0: (score: 0)

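The specific error is caused by your use of & to separate statements. That does not work because & is a logical operator in R, so R tries to combine the results of the statements and fails on the character data. Use ; or a newline to separate statements instead.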
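As a minimal sketch of that fix, the loop below puts each statement on its own line. It makes a few assumptions that go beyond the question: the columns are named month, year and AOD (as in the data at the end of this answer, not X1 to X4 from the dput output), the date column is left out for brevity, and the as.matrix() call is dropped so that AOD is ordered as a number rather than as text. The name top_months is only illustrative. The sketch also adds the comma that the original d[order(...)] call would need in order to select whole rows.

```r
# Sample data from the question; column names are assumed (month, year, AOD)
# and the date column is omitted because the loop does not use it.
AOD_median <- data.frame(
  month = c(3:12, 1:12),
  year  = rep(c(2000, 2001), times = c(10, 12)),
  AOD   = c(0.226, 0.454, 0.328, 0.314, 0.354, 0.282, 0.278, 0.183, 0.173, 0.21,
            0.171, 0.281, 0.241, 0.269, 0.292, 0.222, 0.322, 0.268, 0.276, 0.169,
            0.16, 0.15)
)

# `&` is a logical operator, so applying it to character data fails;
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch("a" & TRUE, error = function(e) conditionMessage(e))
print(msg)

top_months <- list()  # hypothetical container for the per-year results
for (i in 2000:2001) {
  # Statements are separated by newlines (";" would also work), not "&".
  d <- AOD_median[AOD_median$year == i, ]
  order_AOD <- d[order(d$AOD, decreasing = TRUE), ]  # trailing comma keeps whole rows
  top_months[[as.character(i)]] <- order_AOD$month[1:3]
  print(order_AOD$month[1:3])
}
# The loop prints:
# [1] 4 7 5
# [1] 7 5 2
```

For 2001 the three largest AOD values fall in months 7, 5 and 2, so the result differs from the "7" "2" "9" written in the question.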

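However, taking a step back, what you are trying to do is find the top 3 months for each year in your dataset, as measured by the AOD field. Since your data is already a tibble, you can do this more smoothly with dplyr, using something like the following: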

library(dplyr)

AOD_median %>%
  arrange(-AOD) %>%
  group_by(year) %>%
  top_n(3, AOD) %>%
  select(year, month)
# A tibble: 6 x 2
# Groups:   year [2]
#    year month
#   <dbl> <dbl>
# 1  2000     4
# 2  2000     7
# 3  2000     5
# 4  2001     7
# 5  2001     5
# 6  2001     2

If you don't mind the three months appearing out of order (that is, not sorted by AOD), you can remove the arrange(-AOD) line.

Data:

AOD_median <- structure(list(date = c("1-Mar-00", "1-Apr-00", "1-May-00", "1-Jun-00", "1-Jul-00", "1-Aug-00", "1-Sep-00", "1-Oct-00", "1-Nov-00", "1-Dec-00", "1-Jan-01", "1-Feb-01", "1-Mar-01", "1-Apr-01", "1-May-01", "1-Jun-01",  "1-Jul-01", "1-Aug-01", "1-Sep-01", "1-Oct-01", "1-Nov-01", "1-Dec-01"), month = c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,7, 8, 9, 10, 11, 12), year = c(2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2001, 2001, 2001,2001, 2001, 2001, 2001, 2001), AOD = c(0.226, 0.454, 0.328, 0.314,0.354, 0.282, 0.278, 0.183, 0.173, 0.21, 0.171, 0.281, 0.241, 0.269, 0.292, 0.222, 0.322, 0.268, 0.276, 0.169, 0.16, 0.15)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -22L))