Question

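I am working with a large dataset.

I am having difficulty developing R code for this particular task.

A sample data frame looks like this: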

   mon  abb Apr May Jun Jul Aug Sep Oct Nov
    5   May 2   4   2   5   0   0   7   0
    5   May 6   5   1   1   3   0   6   4
    5   May 3   1   0   1   1   2   8   8
    7   Jul 5   4   1   0   0   0   9   1
    7   Jul 3   3   4   3   4   4   9   9
    7   Jul 4   2   3   3   1   2   7   4
    7   Jul 4   1   4   2   3   5   4   3
    6   Jun 4   0   4   3   3   6   5   5
    7   Jul 4   4   5   3   4   8   8   8
    5   May 4   -1  6   4   4   9   5   4
    7   Jul 4   -2  4   4   2   6   6   9
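
For anyone who wants to reproduce the problem, the sample above could be rebuilt as a data frame roughly like this. The name df1 is an assumption borrowed from the answer further down; the real data (conshead in my attempt) has many more columns.

```r
# Sample data typed in by hand from the table above.
# The object name df1 is assumed; it is not the real dataset.
df1 <- data.frame(
  mon = c(5, 5, 5, 7, 7, 7, 7, 6, 7, 5, 7),
  abb = c("May", "May", "May", "Jul", "Jul", "Jul", "Jul",
          "Jun", "Jul", "May", "Jul"),
  Apr = c(2, 6, 3, 5, 3, 4, 4, 4, 4, 4, 4),
  May = c(4, 5, 1, 4, 3, 2, 1, 0, 4, -1, -2),
  Jun = c(2, 1, 0, 1, 4, 3, 4, 4, 5, 6, 4),
  Jul = c(5, 1, 1, 0, 3, 3, 2, 3, 3, 4, 4),
  Aug = c(0, 3, 1, 0, 4, 1, 3, 3, 4, 4, 2),
  Sep = c(0, 0, 2, 0, 4, 2, 5, 6, 8, 9, 6),
  Oct = c(7, 6, 8, 9, 9, 7, 4, 5, 8, 5, 6),
  Nov = c(0, 4, 8, 1, 9, 4, 3, 5, 8, 4, 9),
  stringsAsFactors = FALSE  # keep abb as character, not factor
)
```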

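For each row, take the month in column abb and find the column whose name matches it. The number in that cell is compared with the numbers in the succeeding columns, and a column Count is created holding the number of times it is less than the numbers in those other cells. I hope that is clear.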

Output would look like
mon abb Apr May Jun Jul Aug Sep Oct Nov Count
5   May 2   4   2   5   0   0   7   0   2
5   May 6   5   1   1   3   0   6   4   1
5   May 3   1   0   1   1   2   8   8   3
7   Jul 5   4   1   0   0   0   9   1   2
7   Jul 3   3   4   3   4   4   9   9   4
7   Jul 4   2   3   3   1   2   7   4   2
7   Jul 4   1   4   2   3   5   4   3   4
6   Jun 4   0   4   3   3   6   5   5   3
7   Jul 4   4   5   3   4   8   8   8   4
5   May 4   -1  6   4   4   9   5   4   6
7   Jul 4   -2  4   4   2   6   6   9   3

I have created a column index:

conshead$B = (match(conshead[,conshead$monthabb],colnames(conshead[24:31]))+23)

I was unable to proceed further. Please share a better logic.

Answer 1

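Here is an option with tidyverse. Create a row-sequence column with rownames_to_column, gather the dataset into 'long' format, group by the sequence ('rn'), and slice the rows from the one where 'abb' equals 'key' through the last row of the group. Then summarise by taking the sum of the logical expression val[-1] > first(val), i.e. count the later values that are greater than the first element, which is the value where the match occurs, and bind the result as a column to the original dataset ('df1').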

    library(tidyverse)

    rownames_to_column(df1, 'rn') %>%             # row id as a column
      gather(key, val, Apr:Nov) %>%               # reshape to long format
      group_by(rn) %>%
      slice((which(abb == key)):n()) %>%          # keep the matched month onwards
      summarise(Count = sum(val[-1] > first(val))) %>%
      arrange(as.integer(rn)) %>%                 # restore original row order
      pull(Count) %>%
      bind_cols(df1, Count = .)
    #   mon abb Apr May Jun Jul Aug Sep Oct Nov Count
    #1    5 May   2   4   2   5   0   0   7   0     2
    #2    5 May   6   5   1   1   3   0   6   4     1
    #3    5 May   3   1   0   1   1   2   8   8     3
    #4    7 Jul   5   4   1   0   0   0   9   1     2
    #5    7 Jul   3   3   4   3   4   4   9   9     4
    #6    7 Jul   4   2   3   3   1   2   7   4     2
    #7    7 Jul   4   1   4   2   3   5   4   3     4
    #8    6 Jun   4   0   4   3   3   6   5   5     3
    #9    7 Jul   4   4   5   3   4   8   8   8     4
    #10   5 May   4  -1   6   4   4   9   5   4     6
    #11   7 Jul   4  -2   4   4   2   6   6   9     3

Alternatively, in base R, use the row/column index to extract the elements, then create a logical matrix and take its rowSums, applying the logical check only to cells whose column index differs from that of the matched column.
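
The base R idea (extract the reference cells by row/column index, build a logical matrix, take rowSums) might be sketched as follows. The base R code itself is not shown in this thread, so this is one reading of the description rather than the answerer's exact code. The helper names m, j, ref and later_and_larger are made up for illustration, and df1 is rebuilt from the question's sample so the block runs on its own.

```r
# Sample data from the question (df1 is the name used in the tidyverse answer).
df1 <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
mon abb Apr May Jun Jul Aug Sep Oct Nov
5   May 2   4   2   5   0   0   7   0
5   May 6   5   1   1   3   0   6   4
5   May 3   1   0   1   1   2   8   8
7   Jul 5   4   1   0   0   0   9   1
7   Jul 3   3   4   3   4   4   9   9
7   Jul 4   2   3   3   1   2   7   4
7   Jul 4   1   4   2   3   5   4   3
6   Jun 4   0   4   3   3   6   5   5
7   Jul 4   4   5   3   4   8   8   8
5   May 4   -1  6   4   4   9   5   4
7   Jul 4   -2  4   4   2   6   6   9
")

m   <- as.matrix(df1[, -(1:2)])         # month columns only, as a numeric matrix
j   <- match(df1$abb, colnames(m))      # per-row column position of the 'abb' month
ref <- m[cbind(seq_len(nrow(m)), j)]    # matched cell of each row (matrix indexing)

# TRUE where a cell sits in a later column than the match AND exceeds the
# matched value; j and ref have one entry per row, so they recycle down columns.
later_and_larger <- (col(m) > j) & (m > ref)

df1$Count <- rowSums(later_and_larger)
df1$Count
# 2 1 3 2 4 2 4 3 4 6 3, the same as the Count column in the expected output
```

Because everything is vectorised over the whole matrix, this avoids reshaping to long format, which may matter on a large dataset. Rows whose abb value has no matching column name would get NA from match and would need handling separately.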
