Question

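I have a data frame like the following:

I want to select the maximum and minimum 'probability' values for the 'years' value 2017. Whichever topics have the maximum and minimum probability values, all instances of those topics must be collected in another data frame, like the one below:

(In the example above, topic V16 has the minimum probability in 2017 and V30 has the maximum.)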

Answer 1

We can use the tidyverse. If we need the rows for the 'topics' whose 'probability' is the max/min for the year 2017 only, then

library(dplyr)
df1 %>%
  filter(topics %in% topics[probability == max(probability) & years == 2017] |
         topics %in% topics[probability == min(probability) & years == 2017])
# A tibble: 4 x 3
# Groups:   years [2]
#   years topics probability
#   <int> <chr>        <dbl>
#1  2016 V10         0.0553
#2  2016 V15         0.0164
#3  2017 V30         0.0714
#4  2017 V16         0.0130

Or using slice

df1 %>%
  slice(c(which(topics %in% topics[probability == max(probability) & years == 2017]),
          which(topics %in% topics[probability == min(probability) & years == 2017])))
# A tibble: 4 x 3
#   years topics probability
#   <int> <chr>        <dbl>
#1  2016 V30         0.0219
#2  2017 V30         0.0714
#3  2016 V16         0.0300
#4  2017 V16         0.0130

Or using base R

subset(df1, topics %in% subset(df1, years == 2017 & 
            probability %in% range(probability), select = "topics")[[1]])
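As far as I can tell, max(probability), min(probability) and range(probability) in the snippets above are evaluated over the whole column, so they find the 2017 extremes only when those are also the overall extremes. A minimal base R sketch that restricts to 2017 first is shown below. The sample data is hypothetical (invented for illustration, not the asker's data), and the names d2017, extreme_topics and result are my own.

```r
# Hypothetical sample data: all values are invented for illustration.
# Note that the overall maximum (0.0900) deliberately falls in 2016.
df1 <- data.frame(
  years       = c(2016, 2016, 2016, 2017, 2017, 2017),
  topics      = c("V30", "V16", "V10", "V30", "V16", "V10"),
  probability = c(0.0219, 0.0300, 0.0900, 0.0714, 0.0130, 0.0400),
  stringsAsFactors = FALSE
)

# Restrict to 2017 first, so range() only sees that year's probabilities.
d2017 <- df1[df1$years == 2017, ]
extreme_topics <- d2017$topics[d2017$probability %in% range(d2017$probability)]

# Keep every row (any year) that belongs to one of those topics.
result <- df1[df1$topics %in% extreme_topics, ]
print(result)
```

With this sample, extreme_topics is V30 and V16, and result holds their 2016 and 2017 rows. Because the comparison uses %in% range(...), topics tied for the minimum or maximum are all kept.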

Answer 2

You can try

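library(data.table)
# topics with the lowest and highest probability in 2017
a <- setDT(df)[years == 2017, topics[c(which.min(probability), which.max(probability))], by = years]
# keep every row of those topics, whatever the year
subset(df, topics %in% a$V1)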

In base R, you can do the following:

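# which.max/which.min give positions within the 2017 rows, so look topics up in that subset
d17 <- subset(df, years == 2017)
a <- aggregate(probability ~ years, d17, function(x) c(which.max(x), which.min(x)))
subset(df, topics %in% d17$topics[c(a$probability)])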
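One thing to watch with the which.max/which.min approach: the positions they return are relative to the vector they were computed on, so they should be used to index that same subset rather than the full data frame. A small sketch with made-up data (all values and the names d17, idx, picked, mismatched and out are hypothetical):

```r
# Hypothetical data frame; all values are made up for illustration.
df <- data.frame(
  years       = c(2016, 2016, 2016, 2017, 2017, 2017),
  topics      = c("A", "B", "C", "C", "A", "B"),
  probability = c(0.50, 0.20, 0.30, 0.10, 0.60, 0.30),
  stringsAsFactors = FALSE
)

d17 <- df[df$years == 2017, ]

# Positions of the minimum and maximum *within d17*.
idx <- c(which.min(d17$probability), which.max(d17$probability))

picked     <- d17$topics[idx]  # same subset: the intended topics
mismatched <- df$topics[idx]   # full data: positions point at 2016 rows here

# All rows, in any year, for the topics picked from 2017.
out <- df[df$topics %in% picked, ]
print(out)
```

Here picked is C and A, while mismatched is A and B. Also note that which.min and which.max return only the first position when there are ties, whereas a comparison against range() keeps every tied row.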

R: select multiple rows for a variable based on the maximum and minimum values of a column

2 answers:
