Using dplyr to create a consensus column

Date: 2019-05-15 09:47:23

Tags: r dplyr

I have a data frame:

Groups Name Category value
G1     A    cat1     20
G1     A    cat2     1
G1     B    cat3     21
G1     B    cat3     23
G2     A    cat4     32
G2     C    cat2     23
G2     C    cat2     21

and I would like to add a new column, consensus_category, such as:

Groups Name Category value consensus_category
G1     A    cat1     20    cat2
G1     A    cat2     1     cat2
G1     B    cat3     21    cat2
G1     B    cat3     23    cat2
G2     A    cat4     32    cat4
G2     C    cat2     23    cat4
G2     C    cat2     21    cat4

The idea is that I have a vector = c("A") that corresponds to a specific Name in the data frame. Based on that name, I want to write the corresponding Category into all other rows of the same Groups. If there is a tie (ex aequo) between two Categories, the winner is the category with the lowest value. For example, in

G1 A cat1 20 cat2
G1 A cat2 1  cat2

cat2 wins because 1 < 20.

I tried:

df %>%
  group_by(Groups) %>%
  add_count(Category) %>%
  top_n(1, n) %>%
  top_n(-1, value) %>%
  distinct(consensus_category = Category) %>%
  right_join(df)

but I do not know how to specify the values in vector as the guide for the consensus.

2 answers:

Answer 0 (score: 1)

Using dplyr, you can find the value entries in each group whose Name is in vec, take the minimum of those values, and extract the corresponding Category. This assumes that every Groups contains at least one row whose Name is in vec.

library(dplyr)

vec <- "A"

df %>%
  group_by(Groups) %>%
  mutate(consensus_category = Category[value == min(value[Name == vec])])
#  Groups Name  Category value consensus_category
#  <fct>  <fct> <fct>    <int> <fct>
#1 G1     A     cat1        20 cat2
#2 G1     A     cat2         1 cat2
#3 G1     B     cat3        21 cat2
#4 G1     B     cat3        23 cat2
#5 G2     A     cat4        32 cat4
#6 G2     C     cat2        23 cat4
#7 G2     C     cat2        21 cat4

If vec holds more than one value, you may need Name %in% vec instead of Name == vec.
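For the multi-value case mentioned above, here is a minimal sketch rather than the answerer's own code. It assumes a hypothetical helper, pick_consensus, and a hypothetical vec of c("A", "C"). Unlike the one-liner above, it looks up the Category only among the rows whose Name is in vec, takes the first minimum via which.min, and returns NA for a group with no matching row.

```r
library(dplyr)

# Same rows as the data frame in the question, with character columns.
df <- data.frame(
  Groups   = c("G1", "G1", "G1", "G1", "G2", "G2", "G2"),
  Name     = c("A", "A", "B", "B", "A", "C", "C"),
  Category = c("cat1", "cat2", "cat3", "cat3", "cat4", "cat2", "cat2"),
  value    = c(20L, 1L, 21L, 23L, 32L, 23L, 21L),
  stringsAsFactors = FALSE
)

# Hypothetical guide vector with more than one name (an assumption for this sketch).
vec <- c("A", "C")

# Hypothetical helper: among the rows whose name is in `vec`, return the
# category of the row with the smallest value; NA if no row matches.
pick_consensus <- function(name, category, value, vec) {
  hit <- name %in% vec
  if (!any(hit)) {
    return(NA_character_)
  }
  as.character(category[hit])[which.min(value[hit])]
}

res <- df %>%
  group_by(Groups) %>%
  mutate(consensus_category = pick_consensus(Name, Category, value, vec)) %>%
  ungroup()

print(res)
```

With this vec, G2 resolves to cat2 rather than cat4, because the C row with value 21 is lower than the A row with value 32.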

Answer 1 (score: 1)

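An option with data.table:

library(data.table)

vec <- "A"

# Convert df to a data.table by reference, then, within each Groups,
# assign the Category of the row holding the smallest value among Name == vec.
setDT(df)[, consensus_category := Category[value ==
      min(value[Name == vec])],  Groups]
df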
#   Groups Name Category value consensus_category
#1:     G1    A     cat1    20               cat2
#2:     G1    A     cat2     1               cat2
#3:     G1    B     cat3    21               cat2
#4:     G1    B     cat3    23               cat2
#5:     G2    A     cat4    32               cat4
#6:     G2    C     cat2    23               cat4
#7:     G2    C     cat2    21               cat4
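Both answers assume that every group contains at least one row whose Name is in vec. As a hedged sketch of one way to relax that assumption with data.table, the example below uses a modified copy of the data in which G2 has no Name equal to "A" (a change made purely for illustration) and writes NA for such groups. The name dt and the NA fallback are choices made for this sketch, not part of the answer.

```r
library(data.table)

# Illustrative data: row 5 has Name "B", so group G2 contains no "A".
dt <- data.table(
  Groups   = c("G1", "G1", "G1", "G1", "G2", "G2", "G2"),
  Name     = c("A", "A", "B", "B", "B", "C", "C"),
  Category = c("cat1", "cat2", "cat3", "cat3", "cat4", "cat2", "cat2"),
  value    = c(20L, 1L, 21L, 23L, 32L, 23L, 21L)
)

vec <- "A"

# Per group: take the Category of the lowest-valued row whose Name is in vec,
# or NA when the group has no such row.
dt[, consensus_category := {
  hit <- Name %in% vec
  if (any(hit)) Category[hit][which.min(value[hit])] else NA_character_
}, by = Groups]

print(dt)
```

Here G1 gets cat2 as before, while every row of G2 gets NA instead of triggering a warning or error from min() on an empty selection.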

Data

df <- structure(list(Groups = c("G1", "G1", "G1", "G1", "G2", "G2", 
"G2"), Name = c("A", "A", "B", "B", "A", "C", "C"), Category = 
c("cat1", "cat2", "cat3", "cat3", "cat4", "cat2", "cat2"), value = 
c(20L, 1L, 21L, 23L, 32L, 23L, 21L)), class = "data.frame", row.names = 
c(NA, -7L))