How to extract a unique element based on a set of preferred conditions

Asked: 2017-01-29 21:25:46

Tags: r dataframe extraction data-extraction which

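Given the data frame df, I would like to extract a single row for each Field according to the following order of preference:

1- If C1 is present, extract the corresponding value and ignore the others

2- Otherwise, if C2 is present, extract the corresponding value and ignore the others

... and so on through C5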

Data:

df <- data.frame (Field=rep(c("F1","F2","F3","F4","F5"),each=3),
              Cond=rep(c("C1","C2","C3","C4","C5"),3),
              Value=c(1:15))

Desired output:

output <-  data.frame (F= c("F1","F2","F3","F4","F5"),
                   C= c("C1","C1","C2","C1","C3"),
                   Value= c(1,6,7,11,13))

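(Note 1: this is only an example setup; in the real data the values are not ordered.)

(Note 2: the real condition column is not sorted alphabetically at all. I was thinking of something like: if A is present, choose the value for A; otherwise move on to the next condition, "if B is present ...", and so on.)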

2 answers:

Answer 0: (score: 2)

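If you can sort the data.frame before processing, this is quite simple. Note that it works in this particular case only because the Cond values happen to sort alphabetically in the preferred order. If your Cond values change, alphabetical sorting may no longer match that order.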

library(dplyr)
df <- data.frame (Field=rep(c("F1","F2","F3","F4","F5"),each=3),
                  Cond=rep(c("C1","C2","C3","C4","C5"),3),
                  Value=c(1:15))

df <- df[with(df, order(Field, Cond)), ]
res <- df %>%
  group_by(Field) %>%
  filter(row_number() == 1)

Source: local data frame [5 x 3]
Groups: Field [5]

   Field   Cond Value
  <fctr> <fctr> <int>
1     F1     C1     1
2     F2     C1     6
3     F3     C2     7
4     F4     C1    11
5     F5     C3    13
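
The dependence on alphabetical order can be avoided by ranking each condition against an explicit priority vector before sorting. The following is a minimal base-R sketch of that idea, not part of the original answer; the names priority and cond_rank are made up for illustration, and the priority order is assumed to be C1 through C5.

```r
# Example data from the question
df <- data.frame(Field = rep(c("F1", "F2", "F3", "F4", "F5"), each = 3),
                 Cond  = rep(c("C1", "C2", "C3", "C4", "C5"), 3),
                 Value = 1:15)

# Hypothetical priority vector: earlier entries are preferred
priority <- c("C1", "C2", "C3", "C4", "C5")

# Rank of each row's condition (1 = most preferred)
cond_rank <- match(as.character(df$Cond), priority)

# Sort by field, then by rank, and keep the first row of each field
sorted <- df[order(df$Field, cond_rank), ]
res <- sorted[!duplicated(sorted$Field), ]
rownames(res) <- NULL
res
```

Because the ranking comes from priority rather than from the strings themselves, reordering priority changes which row wins without touching the rest of the code.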

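Here is another, more general way. The sort order is defined in so (see this question). Note how I scrambled the values of Cond to show that the result does not depend on alphabetical order.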

df <- data.frame (Field=rep(c("F1","F2","F3","F4","F5"),each=3),
                  Cond=rep(c("rg1","kl2","xy3","rq4","ab5"),3),
                  Value=c(1:15))

so <- c("rg1","kl2","xy3","rq4","ab5")

df %>%
  group_by(Field) %>%
  slice(match(so, Cond)) %>%
  filter(row_number() == 1)

   Field   Cond Value
  <fctr> <fctr> <int>
1     F1    rg1     1
2     F2    rg1     6
3     F3    kl2     7
4     F4    rg1    11
5     F5    xy3    13
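
To see why match(so, Cond) picks the preferred row, it can help to evaluate it by hand for a single group. The sketch below does this for the F2 group of the scrambled data; cond_f2, idx, and first_hit are illustrative names, not part of the original answer.

```r
# Priority order, most preferred first (same as `so` in the answer)
so <- c("rg1", "kl2", "xy3", "rq4", "ab5")

# Conditions present in group F2 of the scrambled data, in row order
cond_f2 <- c("rq4", "ab5", "rg1")

# For each priority entry, the row position within the group (NA if absent)
idx <- match(so, cond_f2)
idx                       # 3 NA NA 1 2

# The first non-missing position belongs to the most preferred condition
first_hit <- idx[!is.na(idx)][1]
cond_f2[first_hit]        # "rg1"
```

The positions come back in priority order, so the first one that is not NA identifies the row to keep for that group.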

Answer 1: (score: 1)

Another option is to use data.table:

library(data.table)
setDT(df)[order(Field, Cond), head(.SD, 1), by = Field]
#    Field Cond Value
#1:    F1   C1     1
#2:    F2   C1     6
#3:    F3   C2     7
#4:    F4   C1    11
#5:    F5   C3    13
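
Ordering by Cond directly relies on the conditions sorting alphabetically in the preferred order. As a hedged sketch (not from the original answer), the same data.table call can order by the position of each condition in an explicit priority vector instead; the vector so and the scrambled condition names below are assumptions chosen for illustration.

```r
library(data.table)

# Example data with condition names that do not sort alphabetically
dt <- data.table(Field = rep(c("F1", "F2", "F3", "F4", "F5"), each = 3),
                 Cond  = rep(c("rg1", "kl2", "xy3", "rq4", "ab5"), 3),
                 Value = 1:15)

# Assumed priority order, most preferred first
so <- c("rg1", "kl2", "xy3", "rq4", "ab5")

# Order by field, then by priority rank, and keep the first row per field
res <- dt[order(Field, match(Cond, so)), head(.SD, 1), by = Field]
res
```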