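Given a data frame df, I want to extract a single value for each Field according to the following order of preference:
1- If C1 exists, extract the corresponding value and ignore the others
2- If C2 exists, extract the corresponding value and ignore the others
...... and so on up to C5
Data: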
df <- data.frame (Field=rep(c("F1","F2","F3","F4","F5"),each=3),
Cond=rep(c("C1","C2","C3","C4","C5"),3),
Value=c(1:15))
Desired output:
output <- data.frame (F= c("F1","F2","F3","F4","F5"),
C= c("C1","C1","C2","C1","C3"),
Value= c(1,6,7,11,13))
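(Note 1: the values are set only as an example; the real data values are not ordered.)
(Note 2: the real condition column is not sorted alphabetically at all. I was thinking of something like: if A exists, then choose "A value"; otherwise pass to the next condition, "if B exists ...", and so on.)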
Answer 0: (score: 2)
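This is pretty straightforward if you can sort the data.frame before processing. Note that this works for this particular case only, because here the Cond values happen to sort alphabetically in the same order as their priority. If your Cond values change, the alphabetical sort may no longer match that priority.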
library(dplyr)
df <- data.frame (Field=rep(c("F1","F2","F3","F4","F5"),each=3),
Cond=rep(c("C1","C2","C3","C4","C5"),3),
Value=c(1:15))
df <- df[with(df, order(Field, Cond)), ]
res <- df %>%
group_by(Field) %>%
filter(row_number() == 1)
Source: local data frame [5 x 3]
Groups: Field [5]
Field Cond Value
<fctr> <fctr> <int>
1 F1 C1 1
2 F2 C1 6
3 F3 C2 7
4 F4 C1 11
5 F5 C3 13
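If the real Cond values do not sort alphabetically, one way to keep the sort-then-take-first idea is to turn Cond into a factor whose levels are listed in priority order. The following is only a sketch: the vector name priority and the mangled condition names are assumptions made for illustration, not part of the original data.

```r
library(dplyr)

# Same layout as the question's data, but with Cond names that do not sort alphabetically
df <- data.frame(Field = rep(c("F1", "F2", "F3", "F4", "F5"), each = 3),
                 Cond  = rep(c("rg1", "kl2", "xy3", "rq4", "ab5"), 3),
                 Value = 1:15)

# Assumed priority order, highest priority first (hypothetical name)
priority <- c("rg1", "kl2", "xy3", "rq4", "ab5")

res <- df %>%
  mutate(Cond = factor(Cond, levels = priority)) %>%  # factor levels now encode the priority
  arrange(Field, Cond) %>%                            # factors sort by level, not alphabetically
  group_by(Field) %>%
  slice(1) %>%                                        # keep the highest-priority row per Field
  ungroup()

print(res)
```

Because the ordering lives in the factor levels, changing the priority only means editing one vector.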
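Here is another, more basic approach. The sort order is defined in so (see this question). Note how I mangled the values of Cond to show that it is not sorting alphabetically.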
df <- data.frame (Field=rep(c("F1","F2","F3","F4","F5"),each=3),
Cond=rep(c("rg1","kl2","xy3","rq4","ab5"),3),
Value=c(1:15))
so <- c("rg1","kl2","xy3","rq4","ab5")
df %>%
group_by(Field) %>%
slice(match(so, Cond)) %>%
filter(row_number() == 1)
Field Cond Value
<fctr> <fctr> <int>
1 F1 rg1 1
2 F2 rg1 6
3 F3 kl2 7
4 F4 rg1 11
5 F5 xy3 13
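The same match() idea can probably be written without dplyr as well. The sketch below assumes the same so priority vector; prio, sorted and res are hypothetical helper names introduced only for this example.

```r
# Same mangled data as above
df <- data.frame(Field = rep(c("F1", "F2", "F3", "F4", "F5"), each = 3),
                 Cond  = rep(c("rg1", "kl2", "xy3", "rq4", "ab5"), 3),
                 Value = 1:15)

so <- c("rg1", "kl2", "xy3", "rq4", "ab5")

# Position of each row's Cond in the priority vector (NA if Cond is not listed in so)
prio <- match(df$Cond, so)

# Sort by Field, then by priority; order() places NA priorities last by default
sorted <- df[order(df$Field, prio), ]

# Keep the first (highest-priority) row of each Field
res <- sorted[!duplicated(sorted$Field), ]

print(res)
```

Ranking the rows with match(df$Cond, so) rather than match(so, Cond) gives one priority number per row, so conditions missing from a group never produce NA row indices.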
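Answer 1: (score: 1)
Another option is to use data.table. Because it orders by Field and Cond, it relies on the Cond values sorting in priority order, as in the question's sample data.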
library(data.table)
setDT(df)[order(Field, Cond), head(.SD, 1), by = Field]
# Field Cond Value
#1: F1 C1 1
#2: F2 C1 6
#3: F3 C2 7
#4: F4 C1 11
#5: F5 C3 13
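For conditions that do not sort alphabetically, the data.table version could pick the row per group by explicit priority instead of by sort order. This is a minimal sketch, assuming a priority vector so like the one in the other answer; dt and res are hypothetical names.

```r
library(data.table)

# Mangled data: Cond names do not sort alphabetically in priority order
dt <- data.table(Field = rep(c("F1", "F2", "F3", "F4", "F5"), each = 3),
                 Cond  = rep(c("rg1", "kl2", "xy3", "rq4", "ab5"), 3),
                 Value = 1:15)

# Assumed priority order, highest priority first
so <- c("rg1", "kl2", "xy3", "rq4", "ab5")

# Within each Field, keep the row whose Cond comes earliest in so;
# which.min() skips NA, so conditions not listed in so are never chosen
res <- dt[, .SD[which.min(match(Cond, so))], by = Field]

print(res)
```

Grouping with by = Field keeps the groups in their order of first appearance, so no separate sort is needed here.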