Filtering in R based on alphabetical order

Date: 2017-04-24 18:06:41

Tags: r filtering alphabetical

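It seems like I should know how to do this. Basically, I have a table with duplicate values that differ in one column. I searched and found plenty of questions about sorting alphabetically, but none about filtering alphabetically.

Apologies in advance: I also couldn't figure out how to format some sample data nicely.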

    ResultID  Condition  nVariedSolute  tabscore5  ItemPartID
    644040    LDoF       2              2B         540000
    644040    LDoF       1              3B         540000

So, I'm trying to filter on the max (alphabetical) value of tabscore5. Everything I've found that uses split() assumes the value is numeric.

I want to keep the whole row, but only the row with the largest tabscore5 value for each ResultID.

I thought it might be something like

df %>% group_by(ResultID) %>% split(max(c(which.min(tabscore5))))

But I keep getting no data back. What am I missing?

Below I've tried to include the output of dput(my_df), as user @MikeH suggested, but I may have done it wrong.

    structure(list(ResultID = c(644040L, 644040L, 644043L, 644047L, 644047L, 644050L, 644050L, 644249L, 644251L, 644251L, 644252L, 644252L, 644259L, 644259L), Condition = structure(c(2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L,1L, 1L, 1L, 1L, 1L, 1L), .Label = c("HDoF", "LDoF"), class = "factor"), nVariedSolute = c(-1, 2, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1), tabscore5 = c("3B", "2B", "1", "1", "2A", "2B", "3A", "1", "1", "2A", "1", "2A", "1", "2A"), Question = c("1 - DrinkMix_SometimesClaim",  "1 - DrinkMix_SometimesClaim", "1 - DrinkMix_SometimesClaim",  "1 - DrinkMix_SometimesClaim", "1 - DrinkMix_SometimesClaim",  "1 - DrinkMix_SometimesClaim", "1 - DrinkMix_SometimesClaim",  "1 - DrinkMix_SometimesClaim", "1 - DrinkMix_SometimesClaim",  "1 - DrinkMix_SometimesClaim", "1 - DrinkMix_SometimesClaim",  "1 - DrinkMix_SometimesClaim", "1 - DrinkMix_SometimesClaim",  "1 - DrinkMix_SometimesClaim"), ItemPartID = c(540000, 540000,  540000, 539941, 539941, 539941, 539941, 540000, 539941, 539941,  539941, 539941, 539941, 539941)), .Names = c("ResultID", "Condition",  "nVariedSolute", "tabscore5", "Question", "ItemPartID"), row.names = c(NA,  -14L), class = "data.frame")

1 Answer:

Answer 0 (score: 1)

    library(dplyr)

    # Within each ResultID, keep the row whose tabscore5 sorts last
    df %>%
      group_by(ResultID) %>%
      top_n(n = 1, wt = tabscore5)

#   ResultID Condition nVariedSolute tabscore5                    Question ItemPartID
#      <int>    <fctr>         <dbl>     <chr>                       <chr>      <dbl>
# 1   644040      LDoF            -1        3B 1 - DrinkMix_SometimesClaim     540000
# 2   644043      LDoF             1         1 1 - DrinkMix_SometimesClaim     540000
# 3   644047      HDoF             1        2A 1 - DrinkMix_SometimesClaim     539941
# 4   644050      HDoF             2        3A 1 - DrinkMix_SometimesClaim     539941
# 5   644249      LDoF             1         1 1 - DrinkMix_SometimesClaim     540000
# 6   644251      HDoF             1        2A 1 - DrinkMix_SometimesClaim     539941
# 7   644252      HDoF             1        2A 1 - DrinkMix_SometimesClaim     539941
# 8   644259      HDoF             1        2A 1 - DrinkMix_SometimesClaim     539941
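
For anyone who would rather not depend on dplyr, here is a minimal base R sketch of the same idea. It relies on max() working on character vectors (it returns the string that sorts last) and on ave() to compute that maximum per ResultID. The data frame below is a cut-down copy of the question's data with only two columns, and the names group_max and result are just illustrative. String ordering follows the session's locale, so labels that mix cases or contain non-ASCII characters may sort differently on other machines.

```r
# Cut-down copy of the question's data: only the two columns needed here.
df <- data.frame(
  ResultID  = c(644040L, 644040L, 644043L, 644047L, 644047L, 644050L, 644050L),
  tabscore5 = c("3B", "2B", "1", "1", "2A", "2B", "3A"),
  stringsAsFactors = FALSE
)

# ave() applies max() within each ResultID group and returns a vector
# aligned with the rows of df, so every row carries its group's maximum.
group_max <- ave(df$tabscore5, df$ResultID, FUN = max)

# Keep only the rows whose tabscore5 equals the maximum of their group.
# Ties are kept, so a group can contribute more than one row.
result <- df[df$tabscore5 == group_max, ]
print(result)
```

In recent dplyr versions (1.0.0 and later), top_n() is marked as superseded in favour of slice_max(), so `slice_max(tabscore5, n = 1)` after the group_by() should be the closest equivalent of the answer above.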