Subsetting on two conditions in a loop

Time: 2016-07-25 11:21:58

Tags: r if-statement

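I have a data.frame in R that I want to subset on two conditions: first, rows should not be duplicated; second, if they are duplicated, only the rows where b == 1 should be returned (in the sample below, the rows whose b starts with "ws"). Instead of the expected five rows, I get all seven rows of this sample df back. What is the reason?

Edit: Sorry, the loop does work. I had just forgotten the df$b in front of [i]. The second part of the question, how to optimize this, can still be answered ;)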

a <- c(rep("A", 2), "B", rep("C",2), "D", "E")
b <- c("ws_12","dr_12","ws_12","ws_12","dr_12","ws_12","dr_12")
df <- data.frame(a,b)

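# duplicated2() is not defined in the question; assumed here to flag every
# occurrence of a repeated value, not only the later ones.
duplicated2 <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)

dup <- duplicated2(df$a)                 # computed once, outside the loop
is_ws <- substring(df$b, 1, 2) == "ws"   # TRUE where b starts with "ws"

result <- data.frame()
for (i in seq_along(df$a)) {
  # keep rows that are unique in a, or duplicated rows whose b starts with "ws"
  if (!dup[i] || is_ws[i]) {
    result <- rbind(result, df[i, ])
  }
}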

I am new to programming and to R, so I may have made some basic mistakes. Can this also be done in a simpler way?

1 answer:

Answer 0: (score: 0)

By default, negating duplicated selects the first row seen for each value. So, to achieve your result, we can order on b and remove the duplicates.
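The answer's own code did not survive on this page, so the following is only a minimal sketch of the idea it describes, not the original code. It assumes the sample data from the question; the names is_ws, ordered and result are introduced here for illustration. Rows are sorted so that, within each value of a, the rows whose b starts with "ws" come first, and !duplicated() then keeps the first row of each group.

```r
# Sample data from the question
a <- c(rep("A", 2), "B", rep("C", 2), "D", "E")
b <- c("ws_12", "dr_12", "ws_12", "ws_12", "dr_12", "ws_12", "dr_12")
df <- data.frame(a, b, stringsAsFactors = FALSE)

# TRUE for rows whose b starts with "ws" (illustrative helper vector)
is_ws <- substring(df$b, 1, 2) == "ws"

# Sort by a; within each value of a, FALSE sorts before TRUE,
# so the "ws" rows (where !is_ws is FALSE) come first
ordered <- df[order(df$a, !is_ws), ]

# !duplicated() keeps only the first row seen for each value of a
result <- ordered[!duplicated(ordered$a), ]
print(result)
```

This keeps exactly one row per value of a. That matches the loop on the sample data, but the two can differ elsewhere: if a duplicated value had several "ws" rows the loop would keep all of them, and if it had none this sketch would still keep one row for it.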