Grouping data by a factor with multiple levels. Error: incompatible size (0), expecting 1 (the group size) or 1

Date: 2016-08-19 10:01:31

Tags: r plyr

This post is a follow-up to Changing line color in ggplot based on "several factors" slope.

I want to group the data (shown below) by "PQ", but I get the following error:

"incompatible size (0), expecting 1 (the group size) or 1"

Data

ID<-c("A_P1","A_P1","A_P1","A_P1","A_P1","A_P2","A_P2","A_P2","A_P2","A_P2","A_P2","B_P1","B_P1","B_P1","B_P1","B_P1","B_P1","B_P1","B_P1","B_P2","B_P2","B_P2","B_P2","B_P2","B_P2","B_P2","B_P2")
Q<-c("C1","C1","C2","C3","C3","C1","C1","C2","C2","C3","C3","Q1","Q1","Q1","Q1","Q3","Q3","Q4","Q4","Q1","Q1","Q1","Q1","Q3","Q3","Q4","Q4")
PQ<-c("A_P1C1","A_P1C1","A_P1C2","A_P1C3","A_P1C3","A_P2C1","A_P2C1","A_P2C2","A_P2C2","A_P2C3","A_P2C3","B_P1Q1","B_P1Q1","B_P1Q1","B_P1Q1","B_P1Q3","B_P1Q3","B_P1Q4","B_P1Q4","B_P2Q1","B_P2Q1","B_P2Q1","B_P2Q1","B_P2Q3","B_P2Q3","B_P2Q4","B_P2Q4")
AS<-c("CF","CF","CF","CF","CF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF","CTF")
N<-c("N2","N3","N3","N2","N3","N2","N3","N2","N3","N2","N3","N0","N1","N2","N3","N1","N3","N0","N1","N0","N1","N2","N3","N1","N3","N0","N1")
Value<-c(4.7,8.61,8.34,5.89,8.36,1.76,2.4,5.01,2.12,1.88,3.01,2.4,7.28,4.34,5.39,11.61,10.14,3.02,9.45,8.8,7.4,6.93,8.44,7.37,7.81,6.74,8.5)

df<-data.frame(ID=ID,Q=Q,PQ=PQ,AS=AS,N=N,Value=Value)

Code that produces the error

library(dplyr)

# calculate slopes for N0 and N1
df %>%
  filter(N == "N0" | N == "N1") %>%
  group_by(PQ) %>%
  # use diff to calculate slope
  mutate(slope = diff(Value)) -> dat01

# calculate slopes for N0 and N2
df %>%
  filter(N == "N0" | N == "N2") %>%
  group_by(PQ) %>%
  # use diff to calculate slope
  mutate(slope = diff(Value)) -> dat02

In addition, I would like to calculate the slopes for the remaining "PQ" factors (where present), i.e. N0-N3, N1-N2, and so on.

1 Answer:

Answer 0: (score: 1)

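The error occurs because the output of diff has a different length from its input: it returns one element fewer than it was given. Inside mutate, the result for each group must have either the group's length or length 1. After the filter, a group such as B_P1Q3 has only one row left, so diff returns a vector of length 0, which is what the message "incompatible size (0), expecting 1 (the group size) or 1" reports. Prepending a 0 or NA to the result of diff solves the problem: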

df %>% 
   filter(N=="N0" | N=="N1") %>%
   group_by(PQ) %>% 
   mutate(slope = c(0, diff(Value)))
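
The length mismatch can be checked in base R without dplyr. The sketch below uses the Value entries that remain in groups B_P1Q3 and B_P1Q4 once only N0 and N1 are kept; the variable names are made up for illustration, and NA is used as the padding value instead of 0.

```r
# Values left in two PQ groups after keeping only N0 and N1
# (taken from the question's data; the variable names are illustrative).
one_row  <- c(11.61)       # B_P1Q3: only the N1 row survives the filter
two_rows <- c(3.02, 9.45)  # B_P1Q4: both the N0 and N1 rows survive

# diff() always returns one element fewer than its input.
len_one <- length(diff(one_row))   # 0, which mutate() rejects for a 1-row group
len_two <- length(diff(two_rows))  # 1

# Prepending one value restores the original length in both cases.
padded_one <- c(NA, diff(one_row))
padded_two <- c(NA, diff(two_rows))

length(padded_one) == length(one_row)
length(padded_two) == length(two_rows)
```

Whether 0 or NA is the better padding depends on how the slope column is used later: 0 reads as "no change", while NA marks the first row of each group as having no slope at all.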

To make the filter more compact, we can use %in% instead of chaining == comparisons when there are several elements to match:

df %>%
   filter(N %in%  paste0("N", 0:1)) %>%
   group_by(PQ) %>%
   mutate(slope = c(0, diff(Value)))
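
As a quick check that the two filter conditions select the same rows, here is a small base R sketch. The vector n_demo is a hypothetical stand-in for the N column, not the question's data.

```r
# Hypothetical stand-in for the N column.
n_demo <- c("N2", "N0", "N1", "N3", "N1")

# Chained equality tests, as in the question.
by_or <- n_demo == "N0" | n_demo == "N1"

# Set membership, as in the answer; paste0("N", 0:1) builds c("N0", "N1").
by_in <- n_demo %in% paste0("N", 0:1)

identical(by_or, by_in)
```

The %in% form also scales to longer sets of levels without adding another == term for each one.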

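Regarding the second question, about doing this for all combinations in 'N': use combn on the unique elements of 'N', filter 'N' based on the values in each combination and, after grouping by 'PQ', calculate the diff of 'Value'. The output is a list because we specify simplify = FALSE.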

combn(as.character(unique(df$N)), 2,
      FUN = function(x) df %>%
        filter(N %in% x) %>%
        group_by(PQ) %>%
        mutate(slope = c(0, diff(Value))),
      simplify = FALSE)
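
The list returned by combn is unnamed, so it can be hard to tell which element belongs to which pair of 'N' levels. The following is a minimal sketch of one way to label the results, assuming a small made-up data frame (toy) rather than the question's df. To stay self-contained it swaps in base R's ave for the group_by/mutate step, so it illustrates the idea rather than reproducing the answer's code.

```r
# Hypothetical toy data with the same column roles as the question's df.
toy <- data.frame(
  PQ    = c("a", "a", "a", "b", "b"),
  N     = c("N0", "N1", "N2", "N0", "N2"),
  Value = c(1, 3, 6, 2, 5),
  stringsAsFactors = FALSE
)

# All pairs of N levels, kept as a list of character vectors.
pairs <- combn(unique(toy$N), 2, simplify = FALSE)

# One label per pair, e.g. "N0-N1".
pair_names <- vapply(pairs, paste, character(1), collapse = "-")

# For each pair: keep the matching rows, then compute the padded diff per PQ.
# ave() applies the function within each PQ group (base R substitute for
# group_by() followed by mutate()).
res <- lapply(pairs, function(x) {
  sub <- toy[toy$N %in% x, ]
  sub$slope <- ave(sub$Value, sub$PQ, FUN = function(v) c(0, diff(v)))
  sub
})
names(res) <- pair_names

res[["N0-N2"]]
```

With names attached, a single comparison can be pulled out by its label, for example res[["N0-N2"]], instead of by position.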