Comparing values in consecutive rows in R

Time: 2018-02-09 17:03:05

Tags: r if-statement rows

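I have a data table. For each datatable$Ppt and each datatable$nitem, whenever datatable$Region contains an "fffword", I need to extract the number of that "fffword" and compare it with the number of the "word" that follows it. If the two numbers are the same, I need the value 0 in datatable$Output; if they differ, I need the value 1 in datatable$Output.

I tried: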

 datatable %>% group_by(Ppt, nitem) %>%
   mutate(Output = ifelse(as.numeric(gsub("fffword([0-9]+).*","\\1",Region) == lag(as.numeric(gsub("word([0-9]+).*","\\1",Region)), 0L,ifelse(as.numeric(gsub("fffword([0-9]+).*","\\1",Region) != lag(as.numeric(gsub("word([0-9]+).*","\\1",Region)), 1L)

But it did not work. This is the output I need:

 #Ppt      Region            nitem      Output
 #1        "fffword8"        93         0 (current fffword n=8, following word n=8)
 #1        "word8"           93         0 (previous fffword n=8, current word n=8)
 #1        "fffword9"        93         1 (current fffword n=9, no following word for this ppt and this nitem)
 #1        "word2"           122        1 (no previous fffword for this ppt and this nitem and this n Region)
 #1        "fffword3"        122        0 (current fffword n=3, following word n=3)
 #1        "word3"           122        0 (previous fffword n=3, current word n=3)
 #1        "word6"           122        1 (no previous fffword for this ppt and this nitem and this n Region)
 #1        "fffword7"        122        0
 #1        "word7"           122        0
 #1        "fffword8"        122        0
 #1        "word8"           122        0
 #54       "fffword8"        4          0
 #54       "word8"           4          0
 #54       "fffword9"        4          1
 #54       "word2"           4          1
 #54       "fffword2"        10         0
 #54       "word4"           10         1
 #54       "word6"           10         1
 #54       "fffword23"       10         0
 #54       "word23"          10         0
 #54       "fffword24"       5          0
 #54       "word24"          5          0

1 answer:

Answer 0: (score: 0)

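A nested ifelse with dplyr can give you the desired result. The approach is:

If the current row's Region is an fffword, its number should match that of the next row (lead); if the current row's Region is a word, its number should be compared with that of the previous row (lag).

If the next or previous row is not available, Output should be set to 1.

The digits from Region are compared as character strings. There is no apparent reason to convert them to numeric before the comparison.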
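
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the exact code the answer had in mind: the sample data is just the Ppt 1 rows from the question's table, and the helper columns is_fff, digit, neighbour_digit and neighbour_fff are hypothetical names introduced here. The sketch also checks that the neighbouring row is of the other type (fffword versus word), which is an extra assumption on top of the description.

```r
library(dplyr)

# Sample data: the Ppt 1 rows from the question's table.
datatable <- data.frame(
  Ppt    = 1,
  Region = c("fffword8", "word8", "fffword9",
             "word2", "fffword3", "word3", "word6",
             "fffword7", "word7", "fffword8", "word8"),
  nitem  = c(93, 93, 93, 122, 122, 122, 122, 122, 122, 122, 122),
  stringsAsFactors = FALSE
)

result <- datatable %>%
  group_by(Ppt, nitem) %>%
  mutate(
    # TRUE for "fffword" rows, FALSE for plain "word" rows (hypothetical helper)
    is_fff = grepl("^fffword", Region),
    # trailing number, deliberately kept as character (hypothetical helper)
    digit  = sub("^(fff)?word", "", Region),
    # fffword rows look at the next row (lead), word rows at the previous one (lag);
    # both return NA at the edges of a Ppt/nitem group
    neighbour_digit = ifelse(is_fff, lead(digit), lag(digit)),
    neighbour_fff   = ifelse(is_fff, lead(is_fff), lag(is_fff)),
    # 0 only if the neighbour exists, is of the other type, and has the same number
    Output = ifelse(!is.na(neighbour_digit) &
                      neighbour_fff != is_fff &
                      neighbour_digit == digit, 0L, 1L)
  ) %>%
  ungroup() %>%
  select(Ppt, Region, nitem, Output)

print(result)
```

For these rows, Output comes out as 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, which matches the question's table for Ppt 1. Because the is.na() test comes first in the condition, a missing neighbour yields FALSE rather than NA, so such rows fall through to 1.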
